Dark Corners of IronPython

Dynamic Languages on .NET

Note

This is the fourth article in a series on developing with IronPython.

A much more complete introduction to Python and IronPython is my book: IronPython in Action.

Introduction

This article covers a lot of the nitty-gritty details of using Python, the language, to integrate with the underlying .NET framework. The .NET framework was designed to be used from statically typed languages, like C# and VB.NET, and so not all of its features map easily to concepts in a dynamically typed language like Python.

The IronPython team have done a very good job of integrating Python and the .NET framework without having to change Python or introduce new syntax. Despite this, there are various areas where you need to know some IronPython-specific information; previous experience of Python or C# alone will not be enough.

This article also looks at some features of IronPython that are new in IronPython 2.6.

Subclassing .NET Types

In IronPython you can subclass .NET types. The typical example I use is for creating the view of a Windows Forms application; subclassing Form and doing the initialization of the UI in the __init__ method:

import clr
clr.AddReference('System.Windows.Forms')

from System.Windows.Forms import (
    Application, Form
)


class MainForm(Form):

    def __init__(self, text):
        self.Text = text

app = MainForm('Hello World')
Application.Run(app)

The __init__ method is the Python initializer method. It is an instance method and so receives the instance as the first argument, which by convention is called self. If you want to override the .NET constructor (responsible for creating the instance) then from Python you override the __new__ method, which receives the class rather than an instance.

For example:

import clr
clr.AddReference('System.Windows.Forms')

from System.Windows.Forms import (
    Application, Form
)


class MainForm(Form):

    def __new__(cls, text):
        instance = Form.__new__(cls)
        instance.Text = text

        return instance

app = MainForm('Hello World')
Application.Run(app)

__new__ receives the class as its first argument rather than an instance (strictly speaking it is a static method that Python special-cases, so it needs no decorator). It creates the instance by calling up to the base class constructor (Form.__new__(cls)) and is then free to configure the instance before returning it. This is another key difference between __new__ and __init__: __new__ is responsible for returning the instance it creates (and in fact could even return an instance of a different class if you wanted that), whereas returning anything other than None from an __init__ method is an error.
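
The difference between the two methods can be seen without any .NET types at all. The following is a minimal pure-Python sketch; Base, Widget and Broken are hypothetical names, with Base standing in for a .NET base class such as Form:

```python
class Base(object):
    """Hypothetical stand-in for a .NET base class such as Form."""


class Widget(Base):

    def __new__(cls, text):
        # __new__ receives the class and must return the new instance
        instance = Base.__new__(cls)
        instance.text = text
        return instance


class Broken(object):

    def __init__(self):
        # Returning anything other than None from __init__ is an error
        return 42


widget = Widget('Hello World')
print(widget.text)

try:
    Broken()
except TypeError as e:
    print('TypeError: %s' % e)
```

Creating a Widget works and the attribute set in __new__ is available on the instance, whereas instantiating Broken raises a TypeError.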

Keyword Arguments to Constructors

Prior to version 4.0, C# doesn't have a native concept of keyword arguments. IronPython takes advantage of this to allow you to pass keyword arguments into constructors to set properties.

Note

VB.NET has had named and optional parameters since version 1.0. Unfortunately the default values of optional parameters are compiled into the caller, so if a default value is changed you have to recompile everything that calls the method and relies on that default. For this reason they aren't very useful and their use is discouraged.

The example code above has a Form subclass that takes a text argument and the __init__ method uses it to set the Text property on the form. We could achieve the same thing using a keyword argument instead of subclassing:

form = Form(Text="The Form Title")

As .NET classes are frequently configured by setting properties rather than passing arguments in, this pattern can make code simpler and more elegant if you are only setting a few properties.
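
The observable effect is roughly "construct the object, then assign each keyword argument as a property". The sketch below models that in pure Python. It is only an illustration of the behaviour, not how IronPython implements it; the Form class here is a hypothetical stand-in and construct is a made-up helper name:

```python
class Form(object):
    """Hypothetical stand-in for a .NET class configured through properties."""

    def __init__(self):
        self.Text = ''
        self.Width = 300


def construct(cls, *args, **properties):
    # Positional arguments go to the constructor ...
    instance = cls(*args)
    # ... and each keyword argument is then assigned as a property
    for name, value in properties.items():
        setattr(instance, name, value)
    return instance


form = construct(Form, Text='The Form Title', Width=640)
print(form.Text)
```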

Adding References

The clr module allows you to dynamically add references to .NET assemblies in order to be able to import classes from the namespaces they contain. This is usually your first point of interoperation between Python code and the .NET framework.

The clr module has several different functions for adding references to assemblies, and each of them has slightly different semantics and use cases.

clr.AddReference(str) will first try Assembly.Load and then Assembly.LoadWithPartialName. This means there is usually no need to supply the full strong name, because LoadWithPartialName will resolve a partial name for you. It is the one-stop shop for loading assemblies from the GAC or the application base. If both of those fail it will also search sys.path.

clr.AddReference(asm), where asm is a System.Reflection.Assembly object, will just add the assembly that you've already loaded. This is useful if you can load the assembly using some other mechanism, for example with Assembly.LoadFrom or loading from a byte array.

clr.AddReferenceToFile(str) will search sys.path for the file and load it from there using Assembly.LoadFile. This is useful when you don't want the normal Assembly.Load/LoadWithPartialName mechanisms to get in the way.

clr.AddReferenceToFileAndPath(str) will take a fully qualified path, append its directory to sys.path, and then call Assembly.LoadFile with the path. This is just a convenience for quickly adding an assembly by its fully qualified path.

There are also Load* variations which return the assembly object. You can access namespaces and types as attributes on the returned object without altering what can be imported.

A useful tool for debugging assembly load failures is fuslogvw (the Assembly Binding Log Viewer). This will tell you where IronPython is looking and the exact cause of the failure.

The clr module also has a useful member: clr.References. This is a list (not a Python list but a special container type called a ReferencesList) containing all the assemblies (Assembly objects) the current Python engine has a reference to.

Method Overloads

Python doesn't natively have method overloading; instead you can do dynamic type dispatch at runtime if you need it. As for .NET overloaded methods, most of the time IronPython does the right thing without you having to worry about it. IronPython will look at the types you are passing and call the right overload for you. If no overloads match you'll get a runtime error.

Occasionally there will be ambiguities where IronPython either can't work out which overload you want (None will match any reference type, for example) or calls the wrong one. You can explicitly specify which overload you want by accessing the .Overloads member of the method and indexing it with a type, or tuple of types, that identifies the overload.

For example:

from System import Console

Console.WriteLine.Overloads[object](None)
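
For comparison, the dynamic type dispatch mentioned at the start of this section might look like the following in plain Python. This is only a sketch; describe is a hypothetical function that picks its behaviour from the runtime type of its argument:

```python
def describe(value):
    # Dispatch on the runtime type of the argument, most specific first
    # (bool is a subclass of int, so it has to be checked before int)
    if isinstance(value, bool):
        return 'bool: %s' % value
    if isinstance(value, int):
        return 'int: %d' % value
    if isinstance(value, str):
        return 'str: %r' % value
    # No branch matched: fail at runtime, as a failed overload match would
    raise TypeError('no overload of describe() for %s' % type(value).__name__)


print(describe(3))
print(describe(True))
print(describe('three'))
```

As with .NET overloads, the order in which candidates are considered matters, and an argument that matches none of them results in a runtime error.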

Arrays and Generics

One of the areas of the .NET framework that doesn't easily map to Python is containers that require you to specify the type of their members; i.e. arrays and generic containers. For your own Python code you'll undoubtedly find the heterogeneous Python containers simpler to work with. However elegant a system generics are, they add a layer of complexity simply not needed in dynamic languages.

The C# syntax, which I'm sure you're all familiar with, looks like this:

List<String> dinosaurs = new List<String>();

This of course is invalid syntax in Python (or at least the angle brackets have a very different meaning). The IronPython team found a clever workaround by reusing the indexing syntax; you index the container with the type. To create an array of integers:

>>> from System import Array
>>> array = Array[int]((1, 2, 3, 4, 5))
>>> array
Array[int]((1, 2, 3, 4, 5))

Value Types

.NET has both reference types and value types, where objects like numbers, structs, booleans, bytes, dates (and so on) are all value types. Value types are allocated on the stack, or inline within an object, rather than allocated on the heap. They are also passed by value rather than by reference, meaning that when you pass them as arguments to a function or assign them to variables they are copied rather than references being created.

Python, at least CPython, doesn't draw a distinction between value and reference types. All names are references to objects, effectively reference types. A Python programmer will expect all objects to behave as reference types. This rarely causes problems, but the following behavior would surprise a Python programmer:

>>> from System import Array
>>> from System.Drawing import Point
>>> point = Point(0, 0)
>>> array = Array[Point]((point,))
>>> array[0].X = 30
>>> array[0].X
0

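Point is a struct, which is a value type. When IronPython accesses the first element of the array the struct is copied, and the X co-ordinate is updated on the copy. When the element is fetched a second time a new copy is made, with the original value rather than the value that was set. C# has a similar pitfall with value types returned from properties and indexers (for example on a List<Point>), although there the compiler rejects the assignment instead of silently discarding it.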
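
The copy semantics can be modelled in pure Python, which also shows the usual workaround: fetch a copy, modify it, and store it back. This is a sketch only; the Point and ValueArray classes are hypothetical stand-ins rather than the real .NET types, and it is an assumption that the same fetch, modify, store-back pattern is what you want with a real Array[Point]:

```python
import copy


class Point(object):
    """Hypothetical stand-in for the System.Drawing.Point struct."""

    def __init__(self, x, y):
        self.X = x
        self.Y = y


class ValueArray(object):
    """Hypothetical container that hands out copies, like an array of structs."""

    def __init__(self, items):
        self._items = [copy.copy(item) for item in items]

    def __getitem__(self, index):
        # Reading an element returns a copy, never the stored value itself
        return copy.copy(self._items[index])

    def __setitem__(self, index, value):
        # Writing an element stores a copy of the new value
        self._items[index] = copy.copy(value)


array = ValueArray([Point(0, 0)])

array[0].X = 30        # modifies a temporary copy, which is then discarded
first = array[0].X     # still the original value

point = array[0]       # fetch a copy
point.X = 30           # modify the copy
array[0] = point       # store it back into the array
second = array[0].X    # now reflects the change

print('%s %s' % (first, second))
```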

Delegates and CallTarget0

As we saw with Windows Forms, IronPython will usually convert Python functions to delegates where needed. This doesn't always work, and then we need to do it ourselves. IronPython provides a useful delegate for us to wrap functions in: CallTarget0. One place we need to use this is when invoking onto a Windows Forms or WPF UI thread from a background thread:

import clr
clr.AddReference('IronPython')
from IronPython.Compiler import CallTarget0

delegate = CallTarget0(lambda: UpdateProgressBar(30))
form.Invoke(delegate)

System.Func and System.Action

If you are using .NET 3.5 or more recent then there are built-in delegates available to do the same job as CallTarget0. These are Func (used for functions that return a value) and Action (used for .NET methods that have a void signature).

These delegates live in the System namespace but you must add a reference to the System.Core assembly before you can import them. If you need compatibility with .NET 2 it is easier to stick with CallTarget0.

import clr
clr.AddReference('System.Core')
from System import Action, Func

function = Func[object, object](targetCallableHere)
action = Action[object](anotherCallable)

out and ref parameters

Where you need to return multiple values in Python you typically do it by returning a tuple. In C# you would normally use an out parameter. A method that takes an out parameter effectively modifies the parameter in the scope calling the method. This doesn't fit with the way Python treats variables and would modify Python semantics if implemented in the same way as C#. Instead out parameters are simply treated as an extra return value.

From C#:

string value;
bool success;

success = SomeMethod(out value);

From IronPython:

success, value = self.SomeMethod()
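
This mirrors how a Python function would return the same information in the first place. A minimal sketch, where try_parse_int is a hypothetical Python counterpart of a C# method with the signature bool TryParse(string, out int):

```python
def try_parse_int(text):
    """Return (success, value) instead of writing to an out parameter."""
    try:
        return True, int(text)
    except ValueError:
        return False, 0


# The caller unpacks the tuple, just as with an out parameter in IronPython
success, value = try_parse_int('42')
print('%s %s' % (success, value))
```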

Similar to out parameters are ref parameters. These need to be initialised with a value, so they can't just be returned as additional values. Instead we create a reference using clr.Reference.

From C#:

int value = 6;

SomeMethod(ref value);

From IronPython:

import clr

# clr.Reference is generic, so index it with the type of the value it holds
reference = clr.Reference[int](6)
self.SomeMethod(reference)

updated_value = reference.Value

Implicit conversion

Some datatypes allow implicit conversion to another type as a way of reducing the need for casts. You allow this in your own types, in C#, by defining the implicit operator:

using System;
struct Digit
{
   byte value;

   public Digit(byte value)
   {
      if (value > 9) throw new ArgumentException();
      this.value = value;
   }

   // define implicit Digit-to-byte conversion operator:
   public static implicit operator byte(Digit d)
   {
      Console.WriteLine( "conversion occurred" );
      return d.value;
   }
}

In C# you can just assign a Digit to a variable declared as a byte and the conversion happens, with no need for a cast:

Digit d = new Digit(3);

// implicit (no cast) conversion from Digit to byte
byte b = d;

In IronPython we have no type declarations so we can't trigger the implicit conversion. Instead we can use clr.Convert to do the conversion:

>>> import clr
>>> clr.AddReference('Digit')
>>> from Digit import Digit
>>> from System import Byte
>>>
>>> d = Digit(2)
>>> clr.Convert(d, Byte)
conversion occurred
<System.Byte object at 0x0...2C [2]>

clr.Convert takes an object and the type you want to convert it to. It is new in IronPython 2.6.

The IronPython Equivalent of typeof

If you're converting code from C# to IronPython you may come across example code that uses the C# typeof operator. This is effectively a compiler directive (bound at compile time) rather than causing anything to happen at runtime.

For example, if you were using the Reflection.Emit APIs to emit IL you might need to translate the following line from C# to IronPython:

generator.Emit(OpCodes.Newarr, typeof(string));

In Python types are first class objects, but if we try and use the .NET types directly then .NET code will actually see an IronPython reflected type rather than the System.Type object it expects. There are actually several ways of performing an operation equivalent to typeof in IronPython, but a convenient way is to use clr.GetClrType passing in the type object.

Here's an example using the Python type str, which of course is System.String (and either would do - they are different ways of spelling the same thing in IronPython):

>>> import clr
>>> clr.GetClrType(str)
<System.RuntimeType object at 0x0...02B [System.String]>

Interfaces

Interfaces are a way of specifying that certain types have known behavior. An interface defines functionality, and the methods and properties for that functionality.

Where a .NET class implements an interface, that interface will be added to the method resolution order when you use it from Python, effectively making the interface behave like a base class. Incidentally, you can also implement an interface in a Python class by inheriting from it.

Even if an explicitly implemented interface method is marked as private on a class, you’ll still be able to call it.

As well as calling the method directly, you can call the method on the interface, passing in the instance as the first argument in the same way you pass the instance (self) as the first argument when calling an unbound method on a class.

Note

You can also call methods on the base classes of .NET classes, by passing in the instance as the first argument. This can also be useful when a method obscures a method on a base class making it impossible to call directly from the instance.
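
In plain Python terms, the calling convention described above looks like this. The sketch uses two hypothetical classes, Base and Derived, where the subclass hides a method of the same name on its base class:

```python
class Base(object):

    def describe(self):
        return 'Base.describe'


class Derived(Base):

    def describe(self):
        # Hides Base.describe when called on an instance
        return 'Derived.describe'


instance = Derived()

# Normal call: resolves to the most derived implementation
print(instance.describe())

# Call through the base class, passing the instance explicitly as self
print(Base.describe(instance))
```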

The following snippet shows an example of calling BeginInit on a DataGridView, through the ISupportInitialize interface that it implements:

>>> import clr
>>> clr.AddReference('System.Windows.Forms')
>>> from System.Windows.Forms import DataGridView
>>> from System.ComponentModel import ISupportInitialize
>>> grid = DataGridView()
>>> ISupportInitialize.BeginInit(grid)

How is this useful? Well, it’s possible for a class to implement two interfaces with conflicting member names. This technique allows you to call a specific interface method.

This workaround is fine for methods, but you don’t normally pass in arguments to invoke properties. Instead, you can use GetValue and SetValue on the interface property descriptor.

ISomeInterface.PropertyName.GetValue(instance)
ISomeInterface.PropertyName.SetValue(instance, value)

Note

GetValue and SetValue aren’t available only for property descriptors; they’re also available for instance fields if you access them directly on the class in IronPython.

Events and Memory Leaks

Something to be aware of when working with IronPython is how it stores event handlers. When you hook up a Python function or method as an event handler, IronPython creates a delegate which becomes the real event handler. So that you can later unhook the event with the same function IronPython keeps a reference to the original function and the delegate it has created. Any object that this function keeps alive through its closure (its local variables and everything they can reach) will be ineligible for garbage collection until the event is unhooked.

If the event handler function is a method then its self parameter will keep the parent object, and anything it references, alive.

This can be a cause of memory leaks. If your application creates and destroys GUI components, for example, then merely disposing of the GUI elements may not be enough to free the memory associated with them. IronPython will still have references to the event handler even if it is a method on a subclass of a GUI component that has been disposed.

This problem has become less severe in recent versions of IronPython. IronPython always tries to keep weak references where possible (but if you use lambda functions this reference may be the only thing keeping the lambda alive), but our experiences at Resolver Systems indicate that it can still be a problem. Unfortunately memory leaks can be hard to diagnose (as with many problems the first step is to recognise that you have a problem...).

The solution is to unhook the events when you dispose of the object the events are on. An article on debugging memory leaks in IronPython by Kamil Dworakowski is a good reference on using WinDbg and the SOS extensions to track down memory leaks in IronPython.
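
The shape of the problem, and of the fix, can be reproduced in pure Python. The sketch below is only a model: Event, Button and Panel are hypothetical classes, the event holds strong references to its handlers (the worst case), and a weak reference is used to observe whether the panel is still alive. It assumes the reference-counting behaviour of CPython; on .NET the timing of collection differs, but what keeps the object reachable is the same.

```python
import gc
import weakref


class Event(object):
    """Hypothetical event that holds strong references to its handlers."""

    def __init__(self):
        self._handlers = []

    def __iadd__(self, handler):
        self._handlers.append(handler)
        return self

    def __isub__(self, handler):
        self._handlers.remove(handler)
        return self


class Button(object):
    """Hypothetical long-lived control exposing a Click event."""

    def __init__(self):
        self.Click = Event()


class Panel(object):
    """Hypothetical short-lived component that handles the button's event."""

    def __init__(self, button):
        self.button = button
        button.Click += self.on_click

    def on_click(self):
        pass

    def dispose(self):
        # Unhook the handler so the event no longer keeps this object alive
        self.button.Click -= self.on_click


button = Button()
panel = Panel(button)
watcher = weakref.ref(panel)

del panel
gc.collect()
leaked = watcher() is not None      # the bound method still references the panel

watcher().dispose()
gc.collect()
released = watcher() is None        # nothing references the panel any more

print('%s %s' % (leaked, released))
```

Dropping the last ordinary reference to the panel is not enough, because the handler stored on the event is a bound method whose self is the panel. Only after dispose unhooks the handler can the panel be collected.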

Static compilation of Python code

Python is a dynamic language. You can (if you really want to) do very odd things like dynamically adding new members to classes or even changing the inheritance hierarchy at runtime. You can also switch out the class of an instance at runtime by assigning to its __class__ attribute. These are not things you can do with C# classes, except perhaps in a limited way with extension methods (which are really static methods living somewhere else that the compiler pretends are new instance methods).
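
Two of these tricks are shown in the short sketch below, using a pair of hypothetical classes, Duck and Robot:

```python
class Duck(object):

    def speak(self):
        return 'Quack'


class Robot(object):

    def speak(self):
        return 'Beep'


duck = Duck()

# Add a new method to the class after it (and an instance) has been created
Duck.walk = lambda self: 'Waddle'
print(duck.walk())

# Swap the class of the existing instance at runtime
duck.__class__ = Robot
print(duck.speak())
```

The existing instance picks up the new walk method immediately, and after the assignment to __class__ the same object behaves as a Robot.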

In order to implement the Python semantics IronPython classes are not true .NET classes but are instances of PythonType. This has the added benefit of allowing the .NET garbage collector to collect Python classes, which it can't do with .NET classes.

A big downside to this is that you can't create Python classes in IronPython and then reference and instantiate them from C#. Instead interacting with IronPython, including getting classes and creating instances, has to be done through the DLR hosting API.

Something we can do however, is compile Python modules into assemblies that we can use from IronPython. This saves the initial compile time and you can also pre-JIT with ngen. A useful side-effect is that you can deploy binaries instead of source code.

Note

With IronPython 2.6 Python scripts are initially run in interpreted mode. This is because compiling code and then running it takes longer than simply interpreting it when the code only runs a few times. If functions are called repeatedly then 'adaptive compilation' kicks in and they are compiled in the background.

The clr module includes a function that can save executable assemblies: CompileModules. Its call signature is as follows:

clr.CompileModules(assemblyName, *filenames, **kwArgs)

This function compiles Python files into assemblies that you can add references to and import from in the normal way with IronPython. This feature allows you to package several Python modules as a single assembly. The following snippet of code takes two modules and outputs a single assembly:

import clr
clr.CompileModules("modules.dll", "module.py", "module2.py")

Having created this assembly, you can add a reference to it and then import the module and module2 namespaces from it.

import clr
clr.AddReference('modules.dll')
import module, module2

CompileModules also takes a keyword argument, mainModule, that allows you to specify the Python file that acts as the entry point for your application. This still outputs to a dll rather than an exe file, but it can be combined with a stub executable to create binary distributions of Python applications. Creating a stub executable can be automated with some fiddly use of the .NET Reflection.Emit API, but it’s far simpler to use the Pyc sample that comes with IronPython and does all the hard work for you. Pyc is a command-line tool that creates console or Windows executables from Python source files. It comes with good documentation, plus a command-line help switch. The basic usage pattern is as follows:

ipy.exe pyc.py /out:program.exe /main:main.py /target:winexe module.py module2.py

CompileSubclassTypes and GetSubclassedTypes

The clr module exposes another, related, API for compiling types to disk (new in IronPython 2.6). Again it doesn't allow you to use the assemblies it generates from C# but it has its uses.

When you subclass a .NET type IronPython will generate a real .NET type. This is essential for using these subclasses with .NET APIs, because instances of your Python subclasses need to be genuine instances of the .NET base-type. One type per .NET base-type is shared between all Python subclasses of the same type and lives in an in-memory AssemblyBuilder instance. Some APIs can't cope with in-memory types like this.

One example is the Gtk# user interface library that is part of Mono but also works on .NET. This is because it uses the Location of types, which in .NET throws a NotImplementedException for in-memory types (it works on Mono because it makes up a value for the Location). The workaround is to use CompileSubclassTypes and dump the subclasses to disk. If you reference the generated assemblies then the subclass will no longer be only in memory.

import clr
clr.AddReference('gtk-sharp')
import Gtk

class MyWindow(Gtk.Window):
    pass

clr.CompileSubclassTypes('mytypes', *clr.GetSubclassedTypes())

Once you have generated the assembly you use clr.AddReference to add a reference to the generated assembly (and remove the call to CompileSubclassTypes). This has the added benefit of improving startup time as IronPython no longer has to dynamically generate the subclasses.

.NET Attributes and the __clrtype__ metaclass

As IronPython types aren't true .NET types we can't apply .NET attributes to them. This is the biggest hole in the IronPython .NET integration. Python classes also can't be used in some types of .NET data-binding which also expect true .NET types.

IronPython 2.6 introduces a solution to these problems in the form of the __clrtype__ metaclass. It is called at class creation time (which in Python happens at runtime) and allows you to customize the .NET type that backs a Python class, allowing you to apply attributes. The __clrtype__ hook is a very low-level API that requires you to generate IL to create the .NET class. Fortunately Harry Pierson, the IronPython program manager, has written a tutorial series on __clrtype__ including building some higher level APIs on top of it.

Hopefully some of these will be included in the final release of IronPython 2.6. Here's an example of using his CustomAttributeBuilder to apply attributes to a type:

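import clr
clr.AddReference("System.Xml")
from System.Xml.Serialization import XmlRootAttribute
from System import ObsoleteAttribute, CLSCompliantAttribute
from System.Reflection.Emit import CustomAttributeBuilder

# ClrTypeMetaclass (used below) comes from Harry Pierson's __clrtype__ tutorial series

def make_cab(attrib_type, *args, **kwds):
    clrtype = clr.GetClrType(attrib_type)
    argtypes = tuple(map(lambda x:clr.GetClrType(type(x)), args))
    ci = clrtype.GetConstructor(argtypes)
    # keyword arguments (such as Namespace below) set named properties
    names = list(kwds)
    props = tuple(clrtype.GetProperty(name) for name in names)
    values = tuple(kwds[name] for name in names)
    return CustomAttributeBuilder(ci, args, props, values)

def cab_builder(attrib_type):
    return lambda *args, **kwds:make_cab(attrib_type, *args, **kwds)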

Obsolete = cab_builder(ObsoleteAttribute)
CLSCompliant = cab_builder(CLSCompliantAttribute)
XmlRoot = cab_builder(XmlRootAttribute)

class Product(object):
    __metaclass__ = ClrTypeMetaclass
    _clrnamespace = "DevHawk.IronPython.ClrTypeSeries"
    _clrclassattribs = [
        Obsolete("Warning Lark's Vomit"),
        CLSCompliant(False),
        XmlRoot("product", Namespace="http://samples.devhawk.net")
    ]

Further APIs will be built on top of __clrtype__ to simplify working with attributes.

What Next?

There are plenty of other places to turn for more information on IronPython. Here are a few suggestions:

Last edited Fri Nov 27 18:32:35 2009.
