Python Programming, news on the Voidspace Python Projects and all things techie.

Raising Arbitrary Objects as Exceptions (a Hack!)

In Python you can't raise arbitrary objects as exceptions with the raise statement, for obvious reasons.

From this introduction you should be able to work out what this code does:

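# Python 2 syntax
class MyException(Exception):
    def __new__(cls):
        # Instantiating MyException yields None instead of an instance
        return None

try:
    raise MyException
except MyException, e:
    # e is whatever __new__ returned
    print e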

That's right, the except statement catches None as an exception. This obscure code path was discovered by Dino Viehland as he had to implement it for IronPython!

What happens is that CPython has an optimization for raising exception classes rather than instances. It only instantiates if you actually catch the exception object; for many except blocks you only need to know the type and the exception never need be instantiated. If you override __new__ then you can return any arbitrary object at instantiation (including an instance of another exception type if you want) - which will be caught as MyException.
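The instantiation half of this trick can be seen in isolation. The following is a minimal sketch in Python 3 syntax; it only shows that a class with an overridden __new__ hands back whatever __new__ returns, and it does not rely on the lazy instantiation in raise described above (later Python versions are stricter about what raise will accept). The class name Sneaky is made up for illustration.

```python
class MyException(Exception):
    def __new__(cls):
        # The returned object is not an instance of cls, so __init__ is
        # skipped and the caller receives this object as-is.
        return None


class Sneaky(Exception):  # hypothetical name, for illustration only
    def __new__(cls):
        # "Instantiating" Sneaky actually builds a different exception type.
        return ValueError("not what you asked for")


none_result = MyException()
other_result = Sneaky()

print(none_result)                  # None
print(type(other_result).__name__)  # ValueError
```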

It is not recommended that you do this in production code...

In a similar vein, can you guess what exception this statement raises:

raise UnicodeDecodeError

That's right, a TypeError...
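The reason is that raising a class instantiates it with no arguments, and the UnicodeDecodeError constructor requires five (encoding, object, start, end, reason). A small sketch in Python 3 syntax, where the helper name raise_bare is made up for illustration:

```python
def raise_bare():
    # The class is instantiated with no arguments, which fails
    # before any UnicodeDecodeError ever exists.
    raise UnicodeDecodeError


try:
    raise_bare()
except TypeError as exc:
    caught = exc

# Supplying all five arguments builds the exception normally.
proper = UnicodeDecodeError("utf-8", b"\xff", 0, 1, "invalid start byte")

print(type(caught).__name__)
print(proper.encoding, proper.start, proper.end, proper.reason)
```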

Posted by Fuzzyman on 2008-03-21 23:30:11

Ironclad Overview

Ironclad is an MIT-licensed open source project by Resolver Systems. The goal of Ironclad is to allow you to seamlessly import C extension modules (for CPython) in IronPython.

The short version of how it works is that Ironclad creates a fake Python dll, with the C-API pointing to functions mainly implemented in C#. These create and manipulate IronPython objects.

Resolver Systems has had one pair programming on the project (led by one of our core developers, William Reade) for several months. The current status of the project is that, for simple modules, module initialisation works. In the 0.1 binary release you can call functions. The version in the SVN repository allows you to use classes defined in extensions (so long as they only call the parts of the C-API that we have implemented). The file type kind-of works - you can open and close files but not yet read from or write to them.


Code that runs on the .NET VM is called 'managed code'. It is managed by the virtual machine which provides garbage collection, security features and type verification etc. You can still call into code written in C (native code), but although this runs in the same process space it isn't running inside the .NET VM - so it is called 'unmanaged code'.

Here is a longer (but still brief) overview of Ironclad that I presented at the Python and .NET open space at PyCon.

We generate a stub-dll that looks superficially like Python25.dll. Prior to importing a CPython extension we load this dll into our process space and initialise it by passing in a pair of function pointers (that point to managed code). The dll initialisation function calls these repeatedly passing in symbol names (CPython's exported symbols - the CPython API). This creates the CPython API but with these functions pointing to managed code (delegates) that actually provides the implementation (in C#).
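As a rough illustration of that initialisation handshake, here is a toy model in plain Python. It is not Ironclad's code: the names ToyStub, make_resolver and toy_init_module are invented, the symbol list is a tiny hypothetical subset, and it models a single resolver callback where the real stub takes a pair of function pointers.

```python
# Hypothetical subset of the symbol names the stub asks about.
EXPORTED_SYMBOLS = ["Py_InitModule4", "PyModule_AddObject", "PyArg_ParseTuple"]


class ToyStub:
    """Stands in for the stub dll: a table mapping symbol names to callables."""

    def __init__(self):
        self.table = {}

    def init(self, resolve):
        # Call the host-supplied resolver once per exported symbol.
        for name in EXPORTED_SYMBOLS:
            self.table[name] = resolve(name)


def make_resolver(implementations):
    """Build a resolver that falls back to a 'not implemented' stand-in."""
    def resolve(name):
        if name in implementations:
            return implementations[name]

        def missing(*args):
            raise NotImplementedError(name)
        return missing
    return resolve


def toy_init_module(name, methods):
    # Hypothetical "managed" implementation: just records the module name.
    return "module:" + name


stub = ToyStub()
stub.init(make_resolver({"Py_InitModule4": toy_init_module}))
result = stub.table["Py_InitModule4"]("bz2", [])
```

Every exported name ends up with some callable behind it, so an extension that touches an unimplemented part of the API fails with a clear error rather than jumping to a null pointer.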

You can now load C extension modules ('.pyd') by creating a 'PydImporter' and calling its load method with the path to the binary on the filesystem. This loads the binary into the process space and calls the module initialisation function (e.g. initbz2 or initmultiarray). This initialisation function will call the C-API that we have already initialised.

Generally this will start off by calling 'Py_InitModule4'. One of the parameters to this function is a pointer to an array of PyMethodDef structs which defines the functions exported by the module we are loading. We generate IronPython code corresponding to these functions and execute it in a new module - to expose these functions to IronPython. These functions call delegates which point to the corresponding unmanaged code (the module!). We do a similar thing for the classes, which are added by calls to 'PyModule_AddObject'.
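The code-generation step can be sketched roughly as follows, in plain Python rather than IronPython. This is an assumption-laden toy: the method table is reduced to (name, docstring) pairs, and build_module, TEMPLATE, _dispatch and fake_dispatch are invented names standing in for the generated code and the delegates that call into the extension.

```python
import types

# Source template for one generated wrapper function.
TEMPLATE = (
    "def {name}(*args, **kwargs):\n"
    "    {doc!r}\n"
    "    return _dispatch({name!r}, args, kwargs)\n"
)


def build_module(module_name, method_table, dispatch):
    """Generate wrapper source for each table entry and run it in a new module."""
    module = types.ModuleType(module_name)
    # The generated wrappers look up _dispatch in the module's globals.
    module.__dict__["_dispatch"] = dispatch
    source = "\n".join(
        TEMPLATE.format(name=name, doc=doc) for name, doc in method_table
    )
    exec(source, module.__dict__)
    return module


def fake_dispatch(name, args, kwargs):
    # Stand-in for the delegate that would call the unmanaged function.
    if name == "compress":
        return b"compressed:" + args[0]
    raise NotImplementedError(name)


toy_bz2 = build_module(
    "toy_bz2",
    [("compress", "Compress data."), ("decompress", "Decompress data.")],
    fake_dispatch,
)
result = toy_bz2.compress(b"abc")
```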

Almost all of the hard work is done by one monster C# class that actually implements the Python C API (or the parts of it that we have got to) - the 'Python25Mapper' (which inherits from 'PythonMapper' which is autogenerated).

When we call a function from IronPython - we call the mapper's 'Store' method on both the args tuple and the kwargs dictionary - this creates pointers representing the original items, which are then passed into the delegate. Marshally magic happens and the unmanaged function is called with pointers to args (tuple) and kwargs (dict).

(Inevitably the first thing that a function does is call PyArg_ParseTuple[AndKeywords]. This was so complicated that we actually just lifted the CPython implementation rather than re-implementing in C#.)

Eventually it will return something. This something will usually have been created through the C-API (so we have control over what was created) - so the PythonMapper has a reference to it, mapping a pointer to a managed object. When this pointer is returned (from the delegate) we check for errors set on the PythonMapper, raising the exception if necessary. If not we can 'Retrieve' the managed object and return it from the function.

An interesting point is that we have to handle reference counting. When an object is stored for the first time, the mapper allocates some memory following the layout of the normal 'PyObject' and sets the refcount to 1. Subsequent calls to 'Store' for the same object will increment the ref-counter and return the same pointer. When the refcount drops to 0 (as a result of managed or unmanaged code) this memory can be deallocated because we know that the unmanaged code has no references to it.
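The bookkeeping described here can be modelled in a few lines of Python. This is only a sketch under stated assumptions: ToyMapper is an invented class, the "pointers" are fake integers rather than allocated memory, and the method names store, retrieve and decref merely echo the 'Store' and 'Retrieve' operations mentioned in the text.

```python
import itertools


class ToyMapper:
    """Maps objects to fake pointers and tracks a refcount per pointer."""

    def __init__(self):
        self._next_ptr = itertools.count(0x1000, 0x10)  # fake addresses
        self._ptr_by_id = {}   # id(obj) -> pointer
        self._obj_by_ptr = {}  # pointer -> obj (also keeps obj alive)
        self._refcount = {}    # pointer -> reference count

    def store(self, obj):
        ptr = self._ptr_by_id.get(id(obj))
        if ptr is None:
            # First time we see this object: "allocate" and start at 1.
            ptr = next(self._next_ptr)
            self._ptr_by_id[id(obj)] = ptr
            self._obj_by_ptr[ptr] = obj
            self._refcount[ptr] = 1
        else:
            # Already stored: same pointer, one more reference.
            self._refcount[ptr] += 1
        return ptr

    def retrieve(self, ptr):
        return self._obj_by_ptr[ptr]

    def refcount(self, ptr):
        return self._refcount.get(ptr, 0)

    def decref(self, ptr):
        self._refcount[ptr] -= 1
        if self._refcount[ptr] == 0:
            # Nothing references the pointer any more: "free" it.
            obj = self._obj_by_ptr.pop(ptr)
            del self._ptr_by_id[id(obj)]
            del self._refcount[ptr]


mapper = ToyMapper()
args = ("hello", 42)
first = mapper.store(args)
second = mapper.store(args)
same_pointer = first == second
count_after_two_stores = mapper.refcount(first)
retrieved = mapper.retrieve(first)
mapper.decref(first)
mapper.decref(first)
count_after_release = mapper.refcount(first)
```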

We need to handle strings and tuples slightly specially because C extensions need direct memory access to some of their items. We do this by converting these objects on the way in (and on the way out if necessary). (Lists will need to be handled very specially at some point - but not yet!!)

In a call to a module function, once we have retrieved the result from the result pointer it is safe for us to DecRef the args and kwargs and result pointers.

Types are complicated, but fundamentally the same!

BZ2 is our test case so far. compress and decompress work. The BZ2Compressor and BZ2Decompressor types both work. BZ2File (passing file stream descriptors from managed to unmanaged code) is proving fun... (You can open and close the files, just not read or write anything yet.) Numpy next...

We haven't done much with the GIL yet!

The stub-dll actually uses assembly language to write a table of function pointers. Most of these functions are not implemented in C (except a few like PyArg_ParseTuple[AndKeywords]) but in C#. The assembly is a chunk of code with a series of labels (the function names), each followed by a jump instruction (using a pointer to a managed delegate which calls into our C#). We do this in assembly language because gcc won't let us define C functions without a calling convention (it won't let us use __declspec(naked)).

So, lots of magic and lots of fun difficulties, but it is working so far! The binaries are only suitable for Windows and we haven't yet tried this project with Mono. William deliberately picked gcc as the compiler so that it could be ported to Mono though. There is a longer, more detailed, version of this overview in the code repository.

The business value for Resolver Systems in this project is that our customers will be able to use Scipy in Resolver One, our IronPython spreadsheet.

There are two interesting potential 're-uses' of Ironclad.

  • Some of the PyPy folk are interested in looking to see how much of the project they could reuse to allow you to use C-Extensions from PyPy.
  • By rewriting parts of the top layer (or even just embedding IronPython) it could allow you to use Python C-extensions from any .NET language.

Posted by Fuzzyman on 2008-03-21 20:03:07

Silverlight 2 Articles and Interpreter in the Browser Coming Soon

Sorry, not the shortest title in the world. I've just arrived home from PyCon with a world class case of jetlag and a blogging backlog.

I've nearly completed turning my talk on Silverlight 2 into a series of articles. I aim to post them later today.

On the day of my talk, Dino Viehland (IronPython developer) gave me some updated binaries allowing me to demo a prototype 'Interactive Interpreter in the Browser'. It wasn't much work to turn the prototype into an interactive interpreter running in an HTML textarea (better looking than the one I demoed):

Interactive Interpreter in the Browser

So far it runs on Safari (and probably Firefox but currently Firefox won't connect to localhost for me to test it), but the Javascript probably needs some attention for it to work on IE.

It behaves like the Python interactive interpreter - so you can only type (and delete) on the last line after the prompt. This magic is done with a Javascript 'onkeydown' handler that calls into Silverlight with the current cursor position. It cancels text edits except on the last line after the prompt. It also detects newlines (that IE sends as '\r' would you believe) and executes the current code in the interpreter (using the Python standard library code module).
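For the interpreter side, the standard library code module does most of the work. The sketch below assumes code.InteractiveConsole (the post does not say which class from the module is used) and a made-up helper, feed, that pushes one line at a time and captures anything printed, much as a textarea front end would after each newline.

```python
import code
import contextlib
import io

console = code.InteractiveConsole()


def feed(line):
    """Push one line of input; return (needs_more_input, captured_stdout)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        more = console.push(line)
    return more, buffer.getvalue()


# A complete statement executes immediately.
more_after_assign, _ = feed("x = 6 * 7")
more_after_print, printed = feed("print(x)")

# A compound statement keeps asking for input until a blank line.
more_after_def, _ = feed("def double(n):")
more_after_body, _ = feed("    return n * 2")
more_after_blank, _ = feed("")
_, printed_double = feed("print(double(x))")
```

The boolean returned by push is what tells the front end whether to show the normal prompt or a continuation prompt.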

It doesn't detect Ctrl-C, and it would need to run inside a thread to handle that anyway. I can't release this until the bugfixed IronPython binaries are released, but Dino is working on it.

This will be a great way of embedding a Python interpreter (that runs in the browser) into documentation and tutorials.

Posted by Fuzzyman on 2008-03-21 15:30:44
