Python Programming, news on the Voidspace Python Projects and all things techie.

Tests that fail one day every four years

Some code looks harmless but has hidden bugs lurking in its nether regions. Code that handles dates is notorious for this, and this being February 29th (the coders' Halloween), it's time for the bugs to come crawling out of the woodwork.

Some of our tests were constructing expected dates, one year from today, with the following code:

from datetime import datetime

now = datetime.utcnow()
then = datetime(now.year + 1, now.month, now.day)

Of course, if you run this code today it tries to construct a datetime for February 29th 2013, which fails because that date doesn't exist.
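You can see the failure directly (the exact error message may vary between Python versions):

```python
from datetime import datetime

try:
    datetime(2013, 2, 29)
except ValueError as e:
    # February 29th 2013 doesn't exist, so construction raises ValueError
    print('construction failed: %s' % e)
```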

When I posted this on Twitter a few people suggested that we should have used timedelta(days=365) instead. Again, this works most of the time - but if you want a date exactly one year from now it will fail in leap years when used before February 29th:

>>> from datetime import datetime, timedelta
>>> datetime(2012, 2, 27) + timedelta(days=365)
datetime.datetime(2013, 2, 26, 0, 0)

The correct fix is to use the wonderful dateutil module, in particular the dateutil.relativedelta.relativedelta:

>>> from datetime import datetime
>>> from dateutil.relativedelta import relativedelta
>>> datetime.utcnow() + relativedelta(years=1)
datetime.datetime(2013, 2, 28, 15, 20, 21, 546755)
>>> datetime(2012, 2, 27) + relativedelta(years=1)
datetime.datetime(2013, 2, 27, 0, 0)

And as another hint, always use datetime.utcnow() instead of datetime.now() to avoid horrible timezone nightmares (exactly which timezone are your servers in?).
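If pulling in dateutil isn't an option, a stdlib-only fallback is possible. This is my sketch (add_years is a hypothetical helper, not from the post), and it chooses to clamp February 29th down to the 28th, which matches relativedelta's behaviour:

```python
from datetime import datetime

def add_years(dt, years):
    """Add years to a datetime, clamping Feb 29 to Feb 28 when needed."""
    try:
        return dt.replace(year=dt.year + years)
    except ValueError:
        # dt was February 29th and the target year isn't a leap year
        return dt.replace(year=dt.year + years, day=28)

print(add_years(datetime(2012, 2, 29), 1))  # 2013-02-28 00:00:00
```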

Posted by Fuzzyman on 2012-02-29 15:24:39


Callable object with state using generators

It's often convenient to create callable objects that maintain some kind of state. In Python we can do this with objects that implement the __call__ method and store the state as instance attributes. Here's the canonical example with a counter:

>>> class Counter(object):
...     def __init__(self, start):
...         self.value = start
...     def __call__(self):
...         value = self.value
...         self.value += 1
...         return value
...
>>> counter = Counter(0)
>>> counter()
0
>>> counter()
1
>>> counter()
2

Generators can be seen as objects that implement the iteration protocol, but maintain state within the generator function instead of as instance attributes. This makes them much simpler to write, and much easier to read, than manually implementing the iteration protocol.

This example iterator never terminates, so we obtain values by manually calling next:

>>> class Iter(object):
...     def __init__(self, start):
...         self.value = start
...     def __iter__(self):
...         return self
...     def next(self):
...         value = self.value
...         self.value += 1
...         return value
...
>>> counter = Iter(0)
>>> next(counter)
0
>>> next(counter)
1
>>> next(counter)
2

The same iterator implemented as a generator is much simpler and the state is stored as local variables in the generator:

>>> def generator(start):
...     value = start
...     while True:
...         yield value
...         value += 1
...
>>> gen = generator(0)
>>> next(gen)
0
>>> next(gen)
1
>>> next(gen)
2

In recent versions of Python, generators were enhanced with a send method that enables them to act like coroutines.

>>> def echo():
...     result = None
...     while True:
...         result = (yield result)
...
>>> f = echo()
>>> next(f)  # initialise generator
>>> f.send('fish')
'fish'
>>> f.send('eggs')
'eggs'
>>> f.send('ham')
'ham'

(Note that we can't send to an unstarted generator - hence the first call to next to initialise the generator.)
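A common convenience here (a standard coroutine idiom, not something from this post) is a decorator that does that initial next for you:

```python
def coroutine(func):
    """Decorator that primes a generator so it's ready to receive send()."""
    def wrapper(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # advance to the first yield
        return gen
    return wrapper

@coroutine
def echo():
    result = None
    while True:
        result = (yield result)

f = echo()
print(f.send('fish'))  # 'fish' - no manual initialisation needed
```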

We can use the send method as a way of providing a callable object with state. I first saw this trick in this recipe for a highly optimized lru cache by Raymond Hettinger. The callable object is the send method itself, and as with any generator the state is stored as local variables.

Here's our counter as a generator:

>>> def counter(start):
...     yield None
...     value = start
...     while True:
...         ignored = yield value
...         value += 1
...
>>> gen = counter(0)
>>> next(gen)
>>> f = gen.send
>>> f(None)
0
>>> f(None)
1
>>> f(None)
2
>>> f(None)
3

Some observations. Firstly, send takes one argument and one argument only. In this example we're ignoring the value sent into the generator, but send must be called with exactly one argument and can't be called with more. So it's mostly useful for callable objects that take a single argument...
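Here's a sketch of mine (not from the recipe) of a send-based callable that actually uses its single argument - a running average:

```python
def averager():
    # running average maintained entirely in local variables
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average
        total += value
        count += 1
        average = total / count

avg_gen = averager()
next(avg_gen)        # prime the generator
average = avg_gen.send
print(average(10))   # 10.0
print(average(30))   # 20.0
```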

Secondly, this performs very well. Function calls are expensive (relatively) in Python because each invocation creates a new frame object (or reuses a zombie frame from the pool - but I digress) for storing the local variables etc. Generators are implemented with a "trick" that keeps the frame object alive, so that the next step of the generator can simply continue execution after the last yield. So our callable object implemented as a generator doesn't have the overhead of a normal function call...
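A rough way to check that claim (a sketch only - absolute timings will vary by machine and Python version, so no numbers are asserted):

```python
import timeit
from functools import partial

class Counter(object):
    """The __call__ version: state lives in an instance attribute."""
    def __init__(self, start):
        self.value = start
    def __call__(self):
        value = self.value
        self.value += 1
        return value

def counter(start):
    """The generator version: state lives in local variables."""
    value = start
    while True:
        yield value
        value += 1

calls = timeit.timeit(Counter(0), number=100000)
gen = counter(0)
sends = timeit.timeit(partial(next, gen), number=100000)
print("__call__: %.3fs  generator: %.3fs" % (calls, sends))
```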

The main advantage this approach has is that it's more readable than the version with __call__. To make it more pleasant to use, we can wrap creating our counter in a convenience function:

>>> def get_counter(start):
...     c = counter(start)
...     next(c)
...     return c.send
...
>>> c = get_counter(0)
>>> c(None)
0
>>> c(None)
1
>>> c(None)
2

Posted by Fuzzyman on 2012-01-22 15:05:46


Simple mocking of open as a context manager

Using open as a context manager is a great way to ensure your file handles are closed properly and is becoming common:

with open('/some/path', 'w') as f:
    f.write('something')

The issue is that even if you mock out the call to open it is the returned object that is used as a context manager (and has __enter__ and __exit__ called).

Using MagicMock from the mock library, we can mock out context managers very simply. However, mocking open is fiddly enough that a helper function is useful. Here mock_open creates and configures a MagicMock that behaves as a file context manager.

from mock import inPy3k, MagicMock

if inPy3k:
    file_spec = ['_CHUNK_SIZE', '__enter__', '__eq__', '__exit__',
        '__format__', '__ge__', '__gt__', '__hash__', '__iter__', '__le__',
        '__lt__', '__ne__', '__next__', '__repr__', '__str__',
        '_checkClosed', '_checkReadable', '_checkSeekable',
        '_checkWritable', 'buffer', 'close', 'closed', 'detach',
        'encoding', 'errors', 'fileno', 'flush', 'isatty',
        'line_buffering', 'mode', 'name',
        'newlines', 'peek', 'raw', 'read', 'read1', 'readable',
        'readinto', 'readline', 'readlines', 'seek', 'seekable', 'tell',
        'truncate', 'writable', 'write', 'writelines']
else:
    file_spec = file

def mock_open(mock=None, data=None):
    if mock is None:
        mock = MagicMock(spec=file_spec)

    handle = MagicMock(spec=file_spec)
    handle.write.return_value = None
    if data is None:
        handle.__enter__.return_value = handle
    else:
        handle.__enter__.return_value = data
    mock.return_value = handle
    return mock

>>> from mock import patch
>>> m = mock_open()
>>> with patch('__main__.open', m, create=True):
...     with open('foo', 'w') as h:
...         h.write('some stuff')
...
>>> m.assert_called_once_with('foo', 'w')
>>> m.mock_calls
[call('foo', 'w'),
 call().__enter__(),
 call().write('some stuff'),
 call().__exit__(None, None, None)]
>>> handle = m()
>>> handle.write.assert_called_once_with('some stuff')

And for reading files, using a StringIO to represent the file handle:

>>> from StringIO import StringIO
>>> m = mock_open(data=StringIO('foo bar baz'))
>>> with patch('__main__.open', m, create=True):
...     with open('foo') as h:
...         result = h.read()
...
>>> m.assert_called_once_with('foo')
>>> assert result == 'foo bar baz'

Note that the StringIO will only be used for the data if open is used as a context manager. If you just configure and use mocks they will work whichever way open is used.

This helper function will be built into mock 0.9.
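(It did indeed end up in the library: in modern Python the helper ships in the standard library as unittest.mock.mock_open, where the read data is configured directly. Roughly the equivalent of the reading example above:)

```python
from unittest.mock import mock_open, patch

# read_data configures what the file handle's read() returns
m = mock_open(read_data='foo bar baz')
with patch('builtins.open', m):
    with open('foo') as h:
        result = h.read()

m.assert_called_once_with('foo')
print(result)  # foo bar baz
```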

Posted by Fuzzyman on 2012-01-13 12:18:35


Mocks with some attributes not present

Mock objects, from the mock library, create attributes on demand. This allows them to pretend to be objects of any type.

What mock isn't so good at is pretending not to have attributes. You may want a mock object to return False to a hasattr call, or raise an AttributeError when an attribute is fetched. You can do this by providing an object as a spec for a mock, but that isn't always convenient.

Below is a subclass of Mock that allows you to "block" attributes by deleting them. Once deleted, accessing an attribute will raise an AttributeError.

from mock import Mock

deleted = object()
missing = object()

class DeletingMock(Mock):
    def __delattr__(self, attr):
        if attr in self.__dict__:
            return super(DeletingMock, self).__delattr__(attr)
        obj = self._mock_children.get(attr, missing)
        if obj is deleted:
            raise AttributeError(attr)
        if obj is not missing:
            del self._mock_children[attr]
        self._mock_children[attr] = deleted

    def __getattr__(self, attr):
        result = super(DeletingMock, self).__getattr__(attr)
        if result is deleted:
            raise AttributeError(attr)
        return result

>>> mock = DeletingMock()
>>> hasattr(mock, 'm')
True
>>> del mock.m
>>> hasattr(mock, 'm')
False
>>> del mock.f
>>> mock.f
Traceback (most recent call last):
  ...
AttributeError: f

This functionality will probably be built into 0.9, or whichever version of mock comes after 0.8...
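(This did land: with unittest.mock in today's standard library, deleting attributes works out of the box, no subclass required:)

```python
from unittest.mock import Mock

mock = Mock()
assert hasattr(mock, 'm')
del mock.m
assert not hasattr(mock, 'm')
try:
    mock.m
except AttributeError as e:
    # the deleted attribute now raises on access
    print('AttributeError: %s' % e)
```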

Posted by Fuzzyman on 2012-01-12 12:33:15


Sphinx doctests and the execution namespace

I've finally started work on the documentation for the mock 0.8 release, and much of it involves converting the write-ups I did as blog entries.

The mock documentation is built with the excellent Sphinx (of course!) and as many as possible of the examples in the documentation are doctests, to ensure that the examples are still up to date for new releases.

doctests mimic executing your examples at the interactive interpreter, but they aren't exactly the same. One big difference is that the execution context for a doctest is not a real module, but a dictionary. This is particularly important for mock examples, because the following code will work at the interactive interpreter but not in a doctest:

>>> from mock import patch
>>> class Foo(object):
...      pass
...
>>> with patch('__main__.Foo') as mock_foo:
...   assert Foo is mock_foo
...

The name (__name__) of the doctest execution namespace is __builtin__, but this is a lie. The namespace is a dictionary, internal to the DocTest. Whilst executing doctests under Sphinx, the real __main__ module is the sphinx-build script.

To get the example code above working I either need to rewrite it (and make it less readable - probably by shoving the class object I'm patching out into a module), or I need to somehow make the current execution context into __main__.

Fortunately the Sphinx doctest extension provides the doctest_global_setup config option. This allows me to put a string into my conf.py, which will be executed before the doctests of every page in my documentation (doctests from each page share an execution context).

I solved the problem by creating a proxy object that delegates attribute access to the current globals() dictionary. I shove this into sys.modules as the __main__ module (remembering to store a reference to the real __main__ so that it doesn't get garbage collected). When patch accesses or changes an attribute on __main__ it actually uses the current execution context.

Here's the code from conf.py:

doctest_global_setup = """
import sys
import __main__

# keep a reference to __main__
__main = __main__

class ProxyModule(object):
    def __getattr__(self, name):
        return globals()[name]
    def __setattr__(self, name, value):
        globals()[name] = value
    def __delattr__(self, name):
        del globals()[name]

sys.modules['__main__'] = ProxyModule()
"""

doctest_global_cleanup = """
sys.modules['__main__'] = __main
"""

The corresponding doctest_global_cleanup option restores the real __main__ when the test completes.
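The delegation trick itself is easy to demo outside Sphinx. A toy sketch (mine, not the conf.py code - it skips the sys.modules swap and just shows the attribute delegation):

```python
class ProxyModule(object):
    """Pretends to be a module by delegating attribute access to globals()."""
    def __getattr__(self, name):
        try:
            return globals()[name]
        except KeyError:
            raise AttributeError(name)

pet = 'fish'
proxy = ProxyModule()
print(proxy.pet)  # fish
```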

Note

In the comments Nick Coghlan suggests a simplification for the ProxyModule:

class ProxyModule(object):
    def __init__(self):
        self.__dict__ = globals()

Posted by Fuzzyman on 2012-01-01 00:28:03


matplotlib and numpy for Python 2.7 on Mac OS X Lion

Unfortunately, due to an API change, the latest released version of matplotlib is incompatible with libpng 1.5. Take a wild guess as to which version comes with Mac OS X Lion. :-/

Fortunately this is fixed in the matplotlib repository. Here's how I got matplotlib working on Mac OS X Lion (with Python 2.7 - but these instructions should work fine for other versions of Python too).

First, matplotlib requires numpy. The latest version is 1.6.1, from here. The precompiled Mac OS X binaries are compiled to be compatible with Mac OS X 10.3 and up, which means they are 32-bit only. By default Python will run as 64-bit on OS X Lion, which means you'll see this when attempting to import numpy:

>>> import numpy
Traceback (most recent call last):
 ...
ImportError: dlopen(/.../site-packages/numpy/core/multiarray.so, 2): no suitable image found.  Did find:
        /.../site-packages/numpy/core/multiarray.so: no matching architecture in universal wrapper

You can get round this by launching Python as a 32-bit process. I have the following alias in my .bash_profile:

alias py32="arch -i386 python"

The next problem is the matplotlib one. This blog entry shows how to build matplotlib from the git repo, using homebrew. I don't want to use a homebrew-installed Python, so I modified the instructions to only install the dependencies with homebrew. I also set the correct flags to compile a 32-bit version of matplotlib to match the 32-bit numpy.

brew install pkg-config
brew install gfortran
cd matplotlib
export ARCHFLAGS="-arch i386"
py32 setup.py build
py32 setup.py install

And it appears to work. So far anyway:

>>> import pylab
>>> x = pylab.randn(10000)
>>> pylab.hist(x, 100)
>>> pylab.show()

Posted by Fuzzyman on 2011-09-05 00:18:13


Mock subclasses and their attributes

This blog entry is about creating subclasses of mock.Mock. mock is a library for testing in Python. It allows you to replace parts of your system under test with mock objects. The latest stable release is 0.7.2, which you can download from pypi.

There are various reasons why you might want to subclass Mock. One reason might be to add helper methods. Here's a silly example:

>>> from mock import MagicMock
>>> class MyMock(MagicMock):
...     def has_been_called(self):
...         return self.called
...
>>> mymock = MyMock(return_value=None)
>>> mymock.has_been_called()
False
>>> mymock()
>>> mymock.has_been_called()
True

The standard behaviour for Mock instances is that attributes and the return value are of the same type as the mock they are accessed on. This is so that Mock attributes are Mocks and MagicMock attributes are MagicMocks [1]. So if you're subclassing to add helper methods then they'll also be available on the attributes and return value mock of instances of your subclass.

>>> mymock.foo.has_been_called()
False
>>> mymock.foo()
<mock.MyMock object at 0x5747b0>
>>> mymock.foo.has_been_called()
True

Sometimes this is inconvenient. For example, one user is subclassing mock to create a Twisted adaptor. Having this applied to attributes too actually causes errors.

Mock (in all its flavours) uses a method called _get_child_mock to create these "sub-mocks" for attributes and return values. You can prevent your subclass being used for attributes by overriding this method. The signature is that it takes arbitrary keyword arguments (**kwargs) which are then passed onto the mock constructor:

>>> class Subclass(MagicMock):
...     def _get_child_mock(self, **kwargs):
...         return MagicMock(**kwargs)
...
>>> mymock = Subclass()
>>> mymock.foo
<MagicMock name='mock.foo' id='5696720'>
>>> assert isinstance(mymock, Subclass)
>>> assert not isinstance(mymock.foo, Subclass)
>>> assert not isinstance(mymock(), Subclass)

This works with mock 0.7 and 0.8 (and possibly earlier versions but they're no longer supported officially), and in 0.8 will be documented and tested so you can rely on it continuing to work.

[1]Mock 0.8 introduces a couple of new mock types that aren't callable. Their attributes are callable (otherwise non-callable mocks couldn't have methods), so they're an exception to this rule.

Posted by Fuzzyman on 2011-07-18 17:22:00


Mocking Generator Methods

Another mock recipe, this one for mocking generator methods.

A Python generator is a function or method that uses the yield statement to return a series of values when iterated over [1].

A generator method / function is called to return the generator object. It is the generator object that is then iterated over. The protocol method for iteration is __iter__, so we can mock this using a MagicMock.

Here's an example class with an "iter" method implemented as a generator:

>>> class Foo(object):
...     def iter(self):
...         for i in [1, 2, 3]:
...             yield i
...
>>> foo = Foo()
>>> list(foo.iter())
[1, 2, 3]

How would we mock this class, and in particular its "iter" method?

To configure the values returned from the iteration (implicit in the call to list), we need to configure the iterator returned by the call to foo.iter().

>>> from mock import MagicMock
>>> values = [1, 2, 3]
>>> mock_foo = MagicMock()
>>> iterable = mock_foo.iter.return_value
>>> iterator = iter(values)
>>> iterable.__iter__.return_value = iterator
>>> list(mock_foo.iter())
[1, 2, 3]

The above example is done step-by-step. The shorter version is:

>>> mock_foo = MagicMock()
>>> mock_foo.iter.return_value.__iter__.return_value = iter([1, 2, 3])
>>> list(mock_foo.iter())
[1, 2, 3]
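With unittest.mock in the standard library there's an even shorter spelling, because the configured return value doesn't have to be a mock at all - a real iterator works fine:

```python
from unittest.mock import MagicMock

mock_foo = MagicMock()
# return a real iterator instead of configuring __iter__ on a mock
mock_foo.iter.return_value = iter([1, 2, 3])
print(list(mock_foo.iter()))  # [1, 2, 3]
```

Note that a real iterator is exhausted after one pass; if the code under test iterates more than once, configure __iter__ as above or return a list instead.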

This is now also in the docs on the mock examples page. There's now quite a collection of useful mock recipes there, so even if you're an experienced mock user it is worth a browse.

[1]There are also generator expressions and more advanced uses of generators, but we aren't concerned about them here. A very good introduction to generators and how powerful they are is: Generator Tricks for Systems Programmers.

Posted by Fuzzyman on 2011-06-13 17:24:10


Another approach to mocking properties

mock is a library for testing in Python. It allows you to replace parts of your system under test with mock objects. The main feature of mock is that it's simple to use, but mock also makes possible more complex mocking scenarios.

This is my philosophy of API design as it happens: simple things should be simple but complex things should be possible.

Several of these more complex scenarios are shown in the further examples section of the documentation. I've just updated one of these, the example of mocking properties.

Properties in Python are descriptors. When they are fetched from the class of an object they trigger code that is then executed. The code that is executed is the method that you have wrapped as the property getter.

Note that there is a special rule for attribute lookup on certain types of descriptors, which include properties. Even if an instance attribute exists, the class attribute will still be used instead. This is an exception to the normal attribute lookup rule that instance attributes are fetched in preference to class attributes. This is important because it means that when you want to mock a property you have to do it on the class and can't simply stick an attribute onto an object.

If you're using mock 0.7, with its support for magic methods, we can patch the property name and add a __get__ method to our mock. The presence of the __get__ method makes it a descriptor, so we can use it as a property:

>>> from mock import Mock, patch
>>> class Foo(object):
...    @property
...    def fish(self):
...      return 'fish'
...
>>> with patch.object(Foo, 'fish') as mock_fish:
...   mock_fish.__get__ = Mock(return_value='mocked fish')
...   foo = Foo()
...   print foo.fish
...
mocked fish
>>> mock_fish.__get__.assert_called_with(mock_fish, foo, Foo)

In this example mock_fish replaces the property and the mock we put in place of __get__ becomes the mocked getter method. As we're patching on the class this affects all instances of Foo.

There's no point in using MagicMock for this. MagicMock normally makes using the Python protocol methods simpler by preconfiguring them. As you can see from the example above, mocking __get__ is supported but it isn't hooked up by default. It wouldn't be helpful if mocking any method on a class replaced it with a mock that acted as a descriptor, so if you want a mock to behave as a descriptor then you have to configure __get__, __set__ and __delete__ yourself.

Here's an alternative approach that works with all recent-ish versions of mock:

>>> from mock import Mock, patch
>>> class PropertyMock(Mock):
...   def __get__(self, instance, owner):
...     return self()
...
>>> prop_mock = PropertyMock()
>>> with patch.object(Foo, 'fish', prop_mock):
...   foo = Foo()
...   prop_mock.return_value = 'mocked fish'
...   print foo.fish
...
mocked fish
>>> prop_mock.assert_called_with()

As an added bonus, both of these examples work even if the Foo instance is created outside of the patch block. So long as the code using the property is executed whilst the patch is in place the attribute lookup will find our mocked version.
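(A variant of this PropertyMock eventually became part of the library itself; with unittest.mock the same test reads:)

```python
from unittest.mock import PropertyMock, patch

class Foo(object):
    @property
    def fish(self):
        return 'fish'

# new_callable=PropertyMock patches the property on the class
with patch.object(Foo, 'fish', new_callable=PropertyMock) as mock_fish:
    mock_fish.return_value = 'mocked fish'
    print(Foo().fish)  # mocked fish

mock_fish.assert_called_once_with()
```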

Posted by Fuzzyman on 2011-06-06 20:17:37


namedtuple and generating function signatures

Kristjan Valur, the chief Python developer at CCP games (creators of Eve Online), has posted an interesting blog entry about the use of exec in namedtuple.

namedtuple is a relatively recent, and extraordinarily useful, part of the Python standard library. It provides tuple subclasses with access through named fields instead of just by index.

Kristjan's blog entry is cool because of its opening words alone: In our port of Python 2.7 to the PS3 console... This is almost certainly related to the recently announced EVE Online FPS Console Game DUST 514.

As the blog entry goes on to point out, namedtuple is implemented by generating and exec'ing code for the classes it creates. I have a natural developer's distrust of exec, but as Raymond pointed out in a recent talk: execing code is not a security risk, execing untrusted code is. Whether or not you like this particular use of exec, it is a core language feature and I'm surprised that namedtuple was the only thing that broke when they removed it. (As namedtuple and a couple of additional uses discussed below demonstrate, exec is also a perfectly valid metaprogramming technique.)

All that aside, it is interesting how much of the core functionality of namedtuple (with lots of the bells and whistles missing) you can get in just 11 lines of Python:

from operator import itemgetter

def namedtuple2(name, names):
    def __new__(cls, *values):
        assert len(values) == len(names)
        return tuple.__new__(cls, values)
    def __repr__(self):
        return '%s%s' % (name, tuple.__repr__(self))
    attrs = {'__new__': __new__, '__repr__': __repr__}
    for index, _name in enumerate(names):
        attrs[_name] = property(itemgetter(index))
    return type(name, (tuple,), attrs)

>>> Name = namedtuple2('Name', 'one two three'.split())
>>> n = Name(1, 2, 3)
>>> n
Name(1, 2, 3)
>>> n.one
1
>>> n.two
2
>>> n.three
3
>>> n = Name(1, 2, 3, 4)
Traceback (most recent call last):
 ...
AssertionError

The most obviously missing functionality here is keyword argument support in both the object constructor and the repr. For a full implementation of namedtuple without using exec see the patch here: http://bugs.python.org/issue3974

I would marginally prefer a version that didn't use exec, but implementation maintainability is a much more important consideration and Raymond (who is the creator of namedtuple) feels that the current implementation is better from that point of view.

Clearly namedtuple can be implemented without the use of exec (or eval), however some of the functionality in the decorator module by Michele Simionato can't. The mocksignature functionality in the mock module suffers from the same problem and gets round it using the same technique as the decorator module.

What they're both doing is building functions with the same signature as another function, those "generated functions" then delegate to another function. Both the decorator module and mocksignature do this in order to provide a new function that has the same call signature as the original.

If you don't care about the call signature then there is an easy pattern:

def function(*args, **kwargs):
    return delegated_function(*args, **kwargs)

When function is called it calls delegated_function with exactly the same arguments as function was called with. The issue is that from an introspection point of view you have now lost the call signature. Generating the code (or an ast, but same difference) and then executing it seems to be the only way round this problem in Python. This does point out a weakness in the language, but I can't even imagine what the "missing language feature" should look like, so I don't have any solutions to offer.

The issue is that named arguments become local variables in the scope of the function. So to write code that uses those arguments you need to know the name of the arguments - which you only know at runtime. You could look them up with locals(), but that is very bad for other implementations (for both Python and IronPython accessing locals() will switch off JIT optimisations). Even if you could build a function with a runtime specified signature you wouldn't be able to provide generic code that uses those arguments (passes them onto the delegated function); that code has to be generated too.
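Worth noting that later Pythons soften this particular pain for introspection: functools.wraps records the original function as __wrapped__, and inspect.signature follows that attribute, so tools at least report the right signature even though the runtime function still accepts *args. A quick sketch:

```python
import functools
import inspect

def delegated_function(a, b, c=3):
    return (a, b, c)

@functools.wraps(delegated_function)
def function(*args, **kwargs):
    return delegated_function(*args, **kwargs)

# inspect.signature follows __wrapped__ back to the original
print(inspect.signature(function))  # (a, b, c=3)
```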

Posted by Fuzzyman on 2011-05-30 15:00:31


Nothing is Private: Python Closures (and ctypes)

As I'm sure you know, Python doesn't have a concept of private members. One trick that is sometimes used is to hide an object inside a Python closure, and provide a proxy object that only permits limited access to the original object.

Here's a simple example of a hide function that takes an object and returns a proxy. The proxy allows you to access any attribute of the original, but not to set or change any attributes.

def hide(obj):
    class Proxy(object):
        __slots__ = ()
        def __getattr__(self, name):
            return getattr(obj, name)
    return Proxy()

Here it is in action:

>>> class Foo(object):
...     def __init__(self, a, b):
...         self.a = a
...         self.b = b
...
>>> f = Foo(1, 2)
>>> p = hide(f)
>>> p.a, p.b
(1, 2)
>>> p.a = 3
Traceback (most recent call last):
  ...
AttributeError: 'Proxy' object has no attribute 'a'

After the hide function has returned the proxy object the __getattr__ method is able to access the original object through the closure. This is stored on the __getattr__ method as the func_closure attribute (Python 2) or the __closure__ attribute (Python 3). This is a "cell object" and you can access the contents of the cell using the cell_contents attribute:

>>> cell_obj = p.__getattr__.func_closure[0]
>>> cell_obj.cell_contents
<__main__.Foo object at 0x...>

This makes hide useless for actually preventing access to the original object. Anyone who wants access to it can just fish it out of the cell_contents.

What we can't do from pure Python is *set* the contents of the cell - but nothing is really private in Python, or at least not in CPython.

There are two Python C API functions, PyCell_Get and PyCell_Set, that provide access to the contents of closures. From ctypes we can call these functions and both introspect and modify values inside the cell object:

>>> import ctypes
>>> ctypes.pythonapi.PyCell_Get.restype = ctypes.py_object
>>> py_obj = ctypes.py_object(cell_obj)
>>> f2 = ctypes.pythonapi.PyCell_Get(py_obj)
>>> f2 is f
True
>>> new_py_obj = ctypes.py_object(Foo(5, 6))
>>> ctypes.pythonapi.PyCell_Set(py_obj, new_py_obj)
0
>>> p.a, p.b
(5, 6)

As you can see, after the call to PyCell_Set the proxy object is using the new object we put in the closure instead of the original. Using ctypes may seem like cheating, but it would only take a trivial amount of C code to do the same.

Two notes about this code.

  1. It isn't (of course) portable across different Python implementations
  2. Don't ever do this, it's for illustration purposes only!
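(In Python 3 the attribute is spelled __closure__ rather than func_closure, and since Python 3.7 cell_contents is writable directly, so the ctypes dance above is no longer even necessary. A Python 3 version of the whole trick:)

```python
def hide(obj):
    class Proxy(object):
        __slots__ = ()
        def __getattr__(self, name):
            return getattr(obj, name)
    return Proxy()

class Foo(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

p = hide(Foo(1, 2))
# __getattr__ closes over obj; fish it out of the closure cell
cell = type(p).__getattr__.__closure__[0]
print(cell.cell_contents.a)  # 1
cell.cell_contents = Foo(5, 6)  # writable since Python 3.7
print(p.a)  # 5
```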

Still, an interesting poke around the CPython internals with ctypes. Interestingly, I have heard of one potential use case for code like this. It is alleged that at some point Armin Ronacher was using a similar technique in Jinja2 for improving tracebacks. (Tracebacks from templating languages can be very tricky because the compiled Python code usually bears a quite distant relationship to the original text-based template.) Just because Armin does it doesn't mean you can though...

Posted by Fuzzyman on 2011-05-30 13:40:05

