Python Programming, news on the Voidspace Python Projects and all things techie.
Open Source Licensing and Contributions
There have been discussions about licensing and accepting contributions to open source projects on the Python-dev and Testing in Python mailing lists.
This is an area that can be very confusing, and potentially problematic for open source projects. Just because a project is licensed under a free software license doesn't automatically say anything about the status of code contributed to the project. The copyright of contributed code is owned by the person who wrote it and if you merge it with your project you create a derivative work owned jointly by all contributors. You can't license the work to others (or change the license) without the explicit permission of all those who own the copyright. The 'standard' ways round this are either to require all contributors to assign copyright to the project or to have all contributors sign an agreement licensing all their contributions to the project. The second approach is the one taken by the Python project.
This advice, applicable to Python itself, was posted to the Python-Dev mailing list by Martin von Loewis:
Van's advice is as follows:
There is no definite ruling on what constitutes "work" that is copyright-protected; estimates vary between 10 and 50 lines. Establishing a rule based on line limits is not supported by law. Formally, to be on the safe side, paperwork would be needed for any contribution (no matter how small); this is tedious and probably unnecessary, as the risk of somebody suing is small. Also, in that case, there would be a strong case for an implied license.
So his recommendation is to put the words
"By submitting a patch or bug report, you agree to license it under the Apache Software License, v. 2.0, and further agree that it may be relicensed as necessary for inclusion in Python or other downstream projects."
into the tracker; this should be sufficient for most cases. For committers, we should continue to require contributor forms.
Contributor forms can be electronic, but they need to name the parties, include a signature (including electronic), and include a company contribution agreement as necessary.
For more information on copyright and how it applies to open source development I highly recommend Van Lindberg's book, Intellectual Property and Open Source.
Posted by Fuzzyman on 2009-07-02 16:59:00
Categories: Python, General Programming Tags: open source, licenses
Exception handling and duck typing
Exceptions are one of the great features of high level languages that make coding less tedious. Instead of manually checking for possible errors and returning error codes we can use exceptions. There is a slightly tautological saying that exceptions should only be used for exceptional circumstances. This, according to Bruce Eckel, comes from the C++ days when exception handling became mainstream. The goal of C++ was for exception handling code to have no performance impact on your code when no exception is raised. This made actually raising exceptions much slower [1].
In fact the use of exception handling for flow control has been enshrined within Python through the use of the StopIteration exception to indicate that an iterator is exhausted. This is an implementation detail, but one that you need to be aware of if you write your own iterators or are manually advancing an iterator by calling next.
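As a rough illustration of the StopIteration protocol, here is a minimal sketch of advancing an iterator by hand, which is roughly what a for loop does for you. The names countdown and drain are made up for this example.

```python
def countdown(start):
    """Yield start, start - 1, ..., 1."""
    while start > 0:
        yield start
        start -= 1


def drain(iterator):
    """Advance an iterator manually and collect everything it produces."""
    items = []
    while True:
        try:
            items.append(next(iterator))
        except StopIteration:
            # Not an error: the iterator is simply exhausted.
            break
    return items


print(drain(countdown(3)))
```

If you call next yourself and forget to catch StopIteration, the exception propagates like any other.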
There is another idiom that is often advocated in Python; the use of exceptions for duck typing. Where you have a code path receiving an object that may be one of several types you often want to handle different objects in different ways. Rather than do strict type checking with isinstance it is more idiomatic in Python to base how you use objects on what operations they support - this is the essence of duck typing. In general there are two ways of implementing this: 'look before you leap' and 'it is better to ask forgiveness than permission'.
An example of 'look before you leap' is to check whether an object has the attribute / method you want to use:
    if hasattr(something, 'methodname'):
        something.methodname()
Or an alternative that only does the attribute lookup once:
    method = getattr(something, 'methodname', None)
    if method is not None:
        method()
The 'better to ask forgiveness' pattern uses exception handling instead:
    try:
        something.methodname()
    except AttributeError:
        pass
Exception handling can also be used for flow control, for example as a convenient way of breaking out of nested for loops:
    class Stop(Exception):
        pass

    try:
        for x in range(max_x):
            for y in range(max_y):
                if match(x, y):
                    raise Stop
    except Stop:
        # match found
        pass
    else:
        # match not found
        pass
For this use I prefer to use an inner function with an early return:
    def find_match():
        for x in range(max_x):
            for y in range(max_y):
                if match(x, y):
                    return x, y

    result = find_match()
    if result is None:
        # match not found
        pass
    else:
        x, y = result
I'm not a fan of abusing exception handling for flow control or even duck typing. The main reason for this is code readability. If you see exception handling code it is natural to assume that it is there to handle exceptions. If exceptions are also used for flow control or duck typing then seeing exception handling code tells you nothing about the intent of the code and you have to read it in detail to work out what it is for. As there are usually alternatives that are just as readable I tend to prefer to only use exception handling for exceptional circumstances unless there is a compelling reason.
In Python 2.6 / 3.X the new Abstract Base Classes permit isinstance to support duck typing. This new machinery provides mechanisms like the ABCMeta metaclass, the __subclasshook__ hook, and a large number of abstract base classes in the collections module that you can inherit from. These not only signal the available behaviour of your class (classes that inherit from Set support set operations) but also provide relevant methods if you implement a core subset. At last it will be possible to distinguish between mapping and sequence types implemented as pure-Python classes! [2]
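Here is a small sketch of what this looks like in practice. The Config and Playlist classes and the describe function are hypothetical; the import is written for modern Python 3 (where the ABCs live in collections.abc) with a fallback to the Python 2.6 location.

```python
try:
    from collections.abc import Mapping, Sequence  # Python 3.3+
except ImportError:
    from collections import Mapping, Sequence  # Python 2.6 / 2.7


class Config(Mapping):
    """Read-only mapping: only the three core methods are implemented."""
    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


class Playlist(Sequence):
    """Read-only sequence: __getitem__ and __len__ are enough."""
    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __getitem__(self, index):
        return self._tracks[index]

    def __len__(self):
        return len(self._tracks)


def describe(obj):
    """Tell mappings and sequences apart without checking concrete types."""
    if isinstance(obj, Mapping):
        return 'mapping'
    if isinstance(obj, Sequence):
        return 'sequence'
    return 'other'


print(describe(Config({'a': 1})), describe(Playlist(['x', 'y'])))
```

Both classes define __getitem__, yet isinstance can now tell them apart, and the base classes supply mixin methods such as get and index for free.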
| [1] | A design philosophy that is still true in the Microsoft .NET framework, which is the reason why IronPython is faster than CPython for try...except blocks where an exception is not raised and much slower than CPython for try...except blocks where exceptions are raised. |
| [2] | As the methods for indexing and deletion for both mapping and sequence types are __getitem__, __setitem__ and __delitem__ it has always been difficult to tell them apart. |
Posted by Fuzzyman on 2009-07-01 18:50:43
Categories: Python Tags: interfaces, exceptions, duck typing
Movable IDLE for Python 2.5 on Windows
Movable IDLE is a standalone version of the IDLE Python IDE. Movable IDLE is part of the Movable Python project and can be run (Windows only I'm afraid) from a USB memory stick and without installing Python. It comes with the full Python standard library.

I've built a new version. The only differences from the previous release are that it is now built with Python 2.5 and no longer displays an annoying dialog on load:
It's a while since I've built a distribution using the Movable Python codebase. It seems fine, but feel free to report any problems you encounter or bugs you find.
Posted by Fuzzyman on 2009-06-24 21:38:27
Categories: Python, Projects, Tools Tags: idle, Movable Python, release, Windows, IDE
The Python Object Model Revisited (data descriptors)
A few weeks ago I demonstrated the complexity of the Python object model by fetching docstrings from objects. A while after posting it I thought of a bug - or at least a way in which it could return the wrong result when looking up an attribute on an object. It will probably come as no surprise that this is due to the descriptor protocol.
Descriptors are special types of objects that have __get__ and/or __set__ and __delete__ methods and have special behaviour when fetched, set or deleted as object attributes. They are how methods, class methods, static methods, properties and __slots__ are implemented in Python.
Descriptors that have both __get__ and __set__ are called data descriptors (properties are the canonical example), descriptors with only __get__ are non-data descriptors (methods being the canonical example). Data descriptors have interesting behaviour when they are on a class which has the same member in the instance dictionary.
Instance members are stored in the __dict__ attribute of the object. Normally if this instance dictionary has a member then fetching that member will pull it out of the dictionary. The exception is that if the class has a data-descriptor with the same name then that will be invoked instead of the object in the instance dictionary. This is easy to demonstrate:
    >>> class A(object):
    ...     @property
    ...     def a(self):
    ...         return 'property'
    ...
    >>> a = A()
    >>> a.__dict__['a'] = 'attribute'
    >>> a.a
    'property'
So a data-descriptor on the class will override a member with the same name on the instance - but the 18 lines of code I wrote before for fetching docstrings from attributes will always look on the instance first.
The same is true for inherited data-descriptors:
    >>> class B(A):
    ...     pass
    ...
    >>> b = B()
    >>> b.__dict__['a'] = 'attribute'
    >>> b.a
    'property'
Non-data descriptors don't override instance attributes and data-descriptors on a base class don't override normal class attributes on a subclass.
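Both of those cases can be checked with a short sketch. The descriptor classes NonData and Data and the classes Base and Child are invented for this illustration.

```python
class NonData(object):
    """Descriptor with only __get__: a non-data descriptor."""
    def __get__(self, instance, owner):
        return 'non-data descriptor'


class Data(object):
    """Descriptor with __get__ and __set__: a data descriptor."""
    def __get__(self, instance, owner):
        return 'data descriptor'

    def __set__(self, instance, value):
        raise AttributeError('read only')


class Base(object):
    plain = NonData()
    guarded = Data()


class Child(Base):
    # A normal class attribute that shadows the data descriptor on Base.
    guarded = 'class attribute'


obj = Base()
obj.__dict__['plain'] = 'instance attribute'
obj.__dict__['guarded'] = 'instance attribute'

print(obj.plain)        # the instance dictionary wins over the non-data descriptor
print(obj.guarded)      # the data descriptor wins over the instance dictionary
print(Child().guarded)  # the subclass attribute hides the inherited descriptor
```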
To handle this we need to check both the instance and walk the inheritance hierarchy. If we find the member we are looking for in both then we check the member from the class for a __set__ method. If the member from the class (or one of its base classes) has a __set__ member then we return that - otherwise we return the member from the instance.
Our modified full code that takes this into account has grown to 22 lines and now looks like:
    import inspect
    import types

    def get_doc(obj, member):
        found = []
        if hasattr(obj, '__dict__') and member in obj.__dict__:
            found.append(obj.__dict__[member])
        if isinstance(obj, (type, types.ClassType)):
            search_order = inspect.getmro(obj)
        else:
            search_order = inspect.getmro(obj.__class__)
        for entry in search_order:
            if member in entry.__dict__:
                if hasattr(entry.__dict__[member], '__set__'):
                    return entry.__dict__[member].__doc__
                found.append(entry.__dict__[member])
        return found[0].__doc__

    def get_docstrings(obj):
        try:
            members = dir(obj)
        except Exception:
            members = []
        return [(member, get_doc(obj, member)) for member in members]
Note
In practice there is another exception that we haven't handled here. Although you can override methods with instance attributes (very useful for monkey patching methods for test purposes) you can't do this with the Python protocol methods. These are the 'magic methods' whose names begin and end with double underscores. When invoked by the Python interpreter they are looked up directly on the class and not on the instance (however if you look them up directly - e.g. x.__repr__ - normal attribute lookup rules apply).
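A quick sketch of the difference, using a made-up Thing class: patching a normal method on the instance works, while patching __repr__ on the instance only affects explicit lookups.

```python
class Thing(object):
    def greet(self):
        return 'greet from class'

    def __repr__(self):
        return 'repr from class'


thing = Thing()
# Monkey patch both a normal method and a protocol method on the instance.
thing.greet = lambda: 'greet from instance'
thing.__repr__ = lambda: 'repr from instance'

print(thing.greet())     # normal lookup finds the instance attribute
print(repr(thing))       # the interpreter looks __repr__ up on the class
print(thing.__repr__())  # explicit lookup follows the normal rules
```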
There is a corner case (that I alluded to in my previous post): classes can define __slots__ and create a dummy __dict__ member. If this member isn't a dictionary then our code will barf horribly - but really this is such an evil corner case that I'm not going to worry about it.
I have seen one use case for __slots__ in combination with a fake __dict__ member: proxying attribute access. This is a part of the werkzeug web framework - the LocalProxy class defines __dict__ as a property which returns the __dict__ member of the object it is proxying...
Posted by Fuzzyman on 2009-06-22 23:08:08
Categories: Python, Hacking Tags: descriptors, object model
discover: Test discovery for unittest backported to Python 2.4+
I kind of promised you no more entries on unittest for a while, but oh well.
I've backported the test discovery from Python trunk (what will become Python 2.7 and Python 3.2). Test discovery allows you to run all the unittest based tests (or just a subset of them) in your project without having to write your own test collection or running machinery. Once installed, test discovery can be invoked with python -m discover. I've tested the discover module with Python 2.4 and 3.0.
Most of the work of backporting was providing an implementation of os.path.relpath (added in Python 2.6) and refactoring the command line handling for standalone use.
The discover module also implements the load_tests protocol which allows you to customize test loading from modules and packages. Test discovery and load_tests are implemented in the DiscoveringTestLoader which can be used from your own test framework.
This is the test discovery mechanism and load_tests protocol for unittest backported from Python 2.7 to work with Python 2.4 or more recent (including Python 3).
discover can be installed with pip or easy_install. After installing switch the current directory to the top level directory of your project and run:
    python -m discover
    python discover.py
This will discover all tests (with certain restrictions) from the current directory. The discover module has several options to control its behavior (full usage options are displayed with python -m discover -h):
    Usage: discover.py [options]

    Options:
      -v, --verbose    Verbose output
      -s directory     Directory to start discovery ('.' default)
      -p pattern       Pattern to match test files ('test*.py' default)
      -t directory     Top level directory of project (default to
                       start directory)

    For test discovery all test modules must be importable from the top
    level directory of the project.
For example to use a different pattern for matching test modules run:
python -m discover -p '*test.py'
(Remember to put quotes around the test pattern or shells like bash will do shell expansion rather than passing the pattern through to discover.)
Test discovery is implemented in discover.DiscoveringTestLoader.discover. As well as using discover as a command line script you can import DiscoveringTestLoader, which is a subclass of unittest.TestLoader, and use it in your test framework.
This method finds and returns all test modules from the specified start directory, recursing into subdirectories to find them. Only test files that match pattern will be loaded. (Using shell style pattern matching.)
All test modules must be importable from the top level of the project. If the start directory is not the top level directory then the top level directory must be specified separately.
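As a sketch of programmatic use, the example below relies on the standard library unittest.TestLoader.discover (Python 2.7 and later), on the assumption that it takes the same start directory, pattern and top level directory arguments as the backported loader. The discover_and_run helper and the throwaway test module are invented for the example.

```python
import os
import tempfile
import textwrap
import unittest


def discover_and_run(start_dir, pattern='test*.py', top_level_dir=None):
    """Collect the tests under start_dir and run them, returning the result."""
    loader = unittest.TestLoader()
    suite = loader.discover(start_dir, pattern=pattern,
                            top_level_dir=top_level_dir)
    return unittest.TextTestRunner(verbosity=0).run(suite)


# Build a throwaway project directory containing a single test module.
project_dir = tempfile.mkdtemp()
with open(os.path.join(project_dir, 'test_sample_discovery.py'), 'w') as handle:
    handle.write(textwrap.dedent('''
        import unittest

        class SampleTest(unittest.TestCase):
            def test_addition(self):
                self.assertEqual(1 + 1, 2)
    '''))

result = discover_and_run(project_dir)
print(result.testsRun, result.wasSuccessful())
```

With the backport installed, swapping unittest.TestLoader for discover.DiscoveringTestLoader should be the only change needed, though I haven't verified that here.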
The load_tests protocol allows test modules and packages to customize how they are loaded. This is implemented in discover.DiscoveringTestLoader.loadTestsFromModule. If a test module defines a load_tests function then tests are loaded from the module by calling load_tests with three arguments: loader, standard_tests, None.
If a test package name (directory with __init__.py) matches the pattern then the package will be checked for a load_tests function. If this exists then it will be called with loader, tests, pattern.
If load_tests exists then discovery does not recurse into the package, load_tests is responsible for loading all tests in the package.
The pattern is deliberately not stored as a loader attribute so that packages can continue discovery themselves. top_level_dir is stored so load_tests does not need to pass this argument in to loader.discover().
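To make the module level protocol concrete, here is a minimal sketch using the standard library loader (Python 2.7 and later), which implements the same load_tests hook. The test classes and the in-memory module are stand-ins invented for the example; a real project would define load_tests in a test module on disk.

```python
import types
import unittest


class FastTest(unittest.TestCase):
    def test_quick(self):
        self.assertTrue(True)


class SlowTest(unittest.TestCase):
    def test_slow(self):
        self.assertTrue(True)


def load_tests(loader, standard_tests, pattern):
    """Ignore the default collection and load only the fast tests."""
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(FastTest))
    return suite


# Stand-in for a real test module (the module name is hypothetical).
module = types.ModuleType('test_example')
module.FastTest = FastTest
module.SlowTest = SlowTest
module.load_tests = load_tests

suite = unittest.TestLoader().loadTestsFromModule(module)
print(suite.countTestCases())
```

Without the load_tests function the loader would collect both test classes; with it, the module decides for itself what gets loaded.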
discover.py is maintained in a google code project (where bugs and feature requests should be posted): http://code.google.com/p/unittest-ext/
The latest development version of discover.py can be found at: http://code.google.com/p/unittest-ext/source/browse/trunk/discover.py
Posted by Fuzzyman on 2009-06-20 19:35:56
Categories: Python, Projects Tags: testing, unittest, discovery
Catching up: Pythonutils 0.4.0, akismet 0.2.0 and article updates
About two and a half years ago I started writing the book. During the next two years I received a steady trickle of feature requests, bug reports and patches for the various projects and articles I maintain (or pretend to maintain). For the most part I stuck these emails in an 'outstanding' folder promising to deal with them when the book was done (which at the time seemed like it would never happen).
IronPython in Action was properly finished at the beginning of the year (actual writing finished a while before that) and available in the shops nearly 3 months ago; I should be getting sales figures for the first quarter any day now. I still owe you a blog entry on the experience of writing a technical book for Manning. (Executive summary: it isn't such a hot idea whilst working full time and commuting four hours a day.)
Well, incredibly, I've been working through my backlog. I started with 138 emails in my outstanding folder - which included several duplicates and an almost entire Python-dev thread on unittest so not as many as it sounds - and now I'm down to 21 emails. Here are some of the things I've been working on:
akismet 0.2.0
akismet.py is a module for accessing the Akismet anti-spam web service. Useful for blogs or applications which accept user comments and want to check for spam.
This release (0.2.0) adds compatibility with Google AppEngine.
All changes in 0.2.0:
- If the data dictionary passed to comment_check doesn't have a 'blog' entry it will be added even if build_data is False. Thanks to Mark Walling.
- Fix for compatibility with Google AppEngine. Thanks to Matt King.
- Added a setup.py - install with pip or easy_install
pythonutils 0.4.0
This package is a collection of general Python utility modules, mostly old now. The old modules are not being actively developed and are in bugfix only maintenance mode. The main reason for me doing a new release is to stop people reporting the same bug in pathutils over and over again...
Articles
- IronPython Winforms Part 9: Background Worker Threads: code corrections pointed out by Davy Mitchell.
- Some time ago Python basic authentication handling changed to require the use of the protocol in URLs passed to HTTPPasswordMgrWithDefaultRealm().add_password. The article has been updated to reflect this.
- Addition of notes on the BadStatusLine and HTTPException exceptions.
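For the basic authentication change, a minimal sketch of registering credentials with the protocol included in the URL looks something like this. The URL, username and password are placeholders, and the import falls back to urllib2 on Python 2.

```python
try:
    from urllib.request import (  # Python 3
        HTTPBasicAuthHandler, HTTPPasswordMgrWithDefaultRealm, build_opener)
except ImportError:
    from urllib2 import (  # Python 2
        HTTPBasicAuthHandler, HTTPPasswordMgrWithDefaultRealm, build_opener)

# Placeholder URL and credentials, for illustration only.
top_level_url = 'http://example.com/protected/'

password_mgr = HTTPPasswordMgrWithDefaultRealm()
# None as the realm makes these the default credentials for the URL.
# Note the 'http://' protocol at the start of the URL.
password_mgr.add_password(None, top_level_url, 'username', 'password')

opener = build_opener(HTTPBasicAuthHandler(password_mgr))
# opener.open(top_level_url + 'page.html') would now answer a basic
# authentication challenge from the server with these credentials.
```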
Posted by Fuzzyman on 2009-06-20 18:37:59
Categories: Writing, Website, Python, Projects Tags: articles, release, pythonutils, akismet
Parametrized Tests and unittest
Yet another blog entry on unittest; this is the last one in my list so I'm not planning any more for a while. Something that both nose and py.test provide that unittest (the [1] Python standard library testing framework) doesn't is a builtin mechanism for writing parametrized tests. The technique that both nose and py.test use (currently anyway) is to allow your test methods to be generators that return a series of tests. The testing framework then runs all these tests for you.
Whenever I've needed to run a series of similar tests with different input parameters I've always used a simple loop; something like:
    def testSomething(self):
        for x in range(100):
            for y in range(100):
                self.assertSomethingForXandY(x, y)
The problem with this approach is that as soon as you have a failure for any of the x, y value combinations the test will stop running. In some circumstances it would be much better for all the tests to run and have all the failures reported instead of just the first.
With nose, you could write the above test like this:
    def testSomething(self):
        for x in range(100):
            for y in range(100):
                yield self.assertSomethingForXandY, x, y
Nose would detect that the test is a generator and collect the functions (along with their arguments) that it yields and run them independently. The disadvantage of this approach is that you can't know up front how many tests you have (and indeed it could change every time you run the tests) and neither are they isolated from each other (they share the fixture).
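To give a feel for the mechanism, here is a simplified sketch of collecting and running yielded tests. This is not nose's actual implementation; collect_calls, run_independently and the example checks are all invented for illustration.

```python
import inspect


def collect_calls(test_callable):
    """Expand a test callable into a list of (callable, args) pairs."""
    if inspect.isgeneratorfunction(test_callable):
        calls = []
        for yielded in test_callable():
            # Each yielded item has the form (callable, arg1, arg2, ...).
            calls.append((yielded[0], tuple(yielded[1:])))
        return calls
    return [(test_callable, ())]


def run_independently(calls):
    """Run every call, recording failures instead of stopping at the first."""
    failures = []
    for func, args in calls:
        try:
            func(*args)
        except AssertionError as exc:
            failures.append((args, exc))
    return failures


def check_even(x, y):
    assert (x + y) % 2 == 0, '%s + %s is odd' % (x, y)


def generate_sum_checks():
    for x in range(2):
        for y in range(2):
            yield check_even, x, y


calls = collect_calls(generate_sum_checks)
failures = run_independently(calls)
print(len(calls), len(failures))
```

Note that the number of calls is only known once the generator has been consumed, which is exactly the disadvantage described above.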
Although unittest doesn't include an equivalent it is easy to achieve the same thing and there are several possible approaches. The two I've come up with, prompted by another discussion on the Testing in Python mailing list and with Brandon Craig Rhodes, are available as params.py from my unittest-ext sandbox project (where I tinker with unittest related stuff from time to time). (Note - after showing you two possible approaches I'll show you some better ways that other people have found for solving the same problem.)
The first uses a metaclass in concert with a decorator. You decorate methods with a list of dictionaries - for every dictionary in the list the method will be called with the parameters from the dictionary. (It'll be easier to understand when I show you some code using it.) The metaclass examines all decorated methods at class creation time and adds new test methods to the class.
    import unittest
    from types import FunctionType

    class Paramaterizer(type):
        def __new__(meta, class_name, bases, attrs):
            # iterate over a copy as new test methods are added to attrs
            for name, item in list(attrs.items()):
                if not isinstance(item, FunctionType):
                    continue

                params = getattr(item, 'params', None)
                if params is None:
                    continue

                for index, args in enumerate(params):
                    # default arguments bind the current args and name
                    def test(self, args=args, name=name):
                        assertMethod = getattr(self, name)
                        assertMethod(**args)
                    test.__doc__ = """%s with args: %s""" % (name, args)
                    test_name = 'test_%s_%s' % (name, index + 1)
                    test.__name__ = test_name

                    if test_name in attrs:
                        raise Exception('Test class %s already has a method called: %s' %
                                        (class_name, test_name))
                    attrs[test_name] = test

            return type.__new__(meta, class_name, bases, attrs)

    def with_params(params):
        def inner(func):
            func.params = params
            return func
        return inner

    class TestCaseWithParams(unittest.TestCase):
        __metaclass__ = Paramaterizer
You don't need to use the metaclass directly, instead subclass TestCaseWithParams:
    class Test(TestCaseWithParams):

        @with_params([dict(a=1, b=2), dict(a=3, b=3), dict(a=5, b=4)])
        def assertEqualWithParams(self, a, b):
            self.assertEqual(a, b)

        @with_params([dict(a=1, b=0), dict(a=3, b=2)])
        def assertZeroDivisionWithParams(self, a, b):
            self.assertRaises(ZeroDivisionError, lambda: a/b)
The disadvantage of this approach is that you have to know (or calculate) all of the parameters at class creation time instead of when the test runs. The advantage is that the number of tests is known ahead of running the tests - so countTestCases on the TestSuite works as normal and each failure is recorded individually.
Another approach is to use the same generator technique as nose / py.test with a decorator that runs all the tests yielded by the generator.
    import sys

    def test_generator(func):
        def inner(self):
            failures = []
            errors = []
            for test, args in func(self):
                try:
                    test(*args)
                except self.failureException, e:
                    failures.append((test.__name__, args, e))
                except KeyboardInterrupt:
                    raise
                except:
                    # using sys.exc_info means we also catch string exceptions
                    e = sys.exc_info()[1]
                    errors.append((test.__name__, args, e))
            msg = '\n'.join('%s%s: %s: %s' % (name, args, e.__class__.__name__, e)
                            for (name, args, e) in failures + errors)
            if errors:
                raise Exception(msg)
            # only fail if at least one of the yielded tests failed
            if failures:
                raise self.failureException(msg)
        return inner
    class Test2(unittest.TestCase):

        @test_generator
        def testSomething(self):
            for a, b in ((1, 2), (3, 3), (5, 4)):
                yield self.assertEqual, (a, b)

            def raises():
                raise Exception('phooey')
            yield raises, ()
This is a bit less 'heavy' than using a metaclass. Decorated tests are all run to completion. If any test fails or errors then an appropriate failure is raised - with the message listing all the failures. It has the advantage of allowing tests to be created at test execution time, but the disadvantage of all failures only counting as a single failure. The total number of tests counted will only count generative tests as a single test. If you run the code above you'll see how errors are reported and it is ok (could do better - must try harder). It is also easy to use with any unittest based test framework.
Of course other people have come up with better ideas - which I may evaluate for integrating into unittest. They do still suffer from the problem of non-deterministic number of tests (breaking the countTestCases part of the unittest protocol) but this is unavoidable with this feature.
Konrad Delong posted one solution to his blog: Reporting assertions as a way for test parameterization. The code is here. He uses a decorator to collect the failures / errors and modifies TestCase.run to be aware of them. I like this technique.
Robert Collins has a different solution, which at the heart uses a similar technique but is more general and powerful. This is his testscenarios project. (Every time I try to actually find the code on a launchpad project I go round in circles for a while first. Anyway - it's here.) The project is described thusly:
testscenarios provides clean dependency injection for python unittest style tests. This can be used for interface testing (testing many implementations via a single test suite) or for classic dependency injection (provide tests with dependencies externally to the test code itself, allowing easy testing in different situations).
Instead of just individual tests it allows you to parameterize whole test cases - so you can do 'interface' testing where you swap out the backend implementation and check that all tests pass for various different backends.
The basic nose / py.test technique for generator tests is a dirty hack. They introspect the test method code objects to see which of them are generators. Holger Krekel, core developer of py.test, also thinks that they offer little real advantage over loops and is looking to replace them in py.test with a more powerful system. This uses pytest_generate_tests and he describes it in: Parametrizing Python tests, generalized.
This new system is more powerful, but it seems to make the simple cases more difficult. If Holger is right in that a generalized mechanism that only caters for the simple cases doesn't really have much advantage then this new system may indeed be a winner.
| [1] | Yes Zeth the testing framework. doctest is for testing documentation and makes an awful unit testing tool, especially for test first as practised in test driven development (TDD). Of course not everyone shares my opinions on this matter. |
Posted by Fuzzyman on 2009-06-15 15:38:42
Categories: Python, Hacking Tags: testing, generators, unittest, nose, py.test
Gadgets: Samsung SSD, Sharkoon SATA Adaptor, Mimo USB Monitor and Powermate USB Volume Knob(!)
Over the last few months I've bought a few new gadgets, and they're well overdue a review; so here goes.
As I'm sure you're aware Solid State Drives are hard drives using flash memory instead of mechanical disks; this eliminates the need for spin up, plus makes seek times and data rates potentially much faster and power consumption less. I wanted this for my Apple Macbook Pro, which only had a 120gig hard drive. Advantages for me would be a bigger hard drive, a faster hard drive, and through less heat / power a longer battery life as well.
Fitting it was a royal pain in the *ss. I followed the instructions from this article: Upgrade Your MacBook Pro's Hard Drive. They're pretty good, the only place I deviated from them was that once I got inside my Mac the bluetooth module wasn't on top of the hard drive I was removing. This was a good thing.
The hardest part was levering the keyboard top panel from off the innards. This really didn't want to come off, and it is attached by a ribbon cable to the motherboard so you can't be too violent in your attempts to pry it free. It came eventually. Scraping the ribbon cable that is glued to the top of the existing drive free is also slightly nerve-wracking.
Choosing an SSD is almost as painful as fitting one. The current crop of drives are the first that are within the realms of affordable (although still expensive), but many of them suffer from real performance issues once you have written a certain amount of data (random write access becomes far slower than even normal hard drives). This AnandTech Article is essential reading on the subject. It was written before the PB22 came out, and the conclusion it came to is that only the OCZ Vertex and the Intel X25-M are worth having. From what I've read the PB22 doesn't suffer the same problems that plague the earlier drives and it is cheaper than both the Vertex and the X25-M so I decided to go for it.
And as for performance, well. XBench reported (results here) an average of 3x faster than a standard Macbook Pro on all the drive benchmarks. The difference in general is noticeable but perhaps not overwhelming. The most striking change was in launching Microsoft Office for the Mac; it launched in about 3 seconds instead of 12! The disappointing thing is that starting my Windows VM (VMWare Fusion) is not much faster, although shutting it down is (which was already pretty fast). Even worse, booting my Mac up (something I don't do very often) - if you include the fifteen to twenty second freeze on start which arrived with the new drive - took about the same amount of time.
In the end, whilst trying to fix a different problem with another of my new gadgets I reset the PRAM on my Macbook, which fixed it! Now on the once a month occasion I restart my laptop it will happen really quickly. Overall the biggest difference that fitting the SSD made was that I now have a bigger hard drive. Everything is faster but possibly not enough to make it worth the cost, it seems that other than Word most of the apps I start are network or CPU bound. The downside is that after investing in the SSD I probably have to wait another couple of years before I replace my Macbook.
When I ordered the SSD I also ordered a 2.5" SATA adaptor to go with it. I asked the salesman if the adaptor would work with the SSD and he did suggest that buying an SSD and then using it through a USB adaptor didn't sound that sensible. Actually I wanted the adaptor to clone the internal drive of my Mac onto the SSD before fitting it. The nice thing about the Sharkoon is that it has connectors for SATA drives and 2.5" / 3.5" IDE drives. Like many geeks I often have random hard drives lying around and this will allow me to use them. It worked fine (without needing a driver) on Mac OS X, despite not advertising Mac compatibility. It even comes with some funky rubber sheaths for attached drives if you want to leave one connected for anything other than a short period of time.
To clone the internal drive in the laptop onto the SSD I used Carbon Copy Cloner. Cloning a 120gig drive (CCC claimed it would do a block level clone but actually did a file level clone) took hours. It was slightly worrying to see the occupied size of the new drive was about 200meg less than the original - but I imagine this is a consequence of smaller blocks on the SSD and CCC doing a file level clone. Anyway it worked fine.
Mimo monitors make a range of 800x480 pixel USB monitors. I wanted the 740 touchscreen monitor for a home media server project. The 740 was out of stock so I ended up with the 710 and the media server project got shelved (I ditched wireless for my main computer as it was sporadically unreliable and with a wired connection to the desktop no need for a separate server).
The monitor is a fantastic second monitor for my laptop but I only use it when I have a power source. Rather than see it unused I have it attached to my desktop (technically my sixth monitor) showing my twitter stream via Tweetdeck.
This photo shows the Mimo and the Powermate volume control (see below).

It turned out to be an irresistible but expensive toy, quelle surprise. Definitely useful though and in constant use, so it's fared better than some of the expensive toys I've bought in the past (Nintendo DS I'm talking to you).
Unfortunately there is a problem with the DisplayLink driver and the Mac OS X 10.5.7 update. Some details of the problem here and more here (apparently it is a known issue with 10.5.7 and not the fault of the driver). Uninstalling and re-installing the driver worked for me, but sometimes the display doesn't work if I restart my laptop with it plugged in (remembering to unplug it before restarting does the trick).
This was another toy. Whenever I am at my computer I almost inevitably have a movie playing and this expensive little knob is a volume control. It has much more granularity than using the keyboard to control the volume and I find it surprisingly useful. You can configure different behaviour (e.g. scrolling) for different applications, but I just use it as a volume control.
Posted by Fuzzyman on 2009-06-15 13:44:01
Categories: Computers, Life Tags: SSD, gadgets, USB, mimo monitor, powermate, sharkoon
Fuzzywuzzy Beard
In my last post I mentioned my fuzzywuzzy beard not once but twice. Here's a great picture of me and my fuzzywuzzy beard drawn by Scott Meyer, the creator of the Basic Instructions webcomic.

You can order your own custom avatar for $10.
Posted by Fuzzyman on 2009-06-13 20:24:57
Categories: Fun, Life Tags: fuzzyman, Scott Meyer, Basic Instructions, Comic
Future adventures of unittest and a mini-rant
There is a general rule that innovation doesn't happen in the standard library. Instead modules or techniques that have already proven themselves in the Python community are adopted into the standard library. This is exemplified in the standard library testing framework, unittest, which until recently stagnated whilst frameworks like nose and py.test pushed forward the boundaries of testing in Python. Features like test discovery and cleanup functions that have been brought into unittest first appeared in other testing frameworks.
The next releases of Python (2.7 / 3.1 / 3.2) will see some great improvements to unittest, but it is still far from perfect. In the spirit of lists of things I don't like about something I do like (and before I rant), here are what I think are the main problems (or the main perceived problems) with unittest:
- It is a single monolithic file, it should be a package
- Hard to extend / write plugins
- People want to write functions not classes - unnecessary boilerplate
- No class (or module level) setUp and tearDown
- No standard mechanism for parameterized (generative) tests
Let's dig into these and see what we can do about them:
A single monolithic file: currently unittest.py is 1760 lines of Python and test_unittest.py is 3699 lines. Frankly that's horrible and it makes unittest hard to understand and hard to maintain. Benjamin Peterson (the current Python release manager) has said he will split unittest into a package after Python 3.1 is released. If he doesn't have time for it then I'll do it.
Hard to extend / write plugins: once you are familiar with the responsibilities of the various moving parts in unittest (TestCase, TestRunner, TestLoader, TestSuite and TestResult) it is pretty easy to extend. Unfortunately it is difficult to extend so that other people can reuse what you have done. If you write a TestResult that writes colorized output to the console and I write one that pushes results to a database then the chances are that a new project will have to choose one or the other and can't use both. It isn't impossible, and there are projects that do it very well, but there is no standard plugin mechanism or culture of sharing extensions. I'd like to look at whether a plugin system with some compatibility with the nose / py.test plugins is plausible or just a pipe dream.
People want to write functions not classes: Haha - I shake my fuzzywuzzy beard at you in bewilderment. Do you people dislike OOP? The class statement is mere boilerplate to you? I mumble incoherent French obscenities in your general direction. (Did you know the French acronym for object-oriented programming is POO?) I find grouping tests by class very useful. Although nose and py.test allow you to organise tests as module level functions most people I know still use classes to group tests. In fact unittest does provide a way for you to write test functions rather than classes - but I'm not telling you what it is.
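The post deliberately doesn't name the mechanism. One candidate that does exist in the standard library is unittest.FunctionTestCase, which wraps a plain function as a test; whether this is the one the author had in mind is an assumption. A minimal sketch, with made-up function names:

```python
import unittest


def check_upper():
    # A plain module-level function, no class in sight.
    assert "abc".upper() == "ABC"


def check_arithmetic():
    assert 2 + 2 == 4


def build_suite():
    """Wrap each plain function in a FunctionTestCase and collect them."""
    suite = unittest.TestSuite()
    for func in (check_upper, check_arithmetic):
        suite.addTest(unittest.FunctionTestCase(func))
    return suite


if __name__ == "__main__":
    unittest.TextTestRunner().run(build_suite())
```

The standard loader does not pick up bare functions on its own, so the suite has to be assembled by hand as above.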
No class (or module level) setUp and tearDown: this is a double-edged sword. For expensive fixtures (like big databases) it is a slow pain to have to recreate them for every test (in setUp). What you think you want is to be able to have a class or module level setUp where it is done once and shared between tests. Nose and py.test give you what you think you want (which in general is a good policy I guess), but this does violate test isolation. When unittest runs tests it instantiates the TestCase separately for every test it runs; every test is run in a fresh instance unsullied by previous tests. You can already work round this by creating class attributes instead of instance attributes of course. The Twisted test framework (built on unittest) used to provide for shared fixtures with setUpClass / tearDownClass. When this was discussed recently on Testing in Python, people had this to say of them:
Jonathan Lange: It's worth treading carefully where Shared Fixtures are concerned. They tend to lead to Erratic Tests and Fragile Tests.
Twisted added setUpClass and tearDownClass to Trial and they have caused us nothing but grief. To be fair, they were added before classmethod was added to Python, which caused much of the pain.
Andrew Bennetts: I agree with Jonathan here. Twisted's setUpClass/tearDownClass were terrible, for the reasons he gave.
They both recommended instead the testresources library for shared resources with unittest. Other than being GPL this looks like a useful library. At some point I'll investigate this and consider how shared fixtures might be usefully added to unittest.
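The class attribute workaround mentioned above can be sketched roughly as follows. ExpensiveFixture is a hypothetical stand-in for something slow to build; each test still runs in its own fresh TestCase instance, but the fixture is created once and stored on the class. This is exactly the kind of shared state the quotes above warn about, so treat it as an illustration rather than a recommendation.

```python
import unittest


class ExpensiveFixture:
    """Hypothetical stand-in for a slow-to-build resource (e.g. a big database)."""

    build_count = 0

    def __init__(self):
        type(self).build_count += 1
        self.rows = ["alice", "bob"]


class SharedFixtureTests(unittest.TestCase):
    # Class attribute: survives across the per-test instances.
    shared = None

    def setUp(self):
        cls = type(self)
        if cls.shared is None:
            # Built lazily by whichever test happens to run first.
            cls.shared = ExpensiveFixture()

    def test_first(self):
        # Instance attributes never leak between tests: each test
        # method gets a brand new SharedFixtureTests instance.
        self.assertFalse(hasattr(self, "marker"))
        self.marker = True
        self.assertEqual(len(self.shared.rows), 2)

    def test_second(self):
        self.assertFalse(hasattr(self, "marker"))
        self.marker = True
        self.assertIn("bob", self.shared.rows)


def run_shared_fixture_demo():
    """Run both tests and return the TestResult."""
    suite = unittest.TestLoader().loadTestsFromTestCase(SharedFixtureTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

If either test mutated self.shared.rows, the other could pass or fail depending on run order, which is the isolation problem in miniature.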
No standard mechanism for parameterized (generative) tests: this topic I leave for a blog entry all of its own...
Now for the rant... I have a lot of admiration for nose and py.test. They have helped popularise Python testing and brought many new and interesting ideas. They haven't had features compelling enough to make me jump from unittest (and until recently IronPython compatibility has been an issue for much of what I've done) but I can understand why many new projects use them.
Something that p*isses me off about them though is the way that their evangelists extol their virtues by denigrating unittest, and in ways that I think are bizarre. If the library / framework that you like so much is so good then let it stand on its own two feet and don't denigrate the alternatives. The latest person to do this, and so raise my ire, was the otherwise sound-and-sensible Brandon Craig Rhodes in the first part of his Python testing frameworks article on the IBM developerWorks site. Of a short unittest example he says:
Look at all of the scaffolding that was necessary to support two actual lines of test code! First, this code requires an import statement that is completely irrelevant to the code under test, since the tests themselves simply ignore the module and use only built-in Python values like True and False. Furthermore, a class is created that does not support or enhance the tests, since they do not actually use their self argument for anything. And, finally, it requires two lines of boilerplate down at the bottom, so that this particular test can be run by itself from the command line.
The scaffolding that he is talking about is two lines at the start of the test module and two lines at the end. One of those lines is the import of unittest (irrelevant??) and the other is the class definition. As I mentioned, most serious users of nose / py.test still write class structured tests and any serious testing module is going to import a whole lot of stuff - including objects from your testing framework. This criticism seems vacuous and unrepresentative of any serious testing environment (four lines may be a lot when your whole testing code is less than ten lines - but in the real world this is less than a non-issue). Unfortunately many of the criticisms of unittest in articles on alternative testing frameworks seem to state this as one of the most important advantages of switching...
Brandon also says later of the assert methods "First, calling a method hurts readability". This I can understand, but I just plain disagree with it, and I don't think unittest would be improved by providing a host of assert functions to import rather than having them as methods on TestCase (see my previous comments on OOP and fuzzywuzzy beards). I'm also dubious of the heavy magic done by nose / py.test to support useful error messages when using plain asserts. This magic has portability implications for other implementations like Jython and IronPython.
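To see why plain asserts need help from the framework, compare the message carried by each kind of failure. This is a small sketch with made-up helper names, run under plain Python with no nose or py.test involved: the assert method builds a message from both values, whereas a bare assert statement carries no message of its own unless one is written by hand or reconstructed by a framework.

```python
import unittest


def failure_message(check):
    """Call check() and return the AssertionError text, or None if it passes."""
    try:
        check()
    except AssertionError as exc:
        return str(exc)
    return None


# A bare TestCase instance is enough to borrow the assert methods.
case = unittest.TestCase()


def with_assert_method():
    case.assertEqual(1 + 1, 3)


def with_plain_assert():
    assert 1 + 1 == 3


if __name__ == "__main__":
    print("assertEqual:", repr(failure_message(with_assert_method)))
    print("plain assert:", repr(failure_message(with_plain_assert)))
```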
The other two (whole) lines of boiler-plate that Brandon bemoans are the lines necessary to make a test module executable on its own:
if __name__ == '__main__':
    unittest.main()
This is true: up to and including Python 2.6 these two (whole) lines are needed. However, test discovery and better command line options have been added to unittest in Python 2.7. If we're going to be precise about the matter though, you may need these two lines in your test module under unittest - but in a fresh install of Python the whole nose module itself becomes unnecessary boilerplate compared to unittest...
I've exchanged emails with Brandon about this, and he suggested we both blog about it - so I eagerly await his response.
Posted by Fuzzyman on 2009-06-13 20:06:12
Categories: Python Tags: unittest, nose, py.test, rant, testing
New in unittest: Test Discovery and the load_tests protocol for Python 2.7 and 3.2
A feature that has long been missing from unittest is automatic test discovery. This alone is a major reason why people move to alternative frameworks like nose and py.test.
Test discovery is now in unittest, but it missed version 3.1 of Python (which is now at release candidate) and will be in versions 2.7 & 3.2.
Automatic test discovery is where you don't need to provide your own test collection machinery, but have a tool that can automatically find and run all the tests in a project. So long as your tests are compatible with the new test discovery (see below) you can now do:
python -m unittest discover
This will find all the test files that match the default pattern ('test*.py') and run all the tests they contain. It also has a customization hook called load_tests which enables you to customize which tests are loaded and run.
This test discovery is not as sophisticated as the discovery in nose or py.test, but it is a good start and will be sufficient for many projects. The system is as follows...
Discovery from the command line takes three optional parameters (plus the -v switch to run tests verbosely) which can be passed in by position or by keyword. The parameters are the directory to start discovery (defaults to the current directory), the pattern for matching test modules (defaults to 'test*.py') and the top-level directory of your project (defaults to whatever the start directory is):
python -m unittest discover myproject/tests/ '*test.py' myproject/
python -m unittest discover -s myproject/tests/ -p '*test.py' -t myproject/
python -m unittest discover -v -p '*test.py'
All your tests must be importable from the top level directory of your project (they must live in Python packages). The start directory is then recursively searched for files and packages that match the pattern you pass in. Tests are loaded from matching modules, and all tests run.
Discovery is implemented in the TestLoader class as the discover method. It delegates to loadTestsFromModule to load all tests after discovering and importing all modules that match the pattern provided. The actual signature is:
TestLoader().discover(start_directory, pattern='test*.py', top_level_dir=None)
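Discovery can also be driven from code rather than the command line. The following is a minimal sketch, assuming a Python version where TestLoader.discover is available: it writes one throwaway test module into a temporary directory and discovers it. The helper names and the file name test_gr_discovery_demo.py are invented for the example, and arguments are passed by position so the sketch does not depend on the exact parameter names.

```python
import os
import tempfile
import textwrap
import unittest


def make_demo_project(root):
    """Write a single throwaway test module (hypothetical name) into root."""
    source = textwrap.dedent('''
        import unittest

        class DemoTests(unittest.TestCase):
            def test_addition(self):
                self.assertEqual(1 + 1, 2)
    ''')
    path = os.path.join(root, "test_gr_discovery_demo.py")
    with open(path, "w") as handle:
        handle.write(source)
    return path


def run_discovery(start_dir, pattern="test*.py"):
    """Discover tests under start_dir and run them, returning the TestResult."""
    loader = unittest.TestLoader()
    # With no top-level directory given, the start directory is used.
    suite = loader.discover(start_dir, pattern)
    result = unittest.TestResult()
    suite.run(result)
    return result


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        make_demo_project(root)
        outcome = run_discovery(root)
        print("ran", outcome.testsRun, "ok" if outcome.wasSuccessful() else "failed")
```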
The customization hook is implemented in loadTestsFromModule and is available to all systems that use the standard loader, not just during discovery.
Iff a test module defines a load_tests function then loadTestsFromModule will call this function with loader, tests, None. This should return a suite.
An example 'do nothing' implementation of load_tests for a test module would be:
def load_tests(loader, tests, pattern):
    return tests
One use case would be to exclude certain TestCase subclasses from being used (if they are abstract base classes for other tests) during a test run.
If a package directory name matches the pattern you pass into discovery and the __init__.py contains a load_tests function then it will be called with loader, tests, pattern. No further discovery will be done into the package, but because it is passed the pattern as an argument it is free to continue discovery itself. A do nothing load_tests for a package is:
import os
from unittest import TestSuite

def load_tests(loader, tests, pattern):
    if pattern is None:
        # if loaded as a module just return the normal tests
        return tests
    suite = TestSuite()
    suite.addTests(tests)
    # continue discovery from this directory
    this_dir = os.path.dirname(os.path.abspath(__file__))
    suite.addTests(loader.discover(this_dir, pattern))
    return suite
The loader stores the top level directory it was originally called with specifically for this use case. load_tests should not pass in a new top level directory to the existing loader but create a new loader if it wants to do this.
Discovery does not follow the __path__ attribute of packages / modules and only looks at the filesystem.
Both the load_tests protocol and test discovery are useful new features in unittest. I expect test discovery in particular to mature, but it is definitely already usable. The implementation uses os.path.relpath, so the current trunk version of unittest.py can only run on Python 2.6 or more recent. At some point I'll backport it to work with Python 2.5 / 2.4.




