Python Programming, news on the Voidspace Python Projects and all things techie.
Release: ConfigObj 4.6.0 and Validate 1.0.0
ConfigObj and Validate development now happens in a Google Code project, with the code in a subversion repository. Please post any bug reports or feature requests on the project issue tracker.
The best introduction to working with ConfigObj, including the powerful configuration validation system, is the article:
ConfigObj is a simple-to-use but powerful Python library for reading and writing configuration (ini) files. Through Validate it integrates a config file validation and type conversion system.
Features of ConfigObj include:
Nested sections (subsections), to any level
Multiple line values
Full Unicode support
String interpolation (substitution)
Integrated with a powerful validation system
- including automatic type checking/conversion
- and allowing default values
- repeated sections
All comments in the file are preserved
The order of keys/sections is preserved
Powerful unrepr mode for storing/retrieving Python data-types
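A small ini file sketching several of these features (the names and values are made up for illustration):

```ini
# comments and key order are preserved on round-trip
name = Michael
greeting = Hello %(name)s    # string interpolation
notes = """A value that
spans multiple lines"""
[server]
host = localhost
    [[logging]]    # a nested subsection
    level = info
```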
Release 4.6.0 fixes bugs and adds new features, particularly making configspec handling more flexible.
The full changelog for ConfigObj 4.6.0 is:
- Pickling of ConfigObj instances now supported (thanks to Christian Heimes)
- Hashes in configspecs are now allowed (see note below)
- Replaced use of hasattr (which can swallow exceptions) with getattr
- __many__ in configspecs can refer to scalars (ordinary values) as well as sections
- You can use ___many___ (three underscores!) where you want to use __many__ as well
- You can now have normal sections inside configspec sections that use __many__
- You can now create an empty ConfigObj with a configspec, programmatically set values and then validate
- A section that was supplied as a value (or vice-versa) in the actual config file would cause an exception during validation (the config file is still broken of course, but it is now handled gracefully)
- Added as_list method
- Removed the deprecated istrue, encode and decode methods
- Running test_configobj.py now also runs the doctests in the configobj module
- Through the use of validate 1.0.0 ConfigObj can now validate multi-line values
As a consequence of the changes to configspec handling, when you create a ConfigObj instance and provide a configspec, the configspec attribute is only set on the ConfigObj instance - it isn't set on the sections until you validate. You also can't set the configspec attribute to be a dictionary. This wasn't documented but did work previously.
In order to fix the problem with hashes in configspecs I had to turn off the parsing of inline comments in configspecs. This will only affect you if you are using copy=True when validating and expecting inline comments to be copied from the configspec into the ConfigObj instance (all other comments will be copied as usual).
If you create the configspec by passing in a ConfigObj instance (usual way is to pass in a filename or list of lines) then you should pass in _inspec=True to the constructor to allow hashes in values. This is the magic that switches off inline comment parsing.
As the public API for Validate is stable, and there are no outstanding issues or feature requests, I've bumped the version number to 1.0.0. The full change log is:
- BUGFIX: can now handle multiline strings
- Addition of 'force_list' validation option
You should be able to install ConfigObj (which includes Validate in the source distribution on PyPI) using pip or easy_install.
Python is not appropriate in every circumstance. As if proof were needed, two recent news stories confirm it:
Sod This! Another Podcast
Episode 3 is now up, and it's an interview with me on dynamic languages in general and IronPython in particular. (Before becoming a .NET programmer Gary was a Smalltalk developer.)
The interview took place during the BASTA conference in Germany, in a bar, so the audio starts off a bit rough but improves as the interview progresses. I even reveal my mystery past and what I did before programming in Python.
Oh, and just for the record - I was the first Microsoft dynamic languages MVP.
Setting Registry Entries on Install
On Windows Vista, setting registry entries on install is hard. It is likely that Microsoft don't care - the official guidance is to set entries on first run and not on install - but there are perfectly valid reasons to want to do this.
The problem is that, for a non-admin user, installation requires an administrator to authenticate, and the install then runs as the admin user rather than the original user. So if you set any registry keys under HKEY_CURRENT_USER they will be set for the wrong user and won't be visible to your application when the real user runs it.
The answer is to set keys in HKEY_LOCAL_MACHINE, which is not ideal but at least it works. The problem comes if you need write access to those registry keys; the non-admin user doesn't have write access to HKEY_LOCAL_MACHINE. When you create the key you can set the permissions, though - you just need to know the magic incantations. In IronPython (easy to translate to C# if necessary) the requisite magic to allow write access to all authenticated users is:
from Microsoft.Win32 import Registry
from System.Security.AccessControl import RegistryAccessRule, RegistryRights, AccessControlType
from System.Security.Principal import SecurityIdentifier, WellKnownSidType

REG_KEY_PATH = "SOFTWARE\\SomeKey\\SomeSubKey"
key = Registry.LocalMachine.CreateSubKey(REG_KEY_PATH)
# grant all authenticated users full control of the key
sid = SecurityIdentifier(WellKnownSidType.AuthenticatedUserSid, None)
ra = RegistryAccessRule(sid, RegistryRights.FullControl, AccessControlType.Allow)
rs = key.GetAccessControl()
rs.AddAccessRule(ra)
key.SetAccessControl(rs)
Of course the next issue is what happens when you make your application run as 32 bit on a 64 bit OS (to work around, in part, the horrific performance of the 64 bit .NET JIT). Hint: the registry keys will have been created under the WOW6432Node. If you want to use the standard locations and share between 64 and 32 bit applications then you need to look into reflection (which copies keys between the 32 and 64 bit registry trees) via RegEnableReflectionKey. It's not entirely clear whether you need to enable or disable reflection to share keys, but thankfully I haven't yet needed to experiment with this.
If this wasn't all enough fun for you, under some circumstances you can end up with registry virtualization. This is where your registry keys end up in an entirely separate registry hive called VirtualStore under the root node.
You can find a reference on virtualization (which can also cause file locations to be virtualized - making them visible to some applications and invisible to others) on this page.
In our case deleting the VirtualStore restored sanity.
Most of the details in this entry only apply to 64 bit Windows Vista.
More Fun at PyCon 2009
PyCon 2009 was awesome fun, as many others have charted. The highlights of the conference were, as always, meeting and mixing with such a rich combination of clever and fun people - all of whom I have Python in common with. It was a mix of new friends and old friends, far too many to mention all of them.
The Hyatt hotel in which PyCon was held had the rooms on all four external walls, interconnected with a grand structure that someone nicknamed the fragatorium:
The tenth floor made an ideal launching point for the balsa aeroplanes being given out by the Wingware guys.
Mr. Tartley getting ready to launch:
Unfortunately I'm rubbish at remembering to take photos; about the only genuine conference photo I took was of the VM panel discussion.
During the sprints there was much ridiculousness around the Django Pony, which somehow ended up with the domain ponysex.us being hosted on my server! As an added bonus I was elected to membership of the Python Software Foundation (PSF) during the conference. This means two things in practice: a new opportunity to bikeshed on the PSF mailing list and new opportunities to volunteer for extra work! D'oh.
Distributed Test System at Resolver Systems
Just for the record, here is a rough outline of how we do distributed testing over a network at Resolver Systems. It is a 'home-grown' system and so is fairly specific to our needs, but it works very well.
The master machine does a full binary build, sets up a new test run in the database (there can be multiple simultaneously), then pushes the binary build with test framework and the build guid to all the machines being used in the build (machines are specified by name in command line arguments or a config file when the build is started). The master introspects the build run (collects all the tests) and pushes a list of all tests by class name into the database.
When the zip file arrives on a slave machine a daemon unzips and deletes the original zipfile. Each slave then pulls the next five test classes out of the database and runs them in a subprocess. Each test method pushes the result (pass, failure, time taken for test, machine it was run on, build guid and traceback on failure) to the database. If the subprocess fails to report anything after a preset time (45 mins I think currently) then it kills the test process and reports the failure to the database. Performance tests typically run each test five times and push the times taken to a separate table so that we can monitor performance of our application separately.
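A sketch of the run-with-timeout step described above (the function name and the 30 second timeout are illustrative; the real system runs whole test classes and reports to the database rather than returning a string):

```python
import subprocess
import sys

TIMEOUT = 45 * 60  # seconds; wedged test processes are killed after this

def run_test_class(class_name, timeout=TIMEOUT):
    """Run one test class in a subprocess; report a failure on timeout."""
    try:
        proc = subprocess.run(
            [sys.executable, "-m", "unittest", class_name],
            capture_output=True, text=True, timeout=timeout)
        return "PASS" if proc.returncode == 0 else "FAIL"
    except subprocess.TimeoutExpired:
        # the subprocess is killed; report the failure to the database
        return "TIMEOUT"

# A class that can't be imported counts as a failed run
print(run_test_class("no.such.TestClass", timeout=30))  # FAIL
```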
The advantage of the client pulling tests is that if a slave machine dies we have a maximum of five test classes for that build that fail to run. It also automatically balances tests between machines without having to worry about whether a particular set of tests will take much longer than another set.
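The pull model itself is simple. A minimal sketch, using an in-memory sqlite3 database as a stand-in for the real results database (table and function names are made up for illustration):

```python
import sqlite3

BATCH_SIZE = 5  # each slave claims five test classes at a time

def claim_batch(conn, machine):
    """Atomically claim the next batch of unclaimed test classes."""
    cur = conn.execute(
        "SELECT id, name FROM tests WHERE claimed_by IS NULL "
        "ORDER BY id LIMIT ?", (BATCH_SIZE,))
    batch = cur.fetchall()
    conn.executemany(
        "UPDATE tests SET claimed_by = ? WHERE id = ?",
        [(machine, test_id) for test_id, _ in batch])
    conn.commit()
    return [name for _, name in batch]

# Simulate a build with eight test classes and two slaves pulling work
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tests (id INTEGER PRIMARY KEY, name TEXT, claimed_by TEXT)")
conn.executemany("INSERT INTO tests (name) VALUES (?)",
                 [("tests.Class%d" % i,) for i in range(8)])

first = claim_batch(conn, "slave-1")
second = claim_batch(conn, "slave-2")
print(first)   # five classes
print(second)  # the remaining three
```

If slave-1 dies at this point, only its five claimed classes are lost from the run; every other machine carries on pulling work as normal.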
A web application allows us to view each build - number of tests left, number of passes, errors and failures. For errors and failures tracebacks can be viewed whilst the test run is still in progress. Builds with errors / failures are marked in red. Completed test runs with all passes are marked in green. Easily being able to see the total number of tests in a run makes it easy to see when tests are accidentally getting missed out.
A completed run emails the developers the results.
The web page for each build allows us to pull machines out whilst the tests are running. If a machine is stopped then it stops pulling tests from the database (but runs to completion those it already has).
Machines can be added or re-added from the command line.
We have a build farm (about six machines currently) typically running two continuous integration loops - SVN head and the branch for the last release. These run tests continuously - not just when new checkins are made.
This works very well for us, although we are continually tweaking the system. It is all built on unittest.
The system that Jesse Noller is working on will have as its foundation a text based protocol (XML or YAML) for describing test results. These can be stored in a database or as flat files for analysis and reporting tools to build on top of.
For a test protocol representing results of test runs I would want the following fields:
- Build UUID
- Machine identifier
- Test identifier: typically in the form package.module.Class.test_method (but a unique string anyway)
- Time of test start
- Time taken for test (useful for identifying slow running tests, slow downs or anomalies)
- Result: PASS / FAIL / ERROR / SKIP
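Serialised as JSON (one of the possible text formats), a single record with those fields might look like this - the field names are illustrative, not a published schema:

```python
import json
import uuid
from datetime import datetime, timezone

# One test result record in a hypothetical text based protocol
record = {
    "build_uuid": str(uuid.uuid4()),
    "machine": "slave-1",
    "test_id": "package.module.TestClass.test_method",
    "started": datetime.now(timezone.utc).isoformat(),
    "duration_secs": 0.137,
    "result": "PASS",           # PASS / FAIL / ERROR / SKIP
    "traceback": None,          # populated on FAIL / ERROR
}
print(json.dumps(record, indent=2))
```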
Anything else? What about collecting standard out even if a test passes? Coverage information?
We sometimes have to kill wedged test processes and need to push an error result back. This can be hard to associate with an individual test, in which case we leave the test identifier blank.
Extra information (charts?) can be generated from this data. If there is a need to store additional information associated with an individual test then an additional 'information' field could be used to provide it.
A lot of the other discussion on the mailing list has been around changes and potential changes to the unittest module - changes that started in the PyCon sprint. I'll be doing a series of blog posts on these in the coming days.
Essential Programming Skills: Reading and Writing
As a programmer there are two basic skills vital to your productivity: how fast you can type and how fast you can read.
On typing, Steve Yegge said it best of course in Programming's Dirtiest Little Secret.
I often mock Mr. Tartley for being a hunt-and-peck typist, but he can really type quite fast with his two fat fingers. I taught myself to touch type with Mavis Beacon back when I was selling bricks, and found it enormously freeing. Being able to type without having to look at the keyboard makes a massive difference.
There are a host of tools that will help you learn or practise touch typing. I've just discovered (via Miguel de Icaza) a fun web-based one that you can fire up at any time. You race against other players, typing short passages from books, with visual cues when you make mistakes. It even lets you set up private games to race against a set of friends. My only criticism is that there isn't enough punctuation to really practise typing for programmers (programmer-specific version, anyone?):
The combination of competition, short doses and interesting passages makes it fun, addictive and actually useful. My average WPM is 52 at the moment, but I reckon if I practise a few times a day I'll pick up speed.
The corresponding essential skill, for programmers who have to browse countless pages of documentation and blog posts that may or may not be useful, is speed reading. The following tool is great for practising, but I'm also finding it useful for quickly reading long passages of text.
It shows you the text a line at a time, moving the focused part quickly (at a speed you can configure and control from the keyboard) from left to right. This mirrors (unsurprisingly) the way I skim read blogs and the like. The problem is that I often involuntarily skip passages whilst skim reading; this tool is good for practice but also helps me to read quickly without missing bits.
This work is licensed under a Creative Commons Attribution-Share Alike 2.0 License.