Upgrading Python Version

Python Modules and Libraries

Author: Michael Foord
Contact: fuzzyman AT voidspace DOT org DOT uk
Date: 2004/12/01
Doc Revision: 5, 2004/12/20
Copyright: This document is in the public domain.
Homepage: Voidspace Python Homepage
Voidspace Python Mailing List: Pythonutils Mailing List

The New Thing

Every eighteen months or so, a new version of Python comes out. Each one represents a lot of hard work and, in the case of Python 2.4, a heck of a lot of arguing about decorator syntax. I first started to learn Python about eighteen months ago, which meant the change from Python 2.2 to 2.3 had just happened. I think I did have Python 2.2 installed, maybe even for as long as several days! This is the first time I've really had to contend with a major version upgrade.

Now this new version of Python is interesting. There are not a lot of changes to the core language, and few major additions to the standard library. Decorators and cookielib are the big exceptions to this of course [1]. This is quite deliberate; the main changes in Python 2.4 are a host of bug fixes, improvements and optimisations. A good example is that string concatenation is now several times faster in Python 2.4 (well, in CPython 2.4 anyway). This means most code will run faster, and better, in Python 2.4 than it did in Python 2.3.
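
A rough way to check the concatenation claim on your own machine is a small timeit comparison. This is only a sketch: the function names and the loop sizes are made up for illustration, it assumes an interpreter recent enough for timeit to accept a callable, and the numbers will vary with interpreter and hardware.

```python
import timeit


def build_with_concat(count):
    """Build a string by repeated += in a loop."""
    result = ""
    for i in range(count):
        result += str(i)
    return result


def build_with_join(count):
    """Build the same string with a single join call."""
    return "".join([str(i) for i in range(count)])


if __name__ == "__main__":
    # Time both approaches on the same workload (sizes chosen arbitrarily).
    for func in (build_with_concat, build_with_join):
        seconds = timeit.Timer(lambda: func(10000)).timeit(number=20)
        print("%s: %.3f seconds" % (func.__name__, seconds))
```

Running the same script under two interpreter versions is the interesting comparison; the join variant is there as a baseline that does not depend on the concatenation optimisation.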

If Only it Was That Easy

Now, if only it was as simple as just uninstalling Python 2.3 and firing up the fancy new Python 2.4 installer. Like most Python users, I have acquired a whole raft of extension modules. Over the last eighteen months I have accumulated, and experimented with, many megabytes of Python libraries. These do everything from virtually unbreakable encryption (pyCrypto) to visually displaying springs and molecules (KineticsKit with Visual Python). This plethora of high quality, free expansions is a real testimony to the open source movement and the Python language. At least half of the extensions I've acquired I just haven't had the time to explore properly.

Herein lies the rub: when a Python extension is written in C and compiled, it is linked against the Python DLL for that version of Python. So any Python extension that is at least partly written in C will need a new version compiled for Python 2.4. This sort of problem is the reason we code in Python in the first place, right? Any modules written in pure Python can just be dropped into the site-packages folder, and will (for 'will' read 'should') run unchanged. The old .pyc files should be deleted first, since Python bytecode changes with every major version and Python 2.4 is no exception; the interpreter regenerates them from the .py source. (The bytecode has been the target of some interesting optimisations this time round.)
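
Clearing out the old bytecode by hand gets tedious, so here is a minimal sketch of a clean-up helper. The function name is hypothetical, and it assumes you pass it a folder of pure Python modules that you have copied across yourself; it deletes files, so try it on a copy first.

```python
import os


def remove_stale_bytecode(root):
    """Delete every .pyc/.pyo file below root and return the paths removed."""
    removed = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith((".pyc", ".pyo")):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return removed
```

The .py sources are left alone, so the new interpreter can rebuild fresh bytecode the first time each module is imported.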

So it looks like it's time for me to examine what libraries and extension modules I have. Which of them do I really use? Which of those will I need new versions of? The results surprised me. The set of modules that I actually use is smaller than I thought it would be. Unfortunately, some of the ones I'm interested in (like wxPython) haven't yet produced versions compiled for Python 2.4. I'm sure it won't be long though. The advantage of maintaining a list like this is that, if I keep it up to date, it ought to make Python version migration a less onerous task in the future. A little early for a spring clean maybe, but a good opportunity to clear out my cluttered site-packages folder.
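
A first pass at such a list can be generated rather than typed. The sketch below is an illustration only: the function name and the suffix list are my own assumptions, it treats every sub-folder as a package, and it will miss extensions that hide their compiled parts somewhere unusual.

```python
import os

# Suffixes assumed to mark a compiled extension (.pyd on Windows, .so elsewhere).
COMPILED_SUFFIXES = (".pyd", ".so")


def inventory(site_packages):
    """Map each top-level entry in site_packages to 'pure' or 'compiled'."""
    report = {}
    for entry in sorted(os.listdir(site_packages)):
        path = os.path.join(site_packages, entry)
        if os.path.isdir(path):
            # A folder counts as compiled if any file inside it is compiled.
            compiled = any(
                name.endswith(COMPILED_SUFFIXES)
                for _, _, names in os.walk(path)
                for name in names
            )
        elif entry.endswith(COMPILED_SUFFIXES):
            compiled = True
        elif entry.endswith(".py"):
            compiled = False
        else:
            continue  # skip .pth files and anything else that is not a module
        report[entry] = "compiled" if compiled else "pure"
    return report
```

Anything reported as 'compiled' is a candidate for needing a new binary; anything reported as 'pure' should, in theory, just need copying across.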

Hurrah for Windoze (tm)

This of course is a purely windoze (tm) related problem. Python for Windows is compiled using the Microsoft compiler, a non-free tool [2]. On UNIX-like systems, Python and its extensions are usually installed from source and compiled in situ using gcc. Installing a new module from C source is entirely normal on these platforms. Under Python 2.3 it was possible to configure distutils to use the mingw version of gcc. This meant that many Python extensions could be installed from their C source code on Windows just as easily as on a Linux machine. Yup, it's true. Python 2.4 is compiled with a newer version of the Microsoft compiler: Microsoft Visual C++ 7.1, the compiler from Visual Studio .NET 2003. Unfortunately, the rumours I've heard on comp.lang.python say that the old techniques for configuring distutils are out of date and won't work. The Python community still seems to be working out whether there is a generic answer to this question (a solution that will work straightforwardly with distutils).

Frequently the question comes up: why isn't Python for Windows built using gcc? This would make the magic compiler chain incantations for building extensions a lot simpler (not to mention free). It seems the answer is two-fold. First of all, Python is built by volunteers. It will be built using whatever tools those volunteers are happiest using. If you want to compile Python using gcc, go ahead. The second answer is that the Microsoft compiler produces tighter code. [pauses to await flames from Lunix bigots :-) [3] ].

Magic Compiler Chain Incantations

Many programmers come from a background where C is the first language they learn. For them the magic compiler chain incantations are less of a mystery. Although I've dabbled with C, the whole makefile-linker-compiler business still seems a very black art. Under Python 2.3 I configured distutils to use gcc from the mingw toolset. Following the instructions at http://sebsauvage.net/python/mingw.html, it wasn't too complicated and worked beautifully. It was a real joy to see previously inaccessible extensions install seamlessly. As I mentioned, recompiling extensions appears to be slightly more complicated for Python 2.4. Microsoft have actually made their optimising compiler available for free, but configuring distutils to use it requires a bit of hacking. There is a good resource on this at http://www.vrplumber.com/programming/mstoolkit/. It would be interesting to get this set up anyway, but the downloads are a bit hefty! The instructions at sebsauvage imply that the mingw path ought to work for 'future' versions of Python, but testing this risks breaking my current install. Sigh. I will try it, probably mingw first, but not today. [4]
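
For the record, the mingw route under Python 2.3 boils down to a two-line distutils.cfg that names mingw32 as the build compiler. The helper below is a hedged sketch: the function name is hypothetical, the target folder is left to you (as I understand it, the file belongs in the distutils folder of the Python install), and whether the same setting is enough for Python 2.4 is exactly the open question above.

```python
import os


def write_distutils_cfg(directory, compiler="mingw32"):
    """Write a distutils.cfg in directory that selects the given compiler."""
    path = os.path.join(directory, "distutils.cfg")
    with open(path, "w") as handle:
        # The [build] section sets the default for 'setup.py build'.
        handle.write("[build]\ncompiler=%s\n" % compiler)
    return path
```

The one-off equivalent is passing --compiler=mingw32 to the setup.py build command, which avoids touching the install at all.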

The Problem With Core Language Changes Is....

I like Python 2.3. It works and it's nice. People complain about the added complexity of features in recent upgrades. For example, list comprehensions have often been nicknamed list incomprehensions, but I like 'em [5]. Some people claim that Python 1.5.2 is the 'one true Python'. Personally I couldn't live without new-style classes though.

A lot of language changes you can code around. For example, in versions of Python where True and basestring are undefined, you can simply test for a NameError and define them. Decorators are a different kettle of fish: the interpreter will detect a syntax error and won't run any of the code, even if you explicitly test for the Python version so that the decorators are never used in Python < 2.4. Conditional imports using sys.version would get round that, I guess. (You could even dynamically generate the module source before importing!)
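
Both workarounds can be sketched in a few lines. Everything named here (shout, greet, load_greet and the source strings) is invented for illustration, and the second half uses compile and exec on a string as a stand-in for generating a module's source before importing it.

```python
import sys

# Workaround 1: a name that the running interpreter lacks is defined on demand.
# (On very old Pythons the same pattern was used for True and False.)
try:
    basestring
except NameError:
    basestring = str

# Workaround 2: keep the decorator syntax inside a string, so the compiler
# only ever sees it on interpreters that understand it.
SHARED_SOURCE = '''
def shout(func):
    def wrapper(name):
        return func(name).upper()
    return wrapper
'''

DECORATED_SOURCE = SHARED_SOURCE + '''
@shout
def greet(name):
    return "hello " + name
'''

PLAIN_SOURCE = SHARED_SOURCE + '''
def greet(name):
    return "hello " + name
greet = shout(greet)
'''


def load_greet():
    """Compile whichever variant this interpreter can parse and return greet."""
    if sys.version_info >= (2, 4):
        source = DECORATED_SOURCE
    else:
        source = PLAIN_SOURCE
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["greet"]
```

The version test alone is not what saves you; what matters is that the decorated text is never handed to a compiler that would choke on it. Splitting the two variants into separate modules and importing one conditionally works for the same reason.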

Anyway, what I'm trying to say is that my web host (http://www.sapphiresoft.co.uk) has Python 2.3.4 installed. This means that in a lot of my code I can't use decorators, even though I quite like them and quite like the syntax. Bah. The alternative, of course, is not having any improvements. Which could lead onto a digression about how computer code is the least permanent of all creative art forms. Great literature can survive hundreds of years; great code is lucky to survive ten years [6]...

So Wadda Ya Want?

Below is the list of Python modules that I have installed and need to re-install when I migrate to Python 2.4. Your list will probably look very different of course. Some of the ones below may be in the 'wrong category': 'pure' may not be pure, and I may be mistaken as to which ones have C components.

Essential Libraries and Tools

These are libraries and tools that I use regularly. If I can't upgrade these, I will have to consider very seriously whether I want to upgrade at all (or at least, not yet).

Pure Python

The following ones are written in pure Python. Re-installing these ought to be very simple indeed.

  • ClientCookie (this is replaced by cookielib - but I need ClientCookie for testing purposes)
  • docutils
  • epydoc
  • XMLObject (now called easeXML I believe)
  • SPE (as this module depends on wxPython I should strictly include wxPython here, but I haven't yet explored wxPython directly... on the list of things to do of course)

Includes C Extensions

The following modules are ones that I use regularly. As they are all at least partly written in C, I'll need a new version specifically for Python 2.4. I also include here any that I'm not sure are pure Python.

  • pyCrypto
  • ctypes
  • PIL
  • psyco
  • py2exe
  • pythonwin

Important (But Not Essential)

These are libraries and tools that I use occasionally or would like to explore a bit more. I can live without them, but would rather not have to.

Pure Python

  • pychecker
  • IPython
  • WCK

Includes C Extensions

  • wxPython
  • Pyrex
  • pygame
  • scipy (for weave)
  • pmw
  • readline (the Gary Bishop one)
  • Quixote (minor C component and version 2.0 imminent anyway)
  • uTidylib

Interesting

These are extensions that look interesting, but I've never really got round to using or exploring them.

  • venster
  • visual
  • KineticsKit
  • mx
  • gnosis
  • numeric
  • pyXLwriter
  • ClientForm
  • pySQLite
  • Wcurses

I won't bother to list the set of packages that I had installed but won't be re-installing. I'm still waiting for Psyco and PyCrypto to be recompiled before I make the change. Hopefully it won't be very long. UPDATE: Psyco 1.3 is now available, including a binary for Python 2.4. Check out http://psyco.sourceforge.net. The PyCrypto binaries are provided by volunteers (for Twisted, I think) and so may take a while longer. Maybe I'll have to try sorting it out myself. UPDATE: I did sort it and am even hosting a Python 2.4 Windows installer over at Voidspace. Check out the Voidspace Python Homepage.

A Late Conclusion

Like all good stories this one has a happy ending [7]. After much dithering I decided to take the plunge and install Python 2.4. So as not to be stuck without the extensions I needed, I also followed the vrplumber instructions to install the Microsoft C compiler. Generally I'm very pleased. It's nice to have done a spring clean of my 'site-packages' folder, and Python 2.4 has a good feel to it. Unfortunately, compiling the extensions I needed wasn't so straightforward.

This is the list of extensions that failed to compile just using distutils:

Luckily it wasn't all bad news. Fredrik Lundh did a release of PIL compiled for Windows. py2exe and pythonwin were already available. PyCrypto compiled fine, as did a few of the other simpler extensions. I didn't even attempt to compile wxPython, but just at the right time a new binary was uploaded to SourceForge. I was the first to download it! I'm now happily running SPE under Python 2.4, and plotting my current project 'Movable Python' [8] with it.


Footnotes

[1] Or at least the big exceptions that matter to me. They're not the only exceptions of course.
[2] For a given value of 'free' of course.
[3] http://www.osnews.com/story.php?news_id=5602&page=3 is yet another benchmark. Of course benchmarks are generalisations, and all generalisations are wrong. It does indicate a few things though:

  • For intensive number crunching, pure Python is rubbish.
  • Psyco is much better, but still rubbish.
  • Java, also bytecode interpreted but statically rather than dynamically typed [9], is loads faster than Python.
  • For I/O operations (largely operating system calls and waiting for hardware, I guess) Python has little overhead against C.
  • Finally, gcc has worse results than the Microsoft compiler.

(I know, I know: numarray for Python and optimisations for everything else, but that's a double-edged sword.) Hmmm... don't worry, Python is still my one true language, but it isn't always the right tool for the job or we'd never have extensions written in C.

[4] See this comp.lang.python discussion for more on the change of compiler for the Windows distribution of Python.
[5] Did you know you ought to use list comprehensions instead of map if you use Psyco? See http://psyco.sourceforge.net/psycoguide/tutknownbugs.html
[6] Not counting all the Fortran that's still floating round of course... ;-)
[7] This particular happy ending was added on the 20th December.
[8] SourceForge project and announcement will follow shortly.
[9] A footnote from a footnote, heck. My favourite sane references on static vs. dynamic are from Bruce Eckel: http://www.mindview.net/WebLog/log-0052 and http://www.mindview.net/WebLog/log-0066