Python Programming, news on the Voidspace Python Projects and all things techie.
Python Virus Bites Apple
As you may have heard, Apple just got bitten by a Windows virus that managed to get itself onto some video iPods.
The funny twist to this bizarre story is Apple's response to the embarrassing episode:
As you might imagine, we are upset at Windows for not being more hardy against such viruses...
Actually, the full quote (allegedly) is "As you might imagine, we are upset at Windows for not being more hardy against such viruses, and even more upset with ourselves for not catching it", but quoting the first part on its own is much funnier.
rest2web: Debian and Other Stuff
The latest rest2web release seems to have gone well, or at least no one has discovered any embarrassing bugs yet.
Martin Krafft is still hopeful that the latest release will make its way into the forthcoming new release of Debian. Whether it does or not, this release addresses a couple of issues with the Debian rest2web package.
I also stumbled across the following note about the Arlington Python Sprint on September the 23rd:
Andrew Kuchling worked on using a new templating system, rest2web, for the python.org website. The result is still preliminary, but promising.
I've now been programming with Python for just over three years, having done no programming for ten years before that (when I cut my teeth on BBC Basic and 68000 assembly language).
I started writing rest2web over eighteen months ago and, as is the way with projects like this, if I were to start it today the architecture would be very different (but the feature set would be very similar). Without a framework of tests, wholesale refactoring would be difficult.
Some parts of rest2web are necessarily complicated. For example, you can have a page in a directory where the target file is actually placed in a different location (but will still be included in the index page for that directory). I use this feature in Voidspace. rest2web will generate relative paths from the target location to all the other pages you may reference from that page, such as those in the sidebars.
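As a rough illustration of the relative-path calculation (this is not rest2web's own code), the standard library's posixpath.relpath can work out the link from one target page to another. The function name relative_link and the page paths are made up for the example.

```python
import posixpath

def relative_link(from_page, to_page):
    """Return the URL path leading from one output page to another.

    Both arguments are site-relative target paths using forward slashes.
    """
    start_dir = posixpath.dirname(from_page) or "."
    return posixpath.relpath(to_page, start_dir)

# Hypothetical pages: one at the site root, one two directories down.
print(relative_link("index.html", "python/articles/intro.html"))
# -> python/articles/intro.html
print(relative_link("python/articles/intro.html", "index.html"))
# -> ../../index.html
```

The important detail is that the calculation starts from where the page ends up (its target), not from where its source file lives.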
You can also have pages in a variety of source encodings, and even in a variety of target encodings. If you access parts of a page from another page, like the description for example, rest2web will ensure that this is in the target encoding of the page being rendered (by storing page contents and metadata in unicode).
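The idea of keeping everything in unicode internally can be sketched as follows. This is a minimal example in modern Python, not rest2web's implementation; the function name transcode and the choice of the xmlcharrefreplace error handler are assumptions made for the illustration.

```python
def transcode(raw_bytes, source_encoding, target_encoding):
    """Decode bytes to unicode text, then encode for the page being rendered."""
    text = raw_bytes.decode(source_encoding)
    # Characters the target encoding cannot represent become numeric
    # character references, which are safe inside HTML (an assumption
    # about what is wanted, not documented rest2web behaviour).
    return text.encode(target_encoding, errors="xmlcharrefreplace")

# A description stored in a Latin-1 source page...
description = "Caf\u00e9 notes".encode("latin-1")

# ...used from a UTF-8 page and from an ASCII page.
print(transcode(description, "latin-1", "utf-8"))
print(transcode(description, "latin-1", "ascii"))
```

Because the intermediate value is always unicode text, any source encoding can be paired with any target encoding.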
Despite the complexity I know the code fairly well and generally enjoy working with it.
The refactoring I would like to do is to make rest2web a two-pass process. This would give you access to the whole site structure when rendering any page. At the moment a page only has access to the current directory and any directories above it in the site's 'tree'. A two-pass design would also allow page details to be cached, so that rest2web could tell which pages had changed and skip the rest, making builds much faster.
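A two-pass build with a change cache might look something like the sketch below. It is only an outline under stated assumptions: the names scan and render_changed, the use of the first line as the page title, and the SHA-1 digest cache are all invented for the example and are not how rest2web currently works.

```python
import hashlib

def scan(sources):
    """Pass one: collect metadata for every page before rendering anything."""
    site = {}
    for path, text in sources.items():
        title = text.splitlines()[0] if text else path
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        site[path] = {"title": title, "digest": digest}
    return site

def render_changed(sources, site, cache):
    """Pass two: render only the pages whose content digest has changed."""
    rendered = []
    for path in sorted(sources):
        if cache.get(path) == site[path]["digest"]:
            continue  # unchanged since the last build
        # Every page can see the whole site, not just its own directory.
        links = ", ".join(site[p]["title"] for p in sorted(site) if p != path)
        rendered.append((path, f"{site[path]['title']} (see also: {links})"))
        cache[path] = site[path]["digest"]
    return rendered

sources = {"index.txt": "Home\nWelcome.", "docs/intro.txt": "Intro\nStart here."}
cache = {}
print(render_changed(sources, scan(sources), cache))  # both pages rendered
print(render_changed(sources, scan(sources), cache))  # nothing left to do
```

A real implementation would also have to record which other pages each page draws on, since a change to one page's title affects every page that links to it.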
One of the reasons I like rest2web is possibly one of the reasons it may not appeal to some people. It uses embedded Python code as its templating system. This means there is no need to learn a new templating language. The page details and site structure are available to you within the page namespace when the template is being rendered. There are also standard functions you can use, which output sidebars and navigation links. The data structures they work with are available to you if you would rather roll your own.
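To show the general idea of embedded Python as a templating system, here is a deliberately tiny sketch. It is not rest2web's template engine: the render function and the page values are made up for the example, and the delimiter is used purely for illustration.

```python
import re

def render(template, namespace):
    """Replace each <% expression %> with the value of that Python expression."""
    def evaluate(match):
        # eval is only acceptable because templates are trusted site code.
        return str(eval(match.group(1), {}, namespace))
    return re.sub(r"<%\s*(.+?)\s*%>", evaluate, template)

# Hypothetical page namespace: the values a template can refer to by name.
page = {"title": "Home", "sections": ["Python", "Projects"]}
print(render("<h1><% title %></h1><p><% ', '.join(sections) %></p>", page))
# -> <h1>Home</h1><p>Python, Projects</p>
```

Anything that is a valid Python expression works inside the delimiters, which is where the "no new language to learn" benefit comes from.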
Two easy wins for improving rest2web would be allowing other templating systems, and allowing the use of markups other than ReStructured Text and HTML. As both the templating and markup rendering are done by individual function calls, these would be almost trivially easy to do.
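Since the markup rendering is a single function call, supporting other markups could be as simple as a lookup table of renderer functions. The sketch below is hypothetical: RENDERERS, render_body and the toy "plain" markup are invented names, and a real version would register a docutils-based function for ReStructured Text.

```python
import html

def render_html(source):
    """HTML source is passed through untouched."""
    return source

def render_plain(source):
    """Toy markup: blank-line separated paragraphs, escaped for HTML."""
    paragraphs = [p.strip() for p in source.split("\n\n") if p.strip()]
    return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)

# Adding a new markup means adding one entry here.
RENDERERS = {"html": render_html, "plain": render_plain}

def render_body(source, markup):
    """Dispatch to the renderer registered for this markup name."""
    try:
        renderer = RENDERERS[markup]
    except KeyError:
        raise ValueError(f"unknown markup: {markup!r}") from None
    return renderer(source)

print(render_body("one\n\ntwo & three", "plain"))
```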
Now that this release is out of the way I have to finish the Movable Python Documentation and next release (nearly ready, honest), and then a rather interesting IronPython project (more details to follow...). After that I may find the time to work on rest2web again, amongst other things.
Well, over recent weeks we've added some important new features and have noticed that performance has gradually slowed. Key calculations, for moderately complicated datasets, that we figure ought to be possible in about a second are (well... were) taking about five to six seconds.
We have an important demo (to the board of directors and investors) on Wednesday, so performance improvements got bumped to the front of this iteration.
I wasn't working on it this time, but it is another interesting tale that will come as no surprise to seasoned Python programmers.
At Resolver we are certainly not believers in premature optimisation. We keep things elegant and clean, but never worry about performance of code until it becomes a problem. We have a functional test in place that tests basic performance. If we bork anything, the build fails and the code has to be refactored before checking in. This user story adds 'further performance test' to the functional tests.
To cut to the chase, the guys got calculation times down close to the one second we were looking for.
The code involved in this functionality is heavily recursive, with some fairly complex algorithms. All the speed improvements were gained by switching to use the following normal Python idioms:
We had some code doing someString += newString.
We switched these to using someString = ''.join(stringList).
We were using someList += someItems to concatenate lists.
We switched these to use someList.extend(someItems).
Finally, we were constructing lists of items and later turning them into a set to test for membership.
We changed this to use a set from the beginning.
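For anyone who hasn't met these idioms, here is a small sketch of all three. The function names and data are invented for the example; this is not the Resolver code.

```python
def build_report(lines):
    """Collect the pieces in a list and join once, instead of repeated +=."""
    parts = []
    for line in lines:
        parts.append(line.upper())
    return "".join(parts)

def flatten(chunks):
    """Grow one list in place with extend."""
    result = []
    for chunk in chunks:
        result.extend(chunk)
    return result

def unique_in_order(items):
    """Use a set from the start for membership tests."""
    seen = set()
    ordered = []
    for item in items:
        if item not in seen:   # average O(1) for a set, O(n) for a list
            seen.add(item)
            ordered.append(item)
    return ordered

print(build_report(["a", "b", "c"]))      # -> ABC
print(flatten([[1, 2], [3], [4, 5]]))     # -> [1, 2, 3, 4, 5]
print(unique_in_order([3, 1, 3, 2, 1]))   # -> [3, 1, 2]
```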
Fredrik Lundh points out that someList += someItems should be even slightly faster than someList.extend(someItems), because it works in place and uses operator dispatch rather than a method lookup.
I think however that we had some places (fairly old code) which used someList += [item]. We probably saw some of the speed benefit switching these to use append.
(My colleague who did the optimisations contends that we did see a speed up when switching to using extend in IronPython.)
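The difference between the list operations is easy to check for yourself. In this sketch (the function grow exists only for the demonstration), the first three operations all modify the same list object, while plain + builds a new one.

```python
def grow(items):
    """Show which list operations work in place."""
    alias = items

    items += [3]           # in place, but builds a temporary one-item list
    items.append(4)        # in place, no temporary list
    items.extend([5, 6])   # in place, same effect as += with a list
    same_object = alias is items

    items = items + [7]    # NOT in place: creates a brand new list
    return same_object, alias, items

same_object, original, copy = grow([1, 2])
print(same_object)   # -> True
print(original)      # -> [1, 2, 3, 4, 5, 6]
print(copy)          # -> [1, 2, 3, 4, 5, 6, 7]
```

Which of += and extend is faster depends on the implementation, so it is worth timing both (for example with the timeit module) on the interpreter you actually deploy on.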
So nice straightforward performance benefits. Despite our usual mantra of not worrying about performance until we need to, we'll be sticking to these idioms in future. And yes we did know about potential string concatenation issues, but until now we'd never been bitten by it. It is interesting that this has been 'fixed' in CPython 2.4, but is still a major performance bottleneck in IronPython.
Working on this user story has identified two further changes we can make for performance improvements. The first will specifically work with the sorts of data we have in this test. It is fairly easy to implement, but is too much just before the demo.
The next change will mean a major change in the way Resolver works. It will effectively implement a 'partial recalculation on demand' that we have known we've been heading towards for a while; but now we can see the way fairly clearly.
Gradually evolving our architecture means that we can now see how this change will fit in with the rest of the Resolver structure. When we finally go for it, we have the whole of our functional tests to tell us when we have got it right.
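To give a flavour of what 'partial recalculation on demand' can mean in general, here is a toy sketch using dirty flags. It says nothing about how Resolver does or will do it: the Cell class and everything in it is hypothetical.

```python
class Cell:
    """A value that is recomputed only when asked for and only if stale."""

    def __init__(self, formula, inputs=()):
        self.formula = formula
        self.inputs = list(inputs)
        self.dependents = []
        self.evaluations = 0
        self._value = None
        self._dirty = True
        for cell in self.inputs:
            cell.dependents.append(self)

    def invalidate(self):
        """Mark this cell and everything downstream of it as stale."""
        if self._dirty:
            return  # dependents of a stale cell are already stale
        self._dirty = True
        for cell in self.dependents:
            cell.invalidate()

    def set_formula(self, formula):
        self.formula = formula
        self._dirty = False  # force propagation through invalidate()
        self.invalidate()

    def value(self):
        """Recalculate on demand; clean cells return the cached value."""
        if self._dirty:
            self._value = self.formula(*(c.value() for c in self.inputs))
            self._dirty = False
            self.evaluations += 1
        return self._value

a = Cell(lambda: 2)
b = Cell(lambda: 3)
total = Cell(lambda x, y: x + y, [a, b])
scaled = Cell(lambda y: y * 10, [b])

print(total.value(), scaled.value())   # -> 5 30
a.set_formula(lambda: 10)              # only 'a' and 'total' become stale
print(total.value(), scaled.value())   # -> 13 30
print(scaled.evaluations)              # -> 1
```

The point of the sketch is that changing one input touches only the cells that depend on it, and even those are not recomputed until something asks for their value.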
Resolver, in case I haven't mentioned it before, is a desktop application being developed by Resolver Systems using IronPython.
rest2web 0.5.0 Final
At last rest2web 0.5.0 Final is released.
This release has several bugfixes, as well as some interesting new features, over previous releases.
Important changes since the last release (0.5.0 Beta 1) include:
- All the standard macros are now built-in. There is no need for a separate macro file if you are only using the standard ones.
- A new 'skiperrors' config file / command line option. Errors in processing a file can now be ignored and rest2web will attempt to continue processing.
- A config file is no longer required in force mode. (The current directory is used as the source directory and html output is put into a subdirectory called 'html'.)
- The restindex and uservalues block may now be in a ReST comment. This means that rest2web source documents with a restindex can still be valid ReStructured Text documents.
What is rest2web?
rest2web is a tool for creating websites, parts of websites, and project documentation.
It allows you to keep your site contents in ReStructured Text or HTML.
It can then output the HTML for your site using a flexible templating system based on embedded Python code, which gives you unlimited flexibility with no new templating language to learn.
rest2web is extremely flexible, with many optional features, making it suitable for building all kinds of websites.
See the main page for links to some of the sites built with rest2web.
What's New?
You can find the full changelog: here.
Paths in the file keyword and in the config file now have '~' expanded. This means they can use paths relative to the user directory. (Plus the 'colorize' and 'include' macros.)
Added 'skiperrors' config file / command line option. Errors in processing a file can now be ignored and rest2web will attempt to continue processing.
Fixed bug where non-ascii uservalues would blow up.
There was a bug in handling tabs in embedded code. This has been fixed.
The macro system has been revamped. All the standard macros are now built in as default macros. The modules needed by the default macros are also now built into rest2web. You can still add your own macros, or override the default ones, by supplying an additional macros file.
Macro Paths section added to the config file for configuring the default macros smiley and emoticon.
The initial message printed by rest2web has been changed to INFO level, so that it is not displayed by the -a and -w verbosity levels.
The namespace and uservalues for each page are now available to the macros, using global variables uservalues and namespace (dictionaries). This means you can write macros that are customised for individual pages.
A config file is no longer required in force mode. (The current directory is used as the source directory and html output is put into a subdirectory called 'html'.)
The restindex and uservalues block may now be in a ReST comment. This means that rest2web source documents with a restindex can still be valid ReStructured Text documents.
Fixed imports in the gallery plugin. (Thanks to Steve Bethard.)
Changed over to use the latest version of StandOut.
rest2web now exits with an error code corresponding to the number of warnings and errors generated.
Errors and warnings are now output on sys.stderr.
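The 'skiperrors', exit code and sys.stderr changes fit together naturally. The sketch below shows one way such behaviour could be arranged; it is not taken from the rest2web source. The names build and exit_status are invented, and capping the status at 255 is an assumption, since the changelog does not say exactly how the counts map to the exit code.

```python
import io
import sys

def build(pages, render, skiperrors=False, stream=None):
    """Render every page, reporting failures and returning how many occurred."""
    stream = sys.stderr if stream is None else stream
    failures = 0
    for name in pages:
        try:
            render(name)
        except Exception as exc:
            failures += 1
            print(f"ERROR: {name}: {exc}", file=stream)
            if not skiperrors:
                break  # default behaviour: stop at the first error
    return failures

def exit_status(failures):
    """Map a failure count to a process exit status (assumed cap of 255)."""
    return min(failures, 255)

def flaky_render(name):
    """Stand-in renderer that fails for some pages."""
    if name.startswith("bad"):
        raise ValueError("could not process page")

log = io.StringIO()  # stands in for sys.stderr in this demonstration
failures = build(["index.txt", "bad.txt", "about.txt"], flaky_render,
                 skiperrors=True, stream=log)
print(failures, exit_status(failures))   # -> 1 1
```

A command line entry point would finish with sys.exit(exit_status(failures)), so that scripts calling it can tell whether the build was clean.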
This work is licensed under a Creative Commons Attribution-Share Alike 2.0 License.