Python Programming, news on the Voidspace Python Projects and all things techie.
My father has a background in engineering. His PhD was a process control system for petrochemical plants, written in Fortran if I remember correctly.
He now works in safety consultancy for computer-related systems, or something like that, but he retains an interest in software engineering.
He recently sent me an email about software specification.
I have just received a newsletter from LDRA, who sell tools for developing software that is used mainly in military and aerospace. LDRA claim that 70% of software defects found by their customers are requirements related. Allowing for sales hype (LDRA sell a requirements traceability tool), this is reasonably consistent with a 1990s HSE study of accidents in the process industry, reported in "Out of Control", which showed the primary cause of accidents related to control systems as 44% "Inadequate Specification".
I would be interested in your comments on this. Is "about half of software defects are requirements related" consistent with your experience to date?
I've recently been trying to evaluate the 'agile development' techniques we use at Resolver. I've not worked on large scale projects, and it is much easier to use techniques like test driven development on a completely new program. Below is my answer, in which I attempt to articulate what I think about eXtreme Programming as opposed to the 'big design up front' methodologies which cause these problems.
Software specification is 'difficult'.
This sounds like a problem with the 'big design up front' methodology. This is where the first stage of software creation is a full design and specification. If any of this is partially, ambiguously or incorrectly specified then the software will be faulty. As it is almost impossible to fully understand all the ramifications of a theoretical program implementation, mistakes are almost inevitable.
Agile development aims to provide an alternative.
Instead of spending the first few months of a project creating a massive specification document the project is divided into units corresponding to features: user stories.
These are specified by the customer. As each one is implemented by the software team a working, but only partial, program is always available.
The functional test for a user story tests all aspects of the feature as specified by the customer. More importantly, the customer has the implementation to test and use. Oversights and missing aspects that weren't apparent in the specification are more likely to be visible in the implementation.
The unit test framework should test every aspect of the implementation (as opposed to the functionality). There are tools being developed that can turn the unit test framework into a human readable software specification.
Agile development works well for an open ended project like Resolver. At every stage we have a working 'deliverable', and features are continually added. Customer involvement in the design and testing process means that 'specification errors' are much less likely.
For a large engineering project it is less easy to imagine agile development being approved, whether or not it would even be feasible. Safety regulations and legal obligations require that a minimum amount of specification be met: this minimum is likely to be substantial.
On top of that, the time and money spent on the project is determined by the customer rather than estimated by the development team. Individual features are 'costed' by the team, and they are prioritised by the customer. That way the team will always be working on the customer's most important feature.
That means that no 'up front' cost or time budget is produced. Despite the fact that almost every major engineering software project is over time and over budget, this fact alone means that full scale 'agile development' is unlikely to be accepted for large engineering projects.
Obviously these techniques do not obviate the need for specification, they just break it up into more manageable and testable units. We still have problems getting our customer representative to write properly specified user stories ! As a result the features don't always do what he had in mind, so he is forced to re-specify. At least this system puts the responsibility for specification where it belongs, and provides quick feedback on the results.
The difficult part is to articulate why I think an iterative approach should produce a better system than a full initial specification. An initial design process should give you an overview of architecture which prevents you going up dead ends during the implementation phase.
The difficulty is that the practice of implementation is very different from any theoretical overview. An iterative process means that you need a flexible architecture which can be refactored as the specification grows. You should be working with the best implementation at every stage of the process. (Full test coverage makes refactoring much less scary.) The system you end up with can be very different from the architecture you would have created during an imaginary initial design phase.
The difficulty with an initial design phase is that you have to account for difficulties that might not arise, and you can't account for the real problems you'll have.
My Dad's reply :
Michael, very clear and I agree with your comments. The other reason we need a formal spec is that you are doing continuous "verification", and we often want both "independent verification", which should also be easy for agile methods (BTW - I think of methodology as the study of methods!), and also "validation" of the current complete user spec against the current version of the system. I assume "validation" would also be OK for you, as it sounds as if you are doing "complete" testing at every stage?
The end result of BDUF and an iterative agile process should be similar, except that you have a differently (hopefully better) designed product along with full test coverage. The functional tests ought to act as validation of the user specs.
Additionally the tests must be written before the code. You should never have any untested (or unspecified) code in the official codebase. Code can only be checked in if the full test suite passes.
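The test-first rule above can be sketched in a few lines. This is a minimal, invented illustration (nothing to do with Resolver's real code): the test exists before the function it exercises, and checking in requires the whole suite to pass.

```python
# A minimal sketch of the test-first rule, using an invented example
# function. The test is written (and fails) before the code exists.
import unittest

def add_vat(price_in_pence):
    # Written only after the test below existed, and failed.
    return price_in_pence + price_in_pence * 175 // 1000

class AddVatTest(unittest.TestCase):
    # The test comes first; nothing is checked in until it passes.
    def test_add_vat(self):
        self.assertEqual(add_vat(1000), 1175)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddVatTest)
unittest.TextTestRunner(verbosity=0).run(suite)
```

The test doubles as a specification of the feature: anyone reading it knows exactly what the code is meant to do.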
|||My Dad says it never worked, my Grandfather says it was groundbreaking for its time. As to which is true, probably both.|
|||References in the paper Can Technology Eliminate Human Error?.|
|||Although developing safety specifications in an iterative manner sounds more than feasible.|
Categories: General Programming
IronPython for ASP.NET
ASP.NET runs compiled code, which doesn't fit well with IronPython. This is a shame, because delivering web apps is one of the things Python does best.
Microsoft have addressed this with their latest CTP: IronPython for ASP.NET.
With Microsoft IronPython for ASP.NET, developers have the ability to create compelling web applications in the popular dynamic language for .NET, IronPython, using Visual Studio or the free Visual Web Developer.
IronPython for ASP.NET is a free extension to ASP.NET that is targeted at:
- ASP.NET developers looking to enjoy the simplicity and flexibility of a dynamic language, specifically IronPython; and
- Python developers looking to harness the power of ASP.NET and its rapid application development (RAD) environment.
This is a great step forward for IronPython.
IronPython Console & Tab Completion
There are many reasons for having a blog. Self-aggrandisement is only one of them.
One of the more useful reasons is that a blog is a good place to put information I want to be able to find again.
The IronPython console provides tab completion and colour highlighting, but they aren't enabled by default. To switch them on, run the console with the command line :
ipy -D -X:TabCompletion -X:ColorfulConsole
Windows File Weirdness
Did you know this ?
It works whatever the current directory. Now if this really was the lazy web, someone would tell me why.
(Hmmm... 'nul' I can guess, that doesn't mean it isn't odd. Try saving a file with the name 'nul'.)
IronPython, Unmanaged Code and Ascii Art
This article has moved.
You can find the whole tutorial series at IronPython & Windows Forms.
As I may have mentioned before, we run continuous integration at work.
That means that on a checkin to the subversion repository, a fresh check-out is done on our integration machine and a full build is run.
As well as being a sanity check that our build isn't broken, this process runs PyLint over our codebase, rebuilds the C# parts of Resolver and creates a fresh installer for the latest build.
For continuous integration we have been using Cruise Control .NET. When it works Cruise Control is great, but over the months it has caused a bevy of minor headaches.
The final straw came today, and we decided to see if we could replace it with a custom script. (This is the aforementioned CPython I was working on today that spawned the previous blog entry.)
We got a script together, combined with a subversion post commit hook (which are ludicrously easy to create), in about two hours and eighty lines of Python, including blank lines.
My boss won't let me show you the code, mainly because it is so young and probably needs a host of minor changes (he can't face the embarrassment!).
When we check in it does a fresh checkout, runs the build and emails the developers with the results.
It uses subprocess and the svn command line tool to do the checkout.
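The shape of such a script is simple. This is a hedged sketch only: every name, path and command below is invented for illustration, since the real script isn't public.

```python
# A sketch of the kind of continuous-integration script described above.
# All names, paths and commands are invented; the real script is private.
import subprocess
import sys

def run(command):
    """Run a command, returning (exit code, combined output)."""
    proc = subprocess.run(command, capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr

def integrate(repo_url, working_dir):
    """Fresh checkout, then full build; returns a report for the email."""
    code, output = run(['svn', 'checkout', repo_url, working_dir])
    if code != 0:
        return 'CHECKOUT FAILED\n' + output
    # The build step would run the test suite, PyLint and the installer.
    code, output = run([sys.executable, 'build.py'])
    return ('BUILD PASSED\n' if code == 0 else 'BUILD FAILED\n') + output

if __name__ == '__main__':
    # Demonstrate 'run' with a harmless command instead of svn.
    code, output = run([sys.executable, '-c', "print('hello')"])
    print(code, output.strip())
```

The post-commit hook only has to invoke something like `integrate(...)` and mail the returned report to the team.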
Now don't get me wrong. I'm sure that there are lots of features of CC that we weren't using, and lots of minor ones that we will miss. This script will undoubtedly grow, but it will be a doddle to customise.
We won't miss the multi-megabyte XML log files that CC.NET generates; being emailed the output of the build script is much more useful. Being able to query the database to see the state of any current build, including seeing tracebacks, minimises the infrastructure needed on the integration side.
One feature we will miss is a system tray plugin that displays a little bubble with success or failure when a build finishes. Time to work out how to do that.
|||Recent changes to our testrunner push the results into a database as the tests are running.|
Learning to Program
I've now been working with Resolver for over six months. I'm still really enjoying it.
It's been a great learning experience. The most significant things I've learned are :
Test driven development.
How to unit test and do functional testing.
Writing code for unit tests is very different to banging out code and fitting tests afterwards. Making decisions about what to test and how to minimise dependencies in what you test is also a learned skill. Mocks and monkey patching rule.
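Monkey patching in a test looks like this. The function under test here is an invented example: a dependency (the clock) is replaced with a predictable fake for the duration of the test, then always restored.

```python
# An invented example of monkey patching a dependency in a unit test:
# the clock is replaced with a predictable fake, then restored.
import time
import unittest

def make_timestamped_name(prefix):
    # Code under test: depends on the current time, so it is awkward
    # to assert against without patching.
    return '%s-%d' % (prefix, int(time.time()))

class TimestampedNameTest(unittest.TestCase):
    def test_uses_current_time(self):
        original = time.time
        time.time = lambda: 1000  # the monkey patch
        try:
            self.assertEqual(make_timestamped_name('build'), 'build-1000')
        finally:
            time.time = original  # restore, whatever happens

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TimestampedNameTest)
unittest.TextTestRunner(verbosity=0).run(suite)
```

The try/finally restore is the learned skill: a patch left in place makes every later test untrustworthy.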
Learning how to think about architecture in a project much bigger than my personal projects has been a great experience. Minimising dependencies, keeping a clean and elegant structure and agonising over where code belongs.
But these things are general, and not Python.
I've also learned some C#, worked with parsers and syntax trees, quite a lot of .NET, a bit of SQL and working with databases. But these aren't Python either.
Working with Python 2.4 has been refreshing though. I've always tried to keep my code compatible with Python 2.3 (or even 2.2) but the IronPython base is Python 2.4 and they are starting to implement Python 2.5 features. I'm now a firm fan of sets, decorators and generator expressions .
Sets and decorators are very useful. The main reason I like generator expressions is that this :
looks nicer than this :
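The original code samples are missing from this archive entry, so this is only a guess at the sort of comparison meant: a generator expression passed straight to a function, versus the equivalent list comprehension with its extra brackets.

```python
# A guessed reconstruction of the missing comparison. The generator
# expression version:
total = sum(value * value for value in range(10))

# ...looks nicer than the list comprehension version, with its
# extra brackets:
total = sum([value * value for value in range(10)])

print(total)  # 285
```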
Having said all this (and thanks for your persistence if you have really read this far...) I feel like I passed a milestone last week when I finally wrote a working metaclass.
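For readers who haven't met one, a minimal working metaclass looks something like this (in modern Python 3 syntax; the registry idea is an invented illustration). A metaclass hooks class creation itself, so it can inspect or record every class built with it.

```python
# A minimal working metaclass: it runs at class-creation time.
# The registry is an invented illustration of what such a hook can do.
registry = []

class Registered(type):
    def __new__(meta, name, bases, namespace):
        cls = type.__new__(meta, name, bases, namespace)
        registry.append(name)  # runs once, when the class is created
        return cls

class Example(metaclass=Registered):
    pass

print(registry)  # ['Example']
```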
My apologies for the ramble, but I got thinking about this after coding some CPython at work today. Coding with IronPython is different to coding with CPython. A lot of what I've been doing recently has been writing (documentation, proposals and stuff) so I'd forgotten how much I normally use the standard library. Even though we have the standard library available with IronPython, we don't seem to use it very much.
It was only an eighty line script (I'll tell you about it in my next blog entry, which will be shorter than this I promise), but I felt much more up to speed zipping through the Python manual with its familiar modules than I do wading through the MSDN documentation (which isn't bad so don't misinterpret this).
The code was working with files and paths. Can you remember off the top of your head which modules the following functions live in ?
I've used all of these a lot, but remembering whether they are in os, os.path or shutil is a pain. It's such a shame that the proposal to move path.py into the standard library got rejected : a clean API is important.
My pet peeve is os.path.join
>>> os.path.join('c:\\dev\\build', '../')
'c:\\dev\\build\\../'
WTF ? Oh yeah, I need to call normpath :
>>> os.path.normpath(os.path.join('c:\\dev\\build', '../'))
'c:\\dev'
Why on earth would anyone want to call join without having normpath called ? Assuming I want explicit use of module names where I use them, why do I have to make the lines infeasibly long and remember to call normpath everywhere ? sigh
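One obvious workaround (my own two-line helper, not anything in the standard library) is a join that normalises as it goes:

```python
# A tiny helper (not stdlib): os.path.join followed by os.path.normpath.
import os

def joinpath(*parts):
    """Join path components and normalise the result in one call."""
    return os.path.normpath(os.path.join(*parts))

print(joinpath('dev/build', '..'))  # dev
```

Not a fix for the API, but at least the normpath call only has to be remembered once.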
|||I also use properties and lambdas a lot more than I used to as a result of using them at work.|
|||And yes I do know the real difference between a generator expression and a list comprehension. However, I've not yet used a generator expression because of optimisation reasons (memory or otherwise). They do look better though.|
This year 'they' received 102 proposals and are able to accept 50-60 of them.
Andrzej Krzywda  and I have put forward one talk proposal and two tutorial proposals.
Andrzej started out as a Java programmer in Poland, including heavy doses of AspectJ. He graduated to agile development with Ruby (well, Rails more than Ruby), and now says he regrets the time spent on AspectJ, as it mainly addresses limitations in Java and AOP doesn't have much to offer 'dynamic languages' .
Andrzej has worked at Resolver Systems for around nine months, programming with IronPython. He has a keen eye for agile development and is constantly trying to improve our test framework and development techniques. Whenever we have a question about whether to test something, or how much to test, Andrzej is the one we turn to.
Andrzej has previously given the following presentations:
- The Academic IT Festival (Poland) 2006 - Developing with Ruby on Rails (including live coding)
- National Software Engineering Conference Poland (2004) - Aspect Oriented Programming with Java
Me, I'm just a Python fanboy.
(Hey, does this sound like an advert ? It wasn't meant to. Oh, and Andrzej will be thoroughly embarrassed by all this attention: but he's a great programmer.)
Anyway, the two tutorials proposed are :
- Developing with IronPython and Windows Forms
- Test Driven Development
I'm kind of wondering whether it's good form to post the proposals here ? I can't see anyone else doing it, but I think they're good proposals and worthy subjects.
Due to the pressure of preparing a decent tutorial (a walk through application in both cases, including developing a test-suite alongside the application for the TDD proposal) we can only do one of the tutorials. Hey, that is assuming that one of them gets accepted. Needless to say Andrzej hopes the TDD proposal will be accepted and I hope that the IronPython one will fly.
The talk proposal is a half hour talk on developing with IronPython. This is probably one of the bemoaned case studies.
It's hard to see how much useful detail (teaching) can be done in half an hour, but probably a lot more than I imagine. At least we hope our talk (if accepted) will usefully equip potential IronPython programmers as well as provide an overview.
|||Who doesn't have a website and still owes me a good Ruby-Python comparison to publish.|
|||I apologise heartily in advance for mis-quoting Andrzej on subjects I don't understand, but that's what I heard when he was talking about Aspect Oriented Programming.|
A Python in your Pocket
The PythonCE project languished for a while, but has recently been revived by Luke Dunstan who has ported Python 2.5 (including ctypes and pysqlite). Tkinter is also available, but needs to be downloaded separately.
Everything seems to work fairly well, except that sdl_image and sdl_mixer don't seem to want to work. Anyway, you can draw to the screen and get user input (softinput works, except there is no esc key!).
The Python 2.4 binary is reported to work with Windows Mobile 5 (ctypes available as well), but I've not heard any specific reports about the new 2.5 binary. It would be nice to get some confirmation of this.
Every so often Seo Sanghyeon has been emailing lists of IronPython related links to the IronPython mailing list.
Mark Rees has collected these together in a new blog, with an RSS feed :
On the same subject I will hopefully have some exciting news within the next couple of weeks or so. Stay tuned.
SQL Server 2005, Triggers and a Big Headache
Today Christian and I spent part of the day wrestling with SQL Server 2005. It was probably only the afternoon, but it feels like it took about three weeks.
SQL Server 2005 has CLR integration. That means you can write code in C#, and cause it to be run when a database or table is created or modified. This is called a trigger.
We have a trigger that we want to be executed when any of various tables are changed, specifically when they are modified via UPDATE, DELETE or INSERT.
We have a trigger that works fine, but with the minor drawback that it has the table name hardcoded into it. This means that adding a new table requires either creating a new database with the same table name, or re-compiling the trigger assembly with the new table name. We wanted to create a generic trigger, which we could add to the table with T-SQL commands and could work out for itself which table triggered it.
Simple hey ? Not on your nelly.
Now in theory a trigger has access to information about what caused the trigger. This is through the SqlContext Class, which should give you access to the SqlTriggerContext, through its TriggerContext Property.
From there we should be able to access the EventData, which according to Microsoft :
Gets the event data specific to the action that fired the trigger.
So how do they choose to provide this information ? Obviously they provide an SqlXml instance, containing the information in an undocumented XML format ! Great.
After wasting some time looking for examples of the way this XML stores information, we decided just to try it. Except it didn't work; we kept on getting null back for the event data.
A bit more digging and it turns out that this is only filled in for DDL queries: Data Definition Language queries. That means when you create a database or table. We wanted information about DML operations: Data Manipulation Language queries.
So there must be some other way to find out, from inside the trigger, the table name that caused the trigger. Well after more googling we found several people asking the same question, and several people saying it couldn't be done. Until we found a comment on this blog entry.
With a little bit of magic you can find out which table is locked by the trigger. So long as your trigger doesn't hold any other locks, which ours doesn't, it works :
SqlConnection conn = new SqlConnection("Context Connection=true");
conn.Open();
SqlCommand cmd = new SqlCommand("select object_name(resource_associated_entity_id) from sys.dm_tran_locks where request_session_id = @@spid and resource_type = 'OBJECT'", conn);
SqlContext.Pipe.ExecuteAndSend(cmd);
But although this worked when we tried it, it didn't work in production for us; we got permission errors. So we finally got it working as a stored procedure, executed as the database owner, where we've set permissions for the users to do this. phew
Much kudos to Christian for getting all this working.
The final solution looks like this :
CREATE PROCEDURE dbo.GetCurrentTableName
WITH EXECUTE AS OWNER
AS
BEGIN
    SELECT OBJECT_NAME(RESOURCE_ASSOCIATED_ENTITY_ID)
    FROM SYS.DM_TRAN_LOCKS
    WHERE REQUEST_SESSION_ID = @@SPID
    AND RESOURCE_TYPE = 'OBJECT'
END
GO

GRANT EXECUTE ON DBO.GETCURRENTTABLENAME TO TESTUSER
GO
It looks messy, but it's all run from a script, so creating new tables and databases with our trigger installed is now easy. If you think that sounded painful though, it was much worse...
Continuous Integration is 1337
At work we run continuous integration. All tests, functional and unit, should pass before checking in. After check-in the tests are run on the integration machine .
At least that's the theory. Over the last few weeks the number of randomly failing tests has built up to the point where most test runs fail. So last week we changed the rules: if you see a random failure it's your job to fix it.
At the same time we checked in a new addition to the test runner implemented by Christian. It pushes test results into a database whilst the build is running. This means that we don't need to wait for the end of the build before we can see tracebacks for failing tests, and through a webapp we can examine them whilst the tests are still active on the machine running them.
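The idea is straightforward even if the real implementation isn't public. A hedged sketch, with an invented schema: each result is committed the moment it happens, so a web app querying the same database can list failures while the run is still going.

```python
# A sketch (invented schema) of pushing test results into a database
# as they happen, so failures are queryable mid-run.
import sqlite3

class ResultRecorder:
    def __init__(self, path=':memory:'):
        # In real use 'path' would be a shared database file.
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            'CREATE TABLE IF NOT EXISTS results '
            '(test TEXT, passed INTEGER, traceback TEXT)')

    def record(self, test_name, passed, traceback_text=''):
        self.conn.execute('INSERT INTO results VALUES (?, ?, ?)',
                          (test_name, int(passed), traceback_text))
        self.conn.commit()  # make the row visible to other connections

    def failures(self):
        return self.conn.execute(
            'SELECT test, traceback FROM results WHERE passed = 0'
        ).fetchall()
```

Hooking something like `record` into the test runner's per-test callback is all the integration needed.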
When a build passes, it shows up in green.
Over the last few days we've gradually resolved the problems, and today Christian and I were finally able to checkin some work of ours.
Green means pass, the high number is how many tests passed (all of them for a green run of course), and the zero is how many failures.
|||Using Cruise Control .NET.|
|||It doesn't take many tests failing one time in ten to do this, but it does make them harder to diagnose...|
Firefox 2 Woes
I see that I'm not the only one who has had problems with firefox 2.
I upgraded yesterday only to find that tabs and my bookmarks had stopped working altogether.
I suspected that my extensions were the problem, and indeed they all seemed to think that they were still running on Firefox 1.5. Nothing I could do would coerce them into upgrading, so I uninstalled them all. This didn't help.
I uninstalled Firefox, and re-installed version 2, then reinstalled all the extensions that were compatible with the new Firefox. All seems to be well now, but it was half an hour of pain I could have done without.
Whilst I was sorting Firefox I tried IE 7. It's quite nice. Now that Firefox is working again it feels much more familiar to me, so I won't switch. Besides which the find dialog is so much better, all find dialogs should work like that.
Earliest Reference to a Dictionary ?
In another response to a blog entry, this one about the Dictionary as a Basic Datatype, I had an email from Mike Huffman. He's found a reference to a data structure called a dictionary going back to 1983. Unsurprisingly it was from an Addison-Wesley book.
"A set ADT with the operations INSERT, DELETE, and MEMBER has been given the name dictionary."
Mike says that a few pages later the implementation uses a hash table.
In Python the initial set implementation was based on the dictionary. Perhaps the genesis of the dictionary was the reverse of this...
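A stripped-down sketch of that approach: the elements of the set are simply the keys of an internal dict, with the values ignored (as in the old sets.py module, before set became a built-in type).

```python
# A stripped-down sketch of a set built on a dictionary: the elements
# are the dict's keys; the values are ignored.
class DictBackedSet:
    def __init__(self, iterable=()):
        self._data = {}
        for element in iterable:
            self._data[element] = True

    def add(self, element):
        self._data[element] = True

    def discard(self, element):
        self._data.pop(element, None)

    def __contains__(self, element):
        return element in self._data

    def __len__(self):
        return len(self._data)
```

Because dict keys are unique and hashed, membership tests and de-duplication come for free.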
Another Python Charting Contender
I guess this is what they call the lazy web.
Several of you who read my entry on Python and charting components recommended Chart Director.
It's commercial, cross platform and with bindings to several different languages. As it has Python and .NET bindings, it can almost certainly be used from both CPython and IronPython.
The gallery of examples looks very impressive :
The developer and redistribution prices are also reasonable as these things go. It looks like you only need a single redistribution license per company, and one developer license per seat. (Most companies offer only a single kind of license, which includes redistribution rights and must be purchased per developer.)
Of course one obvious way to solve the no-one-GUI-is-really-suitable-for-multiple-platforms problem is to have a decent MVC architecture for your program.
To target a new platform you need new view code and some changes in the controller, but your model (if you've done it right) should be unchanged. Of course this means more code to maintain, but you can have the optimum user interface for multiple platforms.
Python Issue Tracker
Well, Roundup's position as the new Python issue tracker is now secure.
The process of transition away from the SF tracker has begun...
Python, GUI Toolkits and Charting
I'm looking to start a Python project that will involve creating charts from data-sets.
I've been wondering which GUI toolkit to use and what charting packages are available.
Tkinter is not difficult to use, but I don't find it idiomatic (a posher alternative word for Pythonic). Although functional, complex GUIs are well within the reach of Tkinter, creating good looking ones requires wrestling with it.
wxPython is allegedly 'difficult'. I've only used it via the Wax wrapper layer. The parts of Wax I've used have been straightforward and great, but Wax is no longer actively developed. It is small enough that I could maintain it, and extend the parts I need, myself if I decided to go with it. Dabo looks like a good alternative wrapper, but is a much bigger framework on top of wxPython.
The situation for building desktop applications with Python is not ideal, and let's face it, there aren't many people doing it.
I've not seen a complex Python application with a professional looking GUI. Funnily enough the same is not true for Java. Azureus is a Java bittorrent client, and it is exceptionally good looking.
My ideal requirements are only the usual suspects :
- Under active development
- Cross platform
- Native looking on Windows, Linux and Mac OS X
- No external dependencies that can't be bundled with the program
- A license similar to the Python, BSD or MIT license
Basically none of these toolkits scores well on all of these points, but wxPython seems to score the best. (In my subjective opinion.)
I've found a few likely looking charting projects.
This is the basic wxPython plotting component. It used to be called wxPyPlot.
It claims to require less CPU resources than the alternatives, and has fewer dependencies.
This is one of the most well known Python charting packages. It requires numpy. It works with Tkinter and wxPython.
This has Python bindings. It's not quite what I'm looking for as it is more geared towards processing and displaying real time data, but the image quality looks good.
One obvious alternative for me, would be to use IronPython and Windows Forms. Windows Forms is a fantastic GUI framework and development language. For professional looking applications it has no equal in the Python world.
The dotnet world has a very different philosophy when it comes to large 'third party' packages. The dotnet framework has many more users than Python, so there is a thriving community including many free projects. Large packages are much more likely to be commercial though, with the quality and price covering the whole spectrum.
As a good example, Syncfusion is a company with 120 employees which produces components for the .NET framework, both Windows Forms and ASP.NET.
If you look at the samples, there are lots more possibilities (including image export, statistical support, user interaction, tooltips and more). The quality is excellent.
Unfortunately IronPython and Windows Forms breaks two of my ideal requirements. It isn't cross platform and it requires the .NET framework 2.0 redistributable to be installed. The choice of platform for a new project is vital. Paul Graham lists choosing the wrong platform as reason number 7 why new startups fail. It's a costly mistake (in terms of time at least for a personal project) to rectify as well.
I can't find any (non-commercial) functional test frameworks for Python GUIs other than PyGUIUnit, which is for PyQt. It doesn't look too hard to extend for another toolkit, but it would be nicer to just use the beautifully crafted Resolver framework.
I'm not likely to start this project for another year, so I don't have to make any quick decisions, but it's an interesting subject.
|||The word framework has fallen out of favour in recent days...|
|||Windows Forms is at least partly implemented in Mono, but most third-party components don't work with it (at least yet anyway).|
Mobile Entertainment, Geocaching and Georeferencing
There are two new articles, by Rene Tse, in the Technology Section of
The first article is really two articles combined in one. The first part is about an interesting hobby with portable GPS devices, Geocaching. The second part illustrates georeferencing, associating pictures with location data, with a review of the Navman iCN 750.
The second article is about mobile entertainment. It looks at various different forms of content that you can access with your PDA, smartphone or mobile device. It includes a review of the possibilities (like streaming TV over Wi-fi), and several different programs you can use to access them.
Dictionary as a Basic Datatype
Over the summer we had a dotnet genius intern working with us called Max. He thinks that the term 'dictionary' was chosen by the CLR team (and recently Java as well) because it is the standard 'computer science' term for datatypes which store values by key. A typical implementation is an associative array, but it could also be a linked list or indeed anything else. The classic definition of a dictionary does not specify any particular implementation.
In lieu of other hobbies, I've taken to wondering how the term 'dictionary' became a standard part of computer science terminology ?
One of the reasons is that 'dictionary' is a good metaphor: you can get a very good idea of what it does from the name. So 'dictionary' is a good fit for a data structure that stores values using a key, but I suspect that the term originates in a single implementation or white paper.
The dictionary has been in Python since the beginning. In fact Python inherited it, including the syntax, from ABC. ABC was the language that Guido van Rossum worked on, and which inspired him to create Python.
Back in the early nineteen-eighties, when Guido worked on ABC, dictionaries were called tables. Guido started work on Python in nineteen-ninety, by which time (I assume) they had become dictionaries. So when (and how) did the term 'dictionary' come into use in computer science ?
According to Max :
Windows has included a Dictionary object as part of its standard library of ActiveX controls for use with WSH, VBS or whatever in the form of Scripting.Dictionary since at least 1998.
The BCL didn't use this term until the 2.0 release, where the version of the Hashtable that implements generics support was given the "Dictionary" name rather than retaining the Hashtable moniker (as Java did in its standard library when it implemented generics: HashSet begat HashSet&lt;T&gt; and so on). This was part of a wider move in the BCL to using standard computer-sciency datatype names, as the genericised ArrayList was renamed to simply "List".
Anyone got any ideas ?
|||Object attributes, including modules, are stored in dictionaries, usually as a __dict__ attribute.|
|||By the way, interfaces may be part of the BSDM philosophy of programming; but sometimes they are a useful alternative to strong typing. For example, sometimes when implementing an API I've needed to know whether an object passed in is a list or a dictionary. As there is no way of telling from a Python object whether it supports the mapping API or the sequence API I had to fall back on typing.|
This work is licensed under a Creative Commons Attribution-Share Alike 2.0 License.