Python Programming, news on the Voidspace Python Projects and all things techie.

Lococast PyCon Podcast

Whilst I was at PyCon (which was awesome by the way) I recorded an interview with Rick Harding from the Lococast podcast. It's a half-hour ramble around IronPython, testing, PyCon, working for Canonical, choice of operating system and other topics. It was good fun to chat to Rick and hopefully almost as fun to listen to:

Like this post? Digg it or Del.icio.us it.

Posted by Fuzzyman on 2011-03-22 00:27:23 | |



AMD64 or X86-64?

Up until recently the 64bit Windows binaries of Python were labelled as being for "AMD64" processors. If you know the history of this architecture then you know exactly what this means, but at least one person was confused and emailed in to ask:

Should I use the AMD64 version of Python on an Intel 64 chip? I know those 64-bit implementations are very similar, but are they similar enough that your AMD64 will work on Intel?

Christian Heimes offered this reply, and suggested an update to the download pages:

The installer works on all AMD64 compatible Intel CPUs.

As you most likely know all modern Intel 64bit CPUs are based on AMD's x86-64 design. Only the Itanium family is based on the other Intel 64bit design: IA-64. The name AMD64 was chosen because most Linux and BSD distributions call it so. The name 'AMD64' has caused confusion in the past because users thought that the installer works only on AMD CPUs.

How about:

  • Python 2.6.4 Windows X86-64 installer (Windows AMD64 / Intel 64 / X86-64 binary -- does not include source)

Martin von Loewis (one of the senior core Python developers, with particular responsibility for the Windows releases) objected to the use of the term "X86-64" to describe this architecture:

AMD doesn't want us to use the term x86-64 anymore, but wants us to use AMD64 instead. I think we should comply - they invented the architecture, so they have the right to give it a name. Neither Microsoft nor Intel have such a right.

As a member of the Python.org webmaster team I was concerned that the descriptions should be as useful as possible, and I am not particularly interested in AMD vs Intel politics. I did a bit of digging to see if X86-64 was now a sufficiently generic term for the AMD64 architecture. Here's what I came up with - none of it individually conclusive perhaps, but all of it indicative of how people understand and use the term:

In conclusion, referring to the AMD64 build as x86-64, with a footnote explaining which architectures this specifically means, is unlikely to confuse people. It is definitely better than just saying AMD64.


Posted by Fuzzyman on 2010-03-16 19:24:26 | |



ctypes for C#? Dynamic P/Invoke with .NET 4

One of the big new features of .NET / C# 4.0 is the introduction of the dynamic keyword. This is the fruit of Microsoft employing Jim Hugunin as an architect for the Common Language Runtime, charged with making .NET a better platform for multiple languages, including dynamic languages.

dynamic is a new static type for C#. It allows you to statically declare that objects are dynamic and that all operations on them are to be performed at runtime. This delegates to the Dynamic Language Runtime, at least parts of which are becoming a standard part of the .NET framework in version 4.0. One of the main use cases is to simplify interaction between dynamic languages and statically typed languages. Currently, if you use IronPython or IronRuby to provide scripting in a .NET application, or write part of your application in a dynamic language, all interaction with dynamic objects from a statically typed language (C# / VB.NET / F#) has to be done through the DLR hosting API. This is not difficult but can feel a bit clumsy. In C# 4.0 interacting with dynamic objects gets a lot easier.

dynamic can also be used for other purposes, like late bound COM or even duck-typing. Another interesting use is to create the same kinds of fluent interfaces that are both common and intuitive to use in dynamic languages. One example that most developers have worked with at some point is DOM traversal on the document object in JavaScript, where member resolution is done at runtime. You implement this in Ruby with method_missing and in Python the equivalent is __getattr__ (or __getattribute__ for the really brave). In C# you inherit from DynamicMetaObject and override BindInvokeMember.
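The Python version of this runtime member resolution is easy to sketch. Here's a minimal, purely illustrative example using __getattr__ (the class and field names are invented for the example):

```python
class DynamicRecord:
    """Resolve attribute access at runtime via __getattr__ -
    the Python hook mentioned above."""

    def __init__(self, **fields):
        self._fields = fields

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails,
        # so _fields itself is found the ordinary way
        try:
            return self._fields[name]
        except KeyError:
            raise AttributeError(name)


record = DynamicRecord(title='IronPython in Action', pages=496)
print(record.title)  # member resolved at runtime, declared nowhere
```

Accessing a field that was never supplied raises AttributeError, so the object still behaves like a normal Python object to callers.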

The Mono guys have been working on supporting dynamic in their implementation of .NET 4.0. Miguel de Icaza (lead developer of Mono) blogged about it recently: C# 4's Dynamic in Mono.

The .NET Foreign Function Interface (FFI - how you call into C libraries) is called P/Invoke (Platform Invoke). Included in Miguel's blog entry was an example, by Mono developer Marek, of creating a dynamic PInvoke. Miguel describes it as "similar to Python's ctype library in about 70 lines of C# leveraging LINQ Expression, the Dynamic Language Runtime and C#'s dynamic". Instead of writing wrapper methods for every function you call, they can be looked up at runtime!

The PInvoke class is demonstrated on Mono using libc and the printf function. The class itself compiles without change on the Visual Studio 2010 beta. To use it on Windows swap out libc for msvcrt and printf for wprintf:

dynamic d = new PInvoke("msvcrt");

for (int i = 0; i < 100; ++i)
{
    d.wprintf("Hello world, %d\n", i);
}

Very nice. Smile
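The comparison with ctypes is apt: the same loop written in Python against the real ctypes library looks like this. (A sketch: the library name is platform dependent - msvcrt on Windows, libc on POSIX systems - so this resolves it with find_library.)

```python
import ctypes
import os
from ctypes.util import find_library

# Resolve the C runtime for the current platform:
# "msvcrt" on Windows, libc elsewhere (an assumption; adjust as needed)
libname = find_library("msvcrt" if os.name == "nt" else "c")
libc = ctypes.CDLL(libname)

for i in range(3):
    # printf is looked up on the library object at runtime -
    # no wrapper declaration needed, just like the dynamic PInvoke above
    libc.printf(b"Hello world, %d\n", i)
```

As with the C# version, the function lookup happens when you first use the name rather than through a pre-declared wrapper method.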


Posted by Fuzzyman on 2009-08-14 17:14:19 | |



Return from finally

In several languages, like C# for example, a return inside a finally block is a syntax error. This isn't the case with Python. The reason languages choose to make it a syntax error is that it can lead to results that are surprising and arbitrary at best, and at worst code that simply doesn't do what it looks like it does.

Here are two examples of Python code that may or may not do what you expect:

>>> def f():
...     try:
...         return 3
...     finally:
...         return 4
...
>>> f()
4

This may or may not be the result you expect - it is likely you have just never considered the question and had no idea which to expect. The choice is arbitrary and it makes more sense just not to allow it.

Here's another, if anything worse, example:

>>> def f():
...     try:
...         raise Exception('foo')
...     finally:
...         return 4
...
>>> f()
4
>>>

A return inside a finally block silently swallows exceptions. This seems particularly egregious, as if this is the behaviour you want it is no effort to rewrite the code in a clearer form, where the intent is more obvious:

>>> def f():
...     try:
...         raise Exception('foo')
...     except:
...         return 4
...
>>> f()
4
>>>

It even happens in the following code:

def f():
    try:
        raise KeyError
    except KeyError:
        raise Exception
    finally:
        return 4

Yes, Python is normally very permissive, following the "we're all consenting adults" principle - but we don't have a goto and we don't allow programmers to monkey patch built-in types. There are few constructs in Python that are arbitrary, misleading or dangerous without a compelling use case, and return inside finally seems to me to be one of them.

As for silently swallowing exceptions (Errors should never pass silently. Unless explicitly silenced.), a break inside a finally does the same:

>>> def f():
...     while True:
...         try:
...             raise Exception('foo')
...         finally:
...             break
...     print 'got here'
...
>>> f()
got here

Oddly enough, whilst break swallows the exception, a continue in a finally is a syntax error and yield propagates the exception on resuming.
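A quick illustration of the yield behaviour (Python 3 syntax): the generator yields from inside the finally, and the pending exception only resurfaces when it is resumed.

```python
def f():
    try:
        raise ValueError('boom')
    finally:
        yield 4   # suspends execution inside the finally block

g = f()
print(next(g))    # the pending exception is held while we are suspended
try:
    next(g)       # resuming lets the finally block complete...
except ValueError:
    print('...and the original exception propagates')
```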

In my opinion these are unfortunate design decisions in Python and it would be better if a return inside a finally was a syntax error. I brought up the issue on the Python ideas mailing list and opinion was divided (now there's a surprise) but Guido said it isn't going to change - so that's that.

Note

I emailed the maintainers of Pylint suggesting that it would be good to have a Pylint checker that warned about use of break / return inside a finally as the exception swallowing may not be expected.

I received a reply from Sylvain Thenault saying "I've filed a ticket for this easy & nice to have functionality":

http://www.logilab.org/ticket/9776


Posted by Fuzzyman on 2009-07-18 12:44:07 | |



Open Source Licensing and Contributions

There have been discussions of the issues of licensing and accepting contributions to open source projects on the Python-Dev and Testing in Python mailing lists.

This is an area that can be very confusing, and potentially problematic for open source projects. Just because a project is licensed under a free software license doesn't automatically say anything about the status of code contributed to the project. The copyright of contributed code is owned by the person who wrote it and if you merge it with your project you create a derivative work owned jointly by all contributors. You can't license the work to others (or change the license) without the explicit permission of all those who own the copyright. The 'standard' ways round this are either to require all contributors to assign copyright to the project or to have all contributors sign an agreement licensing all their contributions to the project. The second approach is the one taken by the Python project.

This advice, applicable to Python itself, was posted to the Python-Dev mailing list by Martin von Loewis:

Van's advice is as follows:

There is no definite ruling on what constitutes "work" that is copyright-protected; estimates vary between 10 and 50 lines. Establishing a rule based on line limits is not supported by law. Formally, to be on the safe side, paperwork would be needed for any contribution (no matter how small); this is tedious and probably unnecessary, as the risk of somebody suing is small. Also, in that case, there would be a strong case for an implied license.

So his recommendation is to put the words

"By submitting a patch or bug report, you agree to license it under the Apache Software License, v. 2.0, and further agree that it may be relicensed as necessary for inclusion in Python or other downstream projects."

into the tracker; this should be sufficient for most cases. For committers, we should continue to require contributor forms.

Contributor forms can be electronic, but they need to name the parties, include a signature (including electronic), and include a company contribution agreement as necessary.

For more information on copyright and how it applies to open source development I highly recommend Van Lindberg's book, Intellectual Property and Open Source.


Posted by Fuzzyman on 2009-07-02 16:59:00 | |



Dynamic Languages and Architecture

I received an interesting question by email from Mark Bloodworth, an Architect Evangelist at Microsoft.

I've been interested in Dynamic Languages for a while now (I blog about Ruby and Python from time to time). I'm presenting at AIC in May about Dynamic Languages and architecture. As part of the preparation, I'm contacting a few people to get their views on where dynamic languages fit best. So, I'd be really interested to hear your thoughts on Dynamic Languages, where you think the sweet spot for their usage is. I'd also be interested to hear what you think about the impact using dynamic languages has.

As is my wont, I wrote far more than was warranted in reply and thought I would post it here for your edification.

Dynamic languages are general purpose languages; Python is used for, and suitable for, many of the same problem domains as C#. There are some problem domains where a statically typed language is preferable to a dynamically typed language - and vice versa - but in the realm of 'general programming' these are the exception rather than the rule.

In some of these domains, real-time programming for example, the requirement is not specifically about the type system (although strict type systems make provability easier) but about managing resources. You couldn't use Python for a real-time application because resource allocation and deallocation cannot be strictly controlled (the basic container types grow themselves in memory, for example) - but you shouldn't use C# or Java either, because garbage collection means that the time taken by any individual operation cannot be specified or controlled.

Note

The decision whether to use one language or another is always a trade-off of competing factors. I would argue that developers often make a less than optimal choice for flawed reasons, some of which I outline here.

Choosing one programming language over another is about balancing factors, and social factors are always a part of this. Forcing a team to use Python (or C#), even where it would otherwise be appropriate, would likely turn out to be a mistake...

Here's an attempt that you will certainly find unsatisfactory. Areas where you should use a dynamically typed language:

  • Any kind of scripting where you need runtime behaviour
  • Including system administration where creating an object oriented application just to move a bunch of files around is silly

Areas where you should use statically typed languages:

  • Problems which are subject to provability
  • Areas of code that you have already profiled and optimised and need to improve beyond the capabilities of a dynamically typed language

Many developers prefer dynamically typed languages for general programming, including those who have switched from languages like C++ and Java, because they are more flexible and enable simpler solutions. The more I use C# (and I quite like it as a language), the more I become convinced that static typing forces (and therefore encourages) more complex architectures. The simplest thing that could possibly work will usually be implemented in a language with a dynamic type system!

Along with this, dynamically typed languages are typically more concise and expressive - meaning the same concepts can usually be expressed with less code. For those fluent in the use of their tools (whether it be Eclipse or Visual Studio) the code may not take longer to write in a statically typed language - but it certainly takes longer to read. And typically you have to read code more often than you write it.

Readability counts - a slide from a presentation by Guido van Rossum

A few examples off the top of my head:

  • Generics in C# - elegant for sure, but actually a workaround for the problems caused by containers restricted on type. Heterogeneous containers make constructing complex data-structures trivially easy by composing built-in types.
  • Reflection - because of the type restrictions .NET reflection is massively more painful and complex than the beautifully simple introspection capabilities of languages like Python, Ruby and Smalltalk.
  • Dependency Injection and Inversion of Control - sometimes useful in their own right, but often used as a workaround to make testing possible. Not needed for these reasons in dynamically typed, late bound languages where you can override almost all behaviour at runtime for the purposes of testing.
  • Covariance and contravariance - wonderfully complex things to wrap your head around. Not even an issue in a language like Python.
  • Delegates - a workaround for not having first class functions.
  • Upcasting and downcasting - not needed if you have runtime behaviour.
  • Although C# is growing support for functional programming, and will maybe grow support for metaprogramming, these have been a strong part of the culture of dynamic languages for years and also continue to grow.

I could go on...
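The dependency injection point is easy to demonstrate. In a late bound language a collaborator can be replaced at runtime for the duration of a test, with no injection framework at all. A minimal sketch (the notify function and all addresses are invented for illustration):

```python
import smtplib

def notify(address, message):
    # Code under test: would talk to a real SMTP server in production
    server = smtplib.SMTP('localhost')
    server.sendmail('noreply@example.com', [address], message)
    server.quit()

# In the test, swap the class out at runtime - no constructor
# injection or IoC container required
sent = []

class FakeSMTP(object):
    def __init__(self, host):
        pass
    def sendmail(self, from_addr, to_addrs, msg):
        sent.append((from_addr, to_addrs, msg))
    def quit(self):
        pass

original = smtplib.SMTP
smtplib.SMTP = FakeSMTP   # monkey patch for the duration of the test
try:
    notify('user@example.com', 'hello')
finally:
    smtplib.SMTP = original  # always restore the real class

assert sent == [('noreply@example.com', ['user@example.com'], 'hello')]
```

The code under test needed no restructuring to become testable, which is exactly the point being made above.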

However it is the case that when you develop within a system you learn to think within that system. For programmers used to a static type system, they use the type system to think within and reason about the programs they are building. They look with pain on the idea of dynamically typed languages as it takes away part of how they reason about programming. Similarly those used to dynamically typed languages are much more used to thinking in terms of object behaviour rather than types. Having to shoe-horn this into a type system that feels rigid makes it much harder for them to reason about programming. This is why the two 'camps' fail to see eye to eye - they speak different languages and think in different ways. I do think there are objective differences as well, some of which I have already outlined.

Besides this though, many programmers have been taught that static typing is safer and required for programming large systems. They rule out the use of dynamic languages for reasons that are either not recognised by those who program large systems in dynamic languages or are only partly true:

  • Managing large systems in dynamic languages is impossible

    Those who have moved to dynamic languages from statically typed languages are horrified at the idea of going back to managing a large system in a statically typed language, with its more complex architectural requirements and less readable code.

  • Without type safety dynamic languages are only suitable for advanced programmers

    It's quite a compliment to call us all advanced programmers, but in fact these languages tend to be easier to learn and easier to read. This makes them more suitable for new programmers and more powerful for advanced programmers.

  • Without type safety you're more likely to have bugs

    Type safety only catches a very small proportion of all possible bugs, and largely the ones that are easiest to find. If you think that just because a program compiles it doesn't have bugs then perhaps you haven't been a programmer for very long! Yes, you can have runtime errors that a compiler would have caught; a minimal amount of testing will catch those (and automated tools like PyLint and PyChecker will also help). This does mean that testing is more important in dynamic languages - but I'm a firm believer that automated testing (and preferably test driven development) is preferable whatever language you are using. As dynamic languages make testing much simpler (see chapter 7 of IronPython in Action, which is available free online), if you are a strong believer in testing you will love dynamic languages.

  • Statically typed languages tend to be faster

    Actually this is generally true. The rub is that it is possible to write assembly code programs that run slower than Java, and Python programs that run faster than C#. Performance (in the raw) depends far more on the programmer than it does on the language. In general dynamically typed languages are 'fast enough', and with IronPython moving performance sensitive parts of your application into C# is generally easy. The faster you can produce your first version, the more time you have to spend on optimisation!

    At Resolver Systems we have looked at performance in Resolver One (a large application written almost entirely in IronPython) several times. Every time so far we have got the performance we were aiming for by improving our Python code and haven't yet had to drop down to C#. We will look at performance many times again in the future (in fact we're probably due for another round of optimisation) and maybe we'll have to move some code away from Python - but looking at our algorithms and improving them is always our first step.

    Ruling out a language because you don't think it will be fast enough is a premature optimisation.

  • Tool support is not so good

    In general this just plain isn't true (the idea of the IDE and refactoring tools originate in Smalltalk implementations after all). When people say this I often suspect that what they mean is that they are afraid of moving away from Visual Studio. I wrote a blog entry listing some of the tools for Python in particular: http://ironpython-urls.blogspot.com/2009/03/writing-ironpython-debugger.html

    It is true that because the type system isn't fully known until runtime you can't do some of the same static analysis (although there is an awful lot that a good tool can infer). Martin Fowler, who has seen Thoughtworks increase the number of projects done in Ruby over the last few years, recently said that programmers within Thoughtworks who move from Java to Ruby usually end up using editors like TextMate, Vim and Emacs (lightweight tools for lightweight languages), and that he has never heard of any of them missing the refactoring support.

Anyway - I'm not sure if this is what you were looking for, but it's enough typing for one email. I hope it is helpful or interesting.

Parts of that reply are verging on a rant of course. I think part of what I'm ranting against is the inherent complexity in many modern languages and runtimes. Some of the problems I mention, and some of the ones below, are at least partly (or even completely) orthogonal to the type system - but languages, their type systems, and runtimes they are predominantly used on are often so tightly bound together that it is impossible to fully distinguish them. The issues I see are typified in the mainstream statically typed languages: Java, C#, VB.NET, C++, C. The mainstream dynamic languages typically don't have this class of problem: PHP, Perl, Ruby, Python, Javascript.

Specific problems that I have actually encountered with C# and the .NET framework that can't happen with Python include:

  • The differences between value types and reference types - especially the problems around boxing of value types and mutable value types
  • Uninitialized reference types can be null (which can't happen in Python unless you explicitly assign something to be None)
  • Can't overload on return type (partly why you need Func and Action in .NET 3.5)
  • Can't cast between delegates with the same signature

I would include covariance and contravariance as issues that fall within the same 'inherent complexity' of the .NET system. The problem is that we programmers are prone to loving complexity, we mistake it for power. In fact the opposite is true, conceptually simpler systems tend to be more powerful.

With Python all variables are references, no value types, which removes a whole heap (pardon the pun) of complexity. Of course this is a trade-off - in particular having value types allows for certain optimisations. This means that complexity has been moved onto the programmer for the sake of the compiler. Modern JITs (like the one being explored in PyPy for example) can move the complexity into the runtime, allowing for the same optimisations.
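A tiny illustration of what "all variables are references" means in practice:

```python
# Every Python name is a reference: assignment binds a name to an object,
# it never copies the object
a = [1, 2]
b = a            # b now refers to the same list object as a
b.append(3)      # mutate through one name...
assert a == [1, 2, 3]   # ...and the change is visible through the other
assert a is b    # one object, not two equal copies
```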

Of course dynamic languages have problems too. A lot of the ones pointed out to me are around issues of programmer discipline. With more flexible languages you can do lots of things that won't work (but you won't be warned until runtime). This is one of the reasons that some .NET programmers have told me they think dynamic languages are only suitable for more advanced programmers. The reason I take issue with that is that one of the trade-offs you are making when you leave a dynamic type system is that you are moving to a system with more inherent complexity. This is not something that can possibly be better for less advanced programmers and can hardly make mistakes less likely!

Of course non-mainstream languages like the functional languages and those with more complicated type systems present whole new fields of problems for the beginner and experienced programmers alike. Steve Yegge talks a lot of sense on the subject in his article Rhinos and Tigers (Static Typing's Paper Tigers).

On the subject of language design, I like Jim Hugunin's [1] quote in his story of Jython:

Guido's sense of the aesthetics of language design is amazing. I've met many fine language designers who could build theoretically beautiful languages that no one would ever use, but Guido is one of those rare people who can build a language that is just slightly less theoretically beautiful but thereby is a joy to write programs in.
[1] Jim also wrote a great foreword to IronPython in Action.


Posted by Fuzzyman on 2009-04-19 23:59:10 | |



Sod This! Another Podcast

Sod This is a new podcast by well known .NET MVPs, DevExpress evangelists and all round (figuratively of course) raconteurs Gary Short and Oliver Sturm.

Episode 3 is now up, and it's an interview with me on dynamic languages in general and IronPython in particular. (Before becoming a .NET programmer Gary was a Smalltalk developer.)

The interview took place during the BASTA conference in Germany; in a bar, so the audio starts off a bit rough but improves as the interview progresses. I even reveal my mystery past and what I did before programming in Python.

Oh, and just for the record - I was the first Microsoft dynamic languages MVP. Smile


Posted by Fuzzyman on 2009-04-15 22:36:22 | |



Setting Registry Entries on Install

On Windows Vista setting registry entries on install is hard. It is likely that Microsoft don't care - the official guidance is to set entries on first run and not on install - but there are perfectly valid reasons to want to do this.

The problem is that for a non-admin user installation requires an administrator to authenticate, and the install then runs as the admin user, not as the original user. So if you set any registry keys under HKEY_CURRENT_USER they will be set for the wrong user and won't be visible to your application when the real user runs it.

The answer is to set keys in HKEY_LOCAL_MACHINE, which is not ideal but at least it works. The problem comes if you need write access to those registry keys; the non-admin user doesn't have write access to HKEY_LOCAL_MACHINE. You can set the permissions when you create the key though, but you need to know the magic incantations. In IronPython (easy to translate to C# if necessary) the requisite magic to allow write access to all authenticated users is:

from Microsoft.Win32 import Registry
from System.Security.AccessControl import RegistryAccessRule, RegistryRights, AccessControlType
from System.Security.Principal import SecurityIdentifier, WellKnownSidType

REG_KEY_PATH = "SOFTWARE\\SomeKey\\SomeSubKey"

# Create (or open) the key under HKEY_LOCAL_MACHINE
key = Registry.LocalMachine.CreateSubKey(REG_KEY_PATH)

# An access rule granting full control to all authenticated users
ra = RegistryAccessRule(SecurityIdentifier(WellKnownSidType.AuthenticatedUserSid, None),
                        RegistryRights.FullControl, AccessControlType.Allow)

# Read the key's security descriptor, add the rule and write it back
rs = key.GetAccessControl()
rs.AddAccessRule(ra)
key.SetAccessControl(rs)
key.Close()

Of course the next issue is what happens when you make your application run as 32bit on a 64bit OS (to work around, in part, the horrific performance of the 64bit .NET JIT). Hint: the registry keys will have been created in the WOW6432Node. If you want to use the standard locations and share between 64 and 32 bit applications then you need to look into reflection (which copies keys between the 32 and 64 bit registry trees) with RegEnableReflectionKey (although it's not entirely clear whether you need to enable or disable reflection to share keys - thankfully I haven't yet needed to experiment with this).

Update

If this wasn't all enough fun for you, under some circumstances you can end up with registry virtualization. This is where your registry keys end up in an entirely separate registry hive called VirtualStore under the root node.

You can find a reference on virtualization (which can also cause file locations to be virtualized - making them visible to some applications and invisible to others) on this page.

In our case deleting the VirtualStore restored sanity.

Most of the details in this entry only apply to 64 bit Windows Vista.


Posted by Fuzzyman on 2009-04-15 16:58:37 | |



Distributed Test System at Resolver Systems

There is quite a discussion going on on the Testing in Python mailing list. A lot of it was kicked off by Jesse Noller discussing the new distributed testing framework he wants to build.

Just for the record, here is a rough outline of how we do distributed testing over a network at Resolver Systems. It is a 'home-grown' system and so is fairly specific to our needs, but it works very well.

The master machine does a full binary build, sets up a new test run in the database (there can be multiple simultaneously), then pushes the binary build with test framework and the build guid to all the machines being used in the build (machines are specified by name in command line arguments or a config file when the build is started). The master introspects the build run (collects all the tests) and pushes a list of all tests by class name into the database.

When the zip file arrives on a slave machine a daemon unzips and deletes the original zipfile. Each slave then pulls the next five test classes out of the database and runs them in a subprocess. Each test method pushes the result (pass, failure, time taken for test, machine it was run on, build guid and traceback on failure) to the database. If the subprocess fails to report anything after a preset time (45 mins I think currently) then it kills the test process and reports the failure to the database. Performance tests typically run each test five times and push the times taken to a separate table so that we can monitor performance of our application separately.

The advantage of the client pulling tests is that if a slave machine dies we have a maximum of five test classes for that build that fail to run. It also automatically balances tests between machines without having to worry about whether a particular set of tests will take much longer than another set.

A web application allows us to view each build - number of tests left, number of passes, errors and failures. For errors and failures tracebacks can be viewed whilst the test run is still in progress. Builds with errors / failures are marked in red. Completed test runs with all passes are marked in green. Easily being able to see the total number of tests in a run makes it easy to see when tests are accidentally getting missed out.

A completed run emails the developers the results.

The web page for each build allows us to pull machines out whilst the tests are running. If a machine is stopped then it stops pulling tests from the database (but runs to completion those it already has).

Machines can be added or re-added from the command line.

We have a build farm (about six machines currently) typically running two continuous integration loops - SVN head and the branch for the last release. These run tests continuously - not just when new checkins are made.

This works very well for us, although we are continually tweaking the system. It is all built on unittest.

The system that Jesse Noller is proposing will have as its foundation a text based protocol (XML or YAML) for describing test results. These can be stored in a database or as flat files for analysis and reporting tools to build on top of.

For a test protocol representing results of test runs I would want the following fields:

  • Build UUID
  • Machine identifier
  • Test identifier: typically in the form package.module.Class.test_method (but a unique string anyway)
  • Time of test start
  • Time taken for test (useful for identifying slow running tests, slow downs or anomalies)
  • Result: PASS / FAIL / ERROR / SKIP
  • Traceback

Anything else? What about collecting standard out even if a test passes? Coverage information?

We sometimes have to kill wedged test processes and need to push an error result back. This can be hard to associate with an individual test, in which case we leave the test identifier blank.

Extra information (charts?) can be generated from this data. If there is a need to store additional information associated with an individual test then an additional 'information' field could be used to provide it.

A lot of the other discussion on the mailing list has been around changes and potential changes to the unittest module - changes that started in the PyCon sprint. I'll be doing a series of blog posts on these in the coming days.

Like this post? Digg it or Del.icio.us it.

Posted by Fuzzyman on 2009-04-11 16:55:48 | |

Categories: , Tags: ,


Essential Programming Skills: Reading and Writing

emoticon:dove As a programmer there are two basic skills vital to your productivity: how fast you can type and how fast you can read.

On typing, Steve Yegge said it best of course in Programming's Dirtiest Little Secret.

I often mock Mr. Tartley for being a hunter pecker, but he can really type quite fast with his two fat fingers. I taught myself to touch type with Mavis Beacon back when I was selling bricks and found it enormously freeing. Being able to type without having to look at the keyboard makes a massive difference.

There are a host of tools that will help you learn or practise touch typing. I've just discovered (via Miguel de Icaza) a fun web based one, that you can fire up at any time. You race against other players typing short passages from books, with visual cues when you make mistakes. It even lets you setup private games to race against a set of friends. My only criticism is that there isn't enough punctuation to really practise typing for programmers (programmer specific version anyone?):

The combination of competition, short doses and interesting passages make it fun, addictive and actually useful. My average WPM is 52 at the moment, but I reckon if I practise a few times a day I'll pick up speed.

The correlating skill essential for programmers needing to browse countless pages of documentation and information from blogs that may or may not be useful is speed reading. The following tool is great for practising, but I'm also finding it useful for quickly reading long passages of text.

It shows you the text a line at a time, moving the focused part quickly (at a speed you can configure and control from the keyboard) from left to right. This mirrors (unsurprisingly) the way I skim read blogs etc. The problem is that I often involuntarily skip passages whilst skim reading - this tool is good for practise but also helps me to read quickly without missing bits.

Like this post? Digg it or Del.icio.us it.

Posted by Fuzzyman on 2009-04-11 16:26:32 | |

Categories: , , Tags: , ,


Interface design, Usability and Kitchen Hobs

emoticon:scanner I'm very interested in application usability. I'm aware that for those who aren't computer experts they can be complex and bewildering. As someone who has been using using computers inside and out for years this is something that is easy to forget; that the metaphors on which a modern OS is built are not 'intuitive' to the average human. To use modern computers fluently requires a huge amount of background experience that only comes from time, and even better from taking apart machines and programs to understand what is really going on underneath the metaphors.

Poorly designed applications and user interfaces on top of this make computers painfully difficult. On the other hand computers and applications can be usable, and if the things we create don't make people's lives less painful then we're wasting our time. I'm also interested in programming language design, and I like this quote from Guido van Rossum: "a programming language is an HCI" (Human Computer Interface). Programming languages are compromises between the language that computers speak (an electromagnetic dance where even 1s and 0s are an abstraction for the sake of humans) and a language that we're capable of reading and writing.

User interface design isn't restricted to the world of computers. My Dad is a health and safety consultant, specialising in health and safety as it applies to computer controlled systems. This includes industrial plant and machinery, power stations, oil rigs and the like. One of the most common causes of accidents, including fatal accidents, is operator error. Behind the operator error is often a poorly designed user interface, making the system or machinery much harder to operate correctly under pressure: bad UIs kill!

Because of this my Dad has also been interested in interface design for many years. He tells me that some of the best studies in the field are based on child psychology and there are aspects of interface design that can be categorised as objectively good or bad. One thing that every child learns early in its development is the principle of one-to-one correspondence. In its basic form you can see this in young children learning to use their limbs; that signals they send from their brain result in a corresponding movement in their arms and legs is a fascinating voyage of discovery for young babies.

This principle of one-to-one correspondence is baked into us and applies to the HCI. When we move the mouse physically with our hands the pointer moves in exact correspondence; which makes it easy to learn.

One example of an astonishingly badly designed user interface is the kitchen hob. The most common design has four cooking rings arranged in a square pattern. If it had four control knobs arranged in the same pattern then everyone would know which knob corresponded to which ring intuitively. The most common design pattern for hobs lays the four knobs out in a straight line; violating one-to-one correspondence with the consequence that you have to look and work out which one you need to use (and if you're anything like me probably getting it wrong anyway). If the knobs were arranged in the same pattern as the rings then everyone would find them straightforward to operate; one-to-one correspondence at work.

Like this post? Digg it or Del.icio.us it.

Posted by Fuzzyman on 2009-03-08 14:43:47 | |

Categories: Tags: , ,


Hosted by Webfaction

Counter...