Python Programming, news on the Voidspace Python Projects and all things techie.
Windows Annoyance 2.37 E+17: Automatic Updates Dialog
Yet another Windows Annoyance with a solution. This solution comes from 4sysops: Disable restart after Windows Automatic Updates. It is another thing I have to do with every new Windows box, and I can never remember where to find the instructions.
Usually when Windows does an update it wants you to reboot the machine. It does give you a choice, but even if you select 'Restart Later' it will pop up the same dialog every few minutes until you do restart your machine. This is very annoying, especially when it pops up whilst you are doing something and you accidentally click 'Restart Now' and then lose whatever you were working on!
Luckily the solution is very simple, but it does itself require a reboot unfortunately.
- Click Start -> Run
- Enter gpedit.msc
- Go to Local Computer Policy -> Computer Configuration -> Administrative Templates -> Windows Components -> Windows Update
- Double-click on No auto-restart for scheduled Automatic Update installation
- Enable it
- Reboot the computer (!)
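For boxes without gpedit.msc, or when there are several machines to set up, the same policy can usually be applied through the registry. The sketch below assumes that this policy is stored as the DWORD value NoAutoRebootWithLoggedOnUsers under HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate\AU. That mapping is not from the 4sysops instructions, so check it before relying on it. The function name build_reg_file is made up for the example; it only produces the text of a .reg file and does not touch the registry itself.

```python
# Assumed registry location of the "No auto-restart" policy (verify first).
POLICY_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate\AU"
VALUE_NAME = "NoAutoRebootWithLoggedOnUsers"


def build_reg_file(enabled=True):
    """Return the text of a .reg file that enables or disables the policy."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[%s]" % POLICY_KEY,
        # dword values are written as eight hexadecimal digits
        '"%s"=dword:%08x' % (VALUE_NAME, 1 if enabled else 0),
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_reg_file())
```

Saving the output as a .reg file and importing it (with administrator rights) should have the same effect as enabling the setting in the policy editor.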
Excel, Big Numbers and Small Columns
Today at work I've been looking at (amongst other things), how Resolver should handle displaying numbers that are too wide for their column.
Excel (which we won't just be replicating) does a variety of different things, depending on what the situation is. As the range of things it does is slightly odd, I thought I would post it here. You never know, one day it might be helpful for someone...
- For large numbers (like one billion) Excel adjusts the width of the column automatically. It doesn't shrink the column again if the number is then reduced.
- For very large numbers (like one hundred billion) it displays them in exponential format - 1E+11
- Large numbers with decimal places - it truncates the number so that all of it except the decimal point onwards is visible.
- For ordinary numbers with very thin columns it just displays '##' rather than the number.
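The rules above can be sketched as a small Python function. This is only an illustration of the behaviour described in the list, not Excel's actual algorithm and not how Resolver does it. The name display_number and the idea of measuring the column width in characters are assumptions, the automatic column widening is not modelled, and the decimals are rounded rather than strictly truncated.

```python
def display_number(value, width):
    """Return text for value that fits in a column `width` characters wide."""
    if isinstance(value, float):
        # Fixed-point text without trailing zeros, e.g. 1234.5 -> '1234.5'
        plain = ("%f" % value).rstrip("0").rstrip(".")
    else:
        plain = str(value)
    if len(plain) <= width:
        return plain

    whole, dot, _ = plain.partition(".")
    if dot and len(whole) <= width:
        # Keep the whole part and as many decimal places as will fit.
        places = max(width - len(whole) - 1, 0)
        shortened = "%.*f" % (places, value)
        if len(shortened) <= width:
            return shortened

    # Fall back to exponential format, e.g. 100000000000 -> '1E+11'
    exponential = "%.0E" % value
    if len(exponential) <= width:
        return exponential

    # Nothing fits: fill the cell with hash marks.
    return "#" * width


if __name__ == "__main__":
    for number, column_width in [(100000000000, 8), (1234.5678, 6), (12345, 2)]:
        print(number, column_width, display_number(number, column_width))
```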
Slow Builds: Distributed Build Instead of the Pipeline
InfoQ recently published an article by Amr Elssamadisy, on how pipelined builds are a bad solution to the problems caused by slow builds in conjunction with continuous integration: Is Pipelined Continuous Integration a Good Idea?
Continuous integration is one of the practices of eXtreme Programming. The idea is that you have a machine (the continuous integration server) that monitors your central repository. When you check in, the CI server does a fresh check-out and runs your full build - which, needless to say, will include a comprehensive automated test suite.
Developers then get rapid feedback if they check in code that breaks the build. This becomes a problem when your full build process becomes slow. Either developers run the full build before a checkin, meaning that it takes a long time before they can check code in, or they run only a portion and wait for the CI server to warn them of breakage. When you have several teams (or pairs in full XP) working on code, by the time the warning arrives they will probably already be building on top of code that now needs to be reverted or fixed.
Some teams have attempted to solve this problem with a 'pipeline build', where tests are run in a series of layers - the pipeline. This of course only delays the pain of failure, and merely gives the illusion of allowing more frequent checkins.
The article also quotes Julian Simpson of the famously agile Thoughtworks:
The other aspect of the pipeline that I find troubling is this: by not forcing the developers to sit through the functional tests before they check in their code, you prevent their fantastic brains from reflecting on improving them. If people are feeling the pain of running them, they have a good motivation to fix them. Those tests ought to provide the biggest bang for your buck: unless you're careful you could be running poor tests dozens of times a day.
At Resolver, we have exactly this problem. A full build now takes about two and a half hours on a fast machine and nearly an hour longer on our slow integration server (deliberately slow so that we always know that Resolver runs on slower hardware). Of these two and a half hours, our unit tests take about twenty-five minutes. A lot of these are really more like functional tests and could be made a lot faster. Learning to test has been a voyage of discovery for everyone in the Resolver team, and a lot of the unit tests have a live GUI loop, don't patch out dependencies or do real file / database IO. (We have several thousand unit tests and a few hundred functional tests.)
But, a lot of the functional tests can't easily be made faster. The point of functional tests is to test that the application works with as close to real user input as possible. I don't think that we should be mocking up database interactions in these situations - we want to be damn sure that Resolver works with real databases with all their quirks and bugs. Additionally, some of the tests are necessarily slow. We test printing with a print driver (actually pdfcreator) that outputs to an image. We do performance testing with large datasets and we have to repeat each of these several times to remove outliers. We also have Selenium tests for our server which aren't exactly speedy.
Often we revisit a functional test when improving a feature and find ways to make it faster (often they test too much), but readability of the tests and maintainability of our test framework are much more important than the time taken by individual tests.
Unfortunately this does impact on development. Slow builds make it harder to make checkins, which in turn makes conflicts more likely and you sometimes end up with several 'trees' waiting for a checkin and occasionally code gets lost. So what to do?
Well, we've long had a vision for a 'distributed build', and our latest hire (Kamil Dworakowski) has finally put one in place. It overlays our current build system and basically works like this:
- You run a 'prepare distributed build' script on any machine on our network - giving it a list of machines that will be running the tests
- This does all the prebuild stuff (runs pylint, builds the installers, builds documentation etc)
- It then copies the subversion tree it is run from into the 'buildshare' directories of all the machines that will be running tests
- It then copies a full list of all the tests into a database table
You then run a 'start scheduled build' script on each of the slave machines (currently this is a manual step). Each slave machine then pulls batches of tests to run out of the table, marking them complete as it goes.
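The 'pull batches of tests out of a table' idea can be sketched in a few lines of Python. This is not Resolver's build code: the table layout, the function names and the use of sqlite3 are all assumptions made for the example, and a real multi-machine setup would need a shared database server with proper locking around the claim step.

```python
import sqlite3


def create_queue(conn, test_names):
    """Store the full list of tests; nothing is claimed or done yet."""
    conn.execute(
        "CREATE TABLE tests ("
        "name TEXT PRIMARY KEY, machine TEXT, done INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO tests (name) VALUES (?)", [(name,) for name in test_names]
    )
    conn.commit()


def claim_batch(conn, machine, size):
    """Claim up to `size` unclaimed tests for `machine` and return their names."""
    with conn:  # commits on success, rolls back on error
        rows = conn.execute(
            "SELECT name FROM tests WHERE machine IS NULL ORDER BY name LIMIT ?",
            (size,),
        ).fetchall()
        names = [row[0] for row in rows]
        conn.executemany(
            "UPDATE tests SET machine = ? WHERE name = ?",
            [(machine, name) for name in names],
        )
    return names


def mark_complete(conn, name):
    """Record that a claimed test has finished running."""
    with conn:
        conn.execute("UPDATE tests SET done = 1 WHERE name = ?", (name,))


def remaining(conn):
    """Number of tests that have not been marked complete."""
    return conn.execute("SELECT COUNT(*) FROM tests WHERE done = 0").fetchone()[0]
```

Each slave would loop: claim a batch, run the tests, mark each one complete, and stop when claim_batch returns an empty list.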
Our build process already has a webapp that shows us what builds are in process, how many tests have been run and allows us to look at tracebacks for test failures / errors even whilst the build is still in progress. The distributed build works within this system - all machines running the same scheduled build showing up as a single entry in the web app. The traceback messages include paths of course, so you can always tell on which machine individual failures happen. It also means that we can run several distributed builds without getting confused. When the build has finished, it is marked as complete with the total time taken and coloured green for success and red for failure.
At Resolver we practise pair programming, but we each have a desk and computer to call our own. This is down to the boss's experiences of doing XP in an IT shop that did pairing and so thought they only needed half the computers. Consequently all the developers felt homeless!
As a result though, at any one time we have several machines that aren't being used. Running a distributed build on three machines takes about an hour, which is a much more manageable time to get feedback from a code tree.
Perhaps this is just delaying the pain, and we really should be working on speeding up the build, but boy does it feel better.
We've recently hired two more people (though they haven't started yet), which means two extra machines. We are also 'obsoleting' two machines (two years old) next month - which means two machines that we can dedicate to the 'build farm'.
Class Property Decorator
Last week I posted an entry on creating a class property with a metaclass. In the comments, a reader called Frank Benkstein suggested an alternative that is much more readable. (I updated the entry to include his suggestion.) It also happens to be the most straightforward use case for direct use of the descriptor protocol that I've seen so far.
It is easy to generalise from his code to a new type, called class_property, that can be used as a decorator in the same way as property:
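    class class_property(object):
        def __init__(self, function):
            self._function = function
        def __get__(self, instance, owner):
            # Call the wrapped function with the class, not the instance
            return self._function(owner)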
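To show the decorator in use, here is a self-contained sketch. It assumes that __get__ just calls the wrapped function with the owner class; Widget and its attributes are made-up names for the example.

```python
class class_property(object):
    """Descriptor that computes its value from the class it is defined on."""

    def __init__(self, function):
        self._function = function

    def __get__(self, instance, owner):
        # `owner` is the class, whether access is via the class or an instance
        return self._function(owner)


class Widget(object):
    _registry = ["button", "label"]

    @class_property
    def count(cls):
        return len(cls._registry)


if __name__ == "__main__":
    print(Widget.count)    # accessed on the class
    print(Widget().count)  # accessed on an instance
```

Because the descriptor only defines __get__, assigning to Widget.count replaces the property instead of calling a setter. That limitation is one of the reasons for reaching for a metaclass.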
Why the DLR and What is Python Used For?
Since Microsoft announced Silverlight 1.1, which includes support for the Dynamic Language Runtime and therefore IronPython, I have given several talks to .NET audiences about IronPython. A couple of times I've also been asked about IronPython by .NET folks who are talking about Silverlight and want to know more about IronPython.
The two questions that seem to recur are: why did Microsoft include the DLR in Silverlight, and what sort of things does Python get used for?
As I'm interested in Python advocacy, I thought I would post my answers to those questions here. Firstly they might be useful to someone else who gets asked similar questions, and secondly you might be able to help me improve my answers.
Why the DLR?
- Why did Microsoft create the Dynamic Language Runtime, and in particular, why did they include it in Silverlight?
Dynamic languages are becoming more popular, and Microsoft were aware that developers who preferred these languages were migrating away from .NET.
Python is a particularly clean, expressive and concise language and makes an excellent development language.
The DLR provides three key benefits:
- Developer choice of language
- Ready made scripting languages for embedding in C# / VB.NET applications
- Dynamic languages are more suited to places where runtime class creation / code generation / introspection (reflection without the pain) is needed
Additionally they allow you to mix programming paradigms easily (procedural, functional, object oriented plus metaprogramming). Dynamic languages, without the restrictions of static typing, are also a lot easier to test. The current trend towards Test Driven Development is one of the reasons why some developers are choosing dynamic languages.
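As a small illustration of the runtime class creation and introspection mentioned above, here is a Python sketch. The names make_record_class and Point are invented for the example.

```python
def make_record_class(name, field_names):
    """Build a new class at runtime with one attribute per field name."""

    def __init__(self, **values):
        for field in field_names:
            setattr(self, field, values.get(field))

    # type(name, bases, namespace) creates a class object on the fly
    namespace = {"__init__": __init__, "fields": tuple(field_names)}
    return type(name, (object,), namespace)


Point = make_record_class("Point", ["x", "y"])
point = Point(x=1, y=2)

if __name__ == "__main__":
    # Introspection: ask the objects about themselves at runtime
    print(Point.__name__)
    print(point.fields)
    print([getattr(point, field) for field in point.fields])
```

The same freedom makes testing easier: a test can swap an attribute or method on an object for a stand-in without any extra interfaces.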
One of the reasons that Flash is not beloved by developers is that you are restricted to a single development language. This is a lesson that Microsoft have learned.
The DLR is especially suited to Silverlight because the web is fundamentally a text based system (even HTTP is essentially a text oriented protocol), and dynamic languages remain text even when deployed.
It is interesting to note that the next iteration of Visual Basic (VB 10) will be a dynamic language built on top of the DLR. This enables Microsoft to provide many of the dynamic language features that they felt developers were asking for.
What is Python Used For?
Python is a great general purpose language and so gets used for a wide variety of things. It also grew out of the Unix culture, which is reflected in some of its major uses.
Python is used for web development:
- YouTube is written almost entirely in Python
- Python is one of Google's '4 approved languages'. To quote Alex Martelli (a senior Google developer), they use 'Python where we can, C++ where we must'
- Trac, the popular online project management app, is written in Python
Games:
- The massively multiplayer online game EVE is almost entirely written in Python on the server side
- Civilization IV is scriptable in Python, and the developers liked it so much that large chunks of the game ended up written in Python as well
Animation and CGI:
- Most of Sony Imageworks' animation pipeline is written in Python. It is also used by Pixar and Industrial Light and Magic.
- It is the embedded scripting language for applications like Blender and Maya
GIS (Geographic Information Systems):
Python has become the 'scripting language of choice' for several major GIS applications (and some are starting to use IronPython).
System admin stuff:
- Gentoo (Linux) portage (package management) is written in Python
- Ubuntu and Pardus Linux distributions do everything they can in Python
Desktop applications - the original BitTorrent was (and is) written in Python.
Python has been used to implement two major distributed source code control systems: Bazaar and Mercurial.
It is now used internally in Microsoft - e.g. Microsoft Knowledge Tools.
Python gets used a lot in London banking (along with C# - so this is another area where IronPython is gaining ground).
Python gets used a lot in science, because it is an easy language for scientists to learn. Particularly for genetic research (because of its powerful string handling capabilities) and linguistic analysis (because of an especially powerful toolkit called the Natural Language Toolkit).
It also gets used in industry, for example Seagate automate their hard drive testing with Python.
If you can think of any other major areas where Python gets used, or improved wording for any of this, then comment away.
Blog Action Day: Looking Back from the Future
Today is Blog Action Day, with bloggers from all across the world blogging about the environment. This is me taking part.
This year was the bicentenary of the abolition of the slave trade in the UK. In 1807 the trade in slaves from Africa was banned, but it wasn't until 1833 that slavery was abolished completely throughout the British Empire.
It is easy to view the behaviour of our ancestors with disdain. History is bulging with the barbaric treatment of fellow humans by people who considered themselves enlightened and civilized. In the UK we particularly look back at the Victorian era and the way they treated the poor, placing them in workhouses, and also the appalling treatment of the mentally ill (the 'insane') in asylums. Thank goodness that things have changed, and that generally in Western society things aren't so bad.
But to the people of the day, this was the state of the art - reflecting their moral convictions and the capacity of society to provide welfare. The horror of the workhouses was still preferable to previous incarnations of English society that depended entirely on the kindness of strangers for the care of the poor. If this kindness was absent or withheld, then like much of the world today, poverty was brutally terminal.
With a focus on looking back at the brutishness of a bygone age, it is easy to wonder how history will view our society. We have iPhones and space travel, the internet and modern medicine, laws on human rights and a society that supposedly honours diversity - boy are we sophisticated and civilized compared to our predecessors. Is it possible that hundreds of years from now, people will read about our actions and inactions, and shudder at our ignorance and barbarity?
In the UK we imprison tens of thousands of people that our own doctors have diagnosed as being mentally ill. We know that prison doesn't work: most people who go into prison will be back again. Yet the prisons are overcrowded and the popular press calls for harsher sentences and the building of more prisons. This is not just inhumane, it is stupid. This situation is rarely the fault of individuals, but the fault of us - society.
In the US, hundreds of thousands of black young men are imprisoned. However remote they may feel that politics is, they are disenfranchised from any possible input. What barbaric society would do this?
Particularly in the UK you might hear the argument that society can't bear the cost of proper care of all the mentally ill who are in prison. The alternatives are prohibitively expensive and it is an extremely difficult problem to solve. This may well be true, but more to the point is that most individuals don't know and don't really care. How will future generations look back on this?
Even more so the current state of the environment. Scientifically it seems beyond doubt that man's (our - your and mine) activity is causing great damage to the earth. If nothing is done to change this then great harm will occur, and it won't necessarily be just our children and their children who suffer the consequences - things are happening now.
So how will future generations see this? Didn't they know? Didn't they care? Did no-one tell them, what did they think was happening? The fools, the selfish idiots... Will this be humanity's epitaph, or merely the way we as a culture are remembered?
I'm afraid that I'm not a very good green activist. I'm trying to find my way forward in doing less harm to the environment, consuming less and being more aware. I care greatly for the earth. All of man's finest achievements still pale into insignificance beside the unimaginable complexity and elegance of fragile life. Let's not be the ones who destroy it.
This work is licensed under a Creative Commons Attribution-Share Alike 2.0 License.