Python for .NET Programmers

An Introduction to IronPython

Note

This is a short introduction to the Python programming language for .NET programmers interested in IronPython. If you're completely new to IronPython you should first read An introduction to IronPython.

This article is the second in a series of articles on developing with IronPython. The other articles are:

A much more complete introduction to Python and IronPython is my book: IronPython in Action.

Introduction

IronPython is an implementation of the popular open source programming language Python for the .NET framework. IronPython is built on top of the Dynamic Language Runtime. This article is a quick guide to the Python programming language. It is aimed at .NET programmers but should be understandable by anyone with previous experience in an imperative programming language like Java.

Python has much in common with C#. The core object models are similar, as is much of the syntax; C# and VB.NET programmers should find it easy to learn the basics of Python. There are plenty of differences though, including some fundamental ones.

In this article we'll be looking at the following aspects of programming with Python:

  • why dynamic languages?
  • built in types
  • basic flow-control and exceptions
  • functions
  • classes
  • modules and importing
  • scripting, functional programming and metaprogramming

This article isn't a replacement for the full Python documentation or a comprehensive tutorial. See what next? for useful online references on learning Python.

Note

For learning Python a great reference is the Python tutorial. I've built an online, interactive version of the tutorial with IronPython and Silverlight: Try Python.

Dynamic Languages

Unlike the traditional programming languages for .NET, Python is dynamically typed. This means that you don't have to declare the types of your objects, as they are determined at runtime.

This is much more flexible: you use objects based on their behavior (called duck typing) rather than just their type. The cost is that much less can be determined at compile time, so you lose type safety. It is also harder for tools to deterministically tell you the type of objects and where they are used. Despite this, there is an awful lot that can be determined or inferred by static analysis of Python code.
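As a rough sketch of duck typing (the class and function names here are invented purely for illustration), the function below works with any object that has a speak method, even though the two classes share no base class or interface:

```python
# Hypothetical classes for illustration; they share no base class or interface.
class Duck(object):
    def speak(self):
        return 'Quack'


class Robot(object):
    def speak(self):
        return 'Beep'


def make_it_speak(thing):
    # No type check: any object with a speak() method will do.
    return thing.speak()


sounds = [make_it_speak(obj) for obj in (Duck(), Robot())]
# sounds == ['Quack', 'Beep']
```

Passing an object without a speak method would fail at runtime with an AttributeError rather than at compile time, which is the trade-off described above.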

Note

For a list of IDEs that support IronPython and their features, along with some standard Python development tools for things like code quality and refactoring, read the article: IronPython Tools and IDEs.

Before we look at the benefits of dynamic languages let's quickly look at the cost of static typing.

What does type safety buy?

Type safety does eliminate particular classes of errors.

For example, the compiler can assure you that when using the return value from your integer addition method, 1+1 always returns an integer.

But this is not sufficient to confirm that an application actually works.

In order to have confidence about this, the very best method known to today's computer science is automated tests: unit tests and acceptance tests.

Unit tests are always needed, and they are much more detailed about run-time behaviour than static type checking can ever be.

With tests in place confirming correct values, then checks of correct type are now redundant, and can safely be removed.

—paraphrased from Jay Fields, Ruby luminary, card shark.

In 5 years, we'll view compilation as the weakest form of unit testing.

—Stuart Halloway

In practice, the benefits of type safety turn out, unexpectedly, to be fairly minimal. Often overlooked, the costs of maintaining type safety turn out to be extremely high.

Dynamic Languages tend to be

  • Short on ceremony:
    • Good for education
    • Expose to power users for scripting
    • Small code base
  • Powerful and expressive
    • Can express powerful ideas simply and cleanly
    • Flexible - remain simple for disparate types of problem.

For example, the simplest "Hello World" in C#:

using System;

class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine("Hello, world");
    }
}

In Python:

print 'Hello, World'

Some of the ways that dynamic languages tend to be different from statically typed languages:

  • No need for explicit type declarations
  • First class and higher order functions instead of delegates
  • No need for generics, flexible container types instead
  • Protocols and duck-typing instead of compiler enforced interfaces
  • First class types, functions and namespaces and the ability to modify objects at runtime means that patterns like dependency injection and inversion of control aren't necessary just to make code testable (although they can be useful in their own right)
  • Easy introspection without the pain of reflection
  • Problems like covariance and contravariance and casting just disappear
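To illustrate the point about modifying objects at runtime for testing, here is a minimal sketch. The timestamp function is a made-up example; the technique is simply to swap a fake into a module namespace temporarily, so the code under test needs no special injection points:

```python
import time


def timestamp():
    # Hypothetical function under test: it depends directly on time.time.
    return 'Created at %d' % time.time()


original = time.time
time.time = lambda: 0  # replace the real function with a fake for the test
try:
    result = timestamp()
finally:
    time.time = original  # always restore the real function

# result == 'Created at 0'
```

In real test code this save-and-restore dance is usually handled by a mocking library, but the underlying mechanism is no more than the assignment shown here.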

Programming in dynamic languages, although superficially the same as programming in a statically typed language, involves thinking about programs in very different ways. An idiomatic C# programmer will tend to reason about programming in terms of types, which makes dynamic languages feel loose and unreliable. Programming in a dynamically typed language involves reasoning in terms of object behavior instead, which is why programmers used to dynamic languages find statically typed languages so restrictive. Of course these statements are broad generalisations but they go some way to explaining the sharp divide between those who strongly prefer one or the other.

Programs that can be written with a static type system are a subset of all possible programs. For some people this is enough.

—Gilad Bracha (one of the JVM architects)

Face it. The history of programming is one of gradual adoption of dynamic mechanisms.

—Patrick Logan

Multiple Programming Paradigms

Python (and dynamic languages in general) support multiple programming paradigms:

  • Interactive
  • Procedural
  • Functional (closures are very important)
  • Object Oriented
  • Metaprogramming

Python is an object oriented language - everything is an object. This doesn't mean that you have to use the object oriented style in your own programs. In fact you don't even need to use functions but can simply create 'scripts' that perform one off operations.

This is exemplified in the interactive interpreter, which as well as being very useful for demos can be a serious programming tool. I know of several .NET programmers who don't use IronPython in their production code but use the interpreter to explore and experiment with new assemblies. I also know a Python programmer (Python and SQL actually) who does much of his work at the interactive interpreter - he likes to suck data in, transform it, push it back out and then walk away.

First class functions and types, along with closures (lexical scoping), make the functional style of programming possible. Although Python isn't a pure-functional language, higher order functions and function factories are very common in Python code.
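A small sketch of both ideas, using made-up names: make_adder is a function factory whose inner function closes over amount, and apply_twice is a higher order function that takes another function as an argument:

```python
def make_adder(amount):
    # 'amount' is captured by the inner function (a closure).
    def adder(value):
        return value + amount
    return adder


def apply_twice(function, value):
    # Functions are first class objects, so they can be passed as arguments.
    return function(function(value))


add_five = make_adder(5)
result = add_five(10)             # 15
twice = apply_twice(add_five, 1)  # 11
```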

Facilities for metaprogramming include metaclasses, code generation, decorators and so on.

Learning Python

The core object model of Python is similar to imperative languages such as C#, Java, and VB.NET. If you’ve used any of these languages before, even if aspects of the syntax are unfamiliar, you’ll find learning Python easy. Python does differ from these languages in how it delimits blocks of code. Instead of using curly braces, Python uses indentation to mark blocks of code. Here’s a simple example using an if block:

if condition == True:
    do_something()

Python is case-sensitive and the comment symbol is #.

Python is fully object oriented, everything is an object, but you can use it to write procedural or functional style programs. There is no need to write a full object-oriented application if you are just creating scripts.

Basic Syntax

Python

Python basic syntax for functions, classes and modules:

import module
from othermodule import something

MODULE_LEVEL_CONSTANT = 3

def function(arg, arg2=None):
    while arg:
        arg -= 1
    return arg2


class ClassName(BaseClass):

    def __init__(self, arg):
        # constructor
        self.value = arg


instance = ClassName('foo')

In Python namespaces (modules) are created by individual files - the physical container (the file) is the same as the logical container (the namespace). Files can be organised as packages (groups of files) to group namespaces.

The naming of Python variables (constants, classes, members) follows conventions (module level constants are usually all caps, for example) but these are not enforced. We'll look more closely at some of the details of the syntax shortly.

Python Datatypes

Strings

a = 'single quoted'
b = "double quoted"
c = 'Normal\nEscaping \t rules'

d = """Triple quoted
spanning multiple lines"""
e = r"Raw string where backslashes \ are treated literally"

string = str(some_object)

CPython has two string types: str and unicode. IronPython only has Unicode strings, which is probably the biggest difference between IronPython and the standard C Python implementation. IronPython does lots of magic to let you treat strings as bytestrings for compatibility with CPython. I've seen few problems in practice because of this difference (and where there are problems it's a bug).

In Python 3 the C implementation also moves to Unicode only strings, which is a great improvement.

Numbers

On the .NET framework the Python integer is System.Int32 and the float is System.Double. Python auto-promotes overflowing integers to long integers. This means that you don't need to know the result of a calculation before performing it... Python also has a built-in complex number type.

a = 32
b = int('12')

c = 0.2
d = 12e32
e = float('13.6')

f = 10L
g = 1000 ** 1000
h = long(10)

i = 3 + 2j
j = complex(3, 2)

For decimals you have the choice of using System.Decimal (fast) or the Python decimal module (compatible with code written for CPython).
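As a brief sketch of why a decimal type matters, the following uses the standard library decimal module (the portable option; System.Decimal is only available under IronPython and isn't shown here). Binary floats can't represent 0.1 exactly, whereas Decimal values built from strings can:

```python
from decimal import Decimal

# Floats are binary approximations, so this comparison is False.
float_is_exact = (0.1 + 0.2 == 0.3)

# Decimals constructed from strings are exact, so this comparison is True.
total = Decimal('0.1') + Decimal('0.2')
decimal_is_exact = (total == Decimal('0.3'))
```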

Containers

The Python built-in container types are heterogeneous (can contain elements of different types). You can compose the different container types to create complex data-structures without needing to define custom types:

>>> points = {}
>>> points[(4, 8)] = {'height': 56.0, 'name': 'The middle'}

The built-in container types are:

  • the dictionary (hash table) can use any hashable object (in practice, immutable objects such as strings, numbers and tuples) as keys and store any object as values:

    >>> a = {}
    >>> a['key'] = 'value'
    >>> a
    {'key': 'value'}
    >>> print a['key']
    value
    >>> del a['key']
    >>> a
    {}
    >>> # Creating a dictionary from a list of key -> value pairs (as tuples)
    >>> b = dict([('key', 'value'), ('key2', 'value2')])
    >>> b
    {'key': 'value', 'key2': 'value2'}
    >>> b['key2']
    'value2'
    
  • the list is a mutable ordered sequence of members:

    >>> a = [1, 2, 3, 4]
    >>> a[0] = 0
    >>> print a[0] # first member
    0
    >>> print a[-1] # last member
    4
    >>> a
    [0, 2, 3, 4]
    >>> del a[0]
    >>> a
    [2, 3, 4]
    

    Lists have many useful methods for working with them:

    >>> a.remove(2)
    >>> a
    [3, 4]
    >>> a.append(3)
    >>> a
    [3, 4, 3]
    >>> a.insert(0, None)
    >>> a
    [None, 3, 4, 3]
    >>> a.sort() # in place sort
    >>> a.reverse() # in place reverse
    >>> a
    [4, 3, 3, None]
    

    Lists support slicing for fetching, setting and deleting members:

    >>> print a[2:] # from third member to the end
    [3, None]
    >>> print a[:2] # from the start up to (but not including) the third
    [4, 3]
    >>> print a[1:3] # second and third members
    [3, 3]
    >>> print a[::2] # extended slicing - from start to end, step 2 (skip alternate members)
    [4, 3]
    
  • the tuple is an immutable ordered sequence (they can be used as dictionary keys):

    >>> a = (1, 2, 3)
    >>> a
    (1, 2, 3)
    >>> b = tuple([3, 2, 1])
    >>> b
    (3, 2, 1)
    

    Tuples can be indexed and sliced in the same way as lists, but you can't add or remove members. Tuples and lists overload the add operator. Adding tuples produces a new tuple:

    >>> a[0]
    1
    >>> b[-1]
    1
    >>> a + b
    (1, 2, 3, 3, 2, 1)
    
  • the set is an unordered collection of unique members (there is no literal syntax for creating sets until Python 2.7 and Python 3):

    >>> a = set()
    >>> a.add(1)
    >>> a
    set([1])
    >>> b = set([1, 2, 3, 4])
    >>> b
    set([1, 2, 3, 4])
    >>> b.remove(3)
    >>> b
    set([1, 2, 4])
    >>> b.pop() # remove and return an arbitrary member
    1
    >>> b
    set([2, 4])
    

    Like dictionaries, sets can only be used to store hashable (immutable) objects.

Sets, lists and dictionaries all have useful methods. The Python documentation and the interactive interpreter will be your friend when working with the built-in container types. In addition there is Python syntax that translates under the hood into method calls. These are the Python 'magic methods' (protocol methods) we'll be looking at shortly:

>>> a = [1, 2, 3]
>>> 1 in a
True
>>> a == [1, 2, 3]
True
>>> len(a)
3
>>> bool(a)
True
>>> a + [4]
[1, 2, 3, 4]

The Python standard library provides a great deal of 'non built-in' Python data-structures like arrays, named tuples and double ended queues. Of course IronPython can use all the .NET data structures as well.

Booleans, None and truth testing

Python has three useful built-in objects: None, True and False. True and False are the booleans and None is the .NET null (except in Python it is, like everything else, a first class object).

In Python None, 0 (both int and float), the empty string (''), and empty containers all evaluate to False. By default everything else evaluates to True. Classes are able to customize this: if you implement your own containers or data-structures, it is normal for them to evaluate to False when empty and True otherwise.

Truth testing is done explicitly by calling bool (the Python boolean type) on an object or the result of an expression, or implicitly in if and while statements:

>>> bool(None)
False
>>> bool(object())
True
>>> bool([None]) # a non-empty list
True
>>> bool([[]]) # a list containing an empty list is not empty...
True
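One way for your own container to take part in truth testing is to define __len__, which bool falls back on when no explicit truth-value method is defined. The Basket class below is a made-up example and only a sketch:

```python
class Basket(object):
    # Hypothetical container used only for illustration.
    def __init__(self, items=()):
        self._items = list(items)

    def __len__(self):
        # bool() uses the length when the class defines no
        # explicit truth-value method: zero means False.
        return len(self._items)


empty = bool(Basket())          # False
full = bool(Basket(['apple']))  # True
```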

.NET methods on Python types

In IronPython strings are normal .NET strings. However, by default they have all the 'usual' Python methods but not the .NET methods you might expect:

>>> a = 'some string'
>>> a.title()
'Some String'
>>> a.ToUpper()
Traceback (most recent call last):
 ...
AttributeError: 'str' object has no attribute 'ToUpper'

This is to keep on the right side of the Python community who might object to the built-in Python types gaining a host of extra methods and properties. To please the .NET community who use IronPython but want to use the methods they are familiar with you can 'switch on' the .NET methods in a namespace by executing import clr. This signals to IronPython that the code in this namespace is interoperating with .NET and makes .NET methods visible:

>>> import clr
>>> 'some string'.ToUpper()
'SOME STRING'

You can see this at work by doing dir('a string') at the interactive interpreter both before and after importing the clr module.

Basic constructs

Conditionals:

if 1 > 2:
    print 'not possible'
else:
    print "that's better"

Iteration (looping):

for a in range(100):
    if a % 2: # % is the modulo operator
        continue
    print a
else:
    # only entered if the loop is
    # exited normally (without a break)
    pass

The while loop:

a = [1, 2, 3, 4]
while a:
    b = a.pop()

    if b > 3:
        break
else:
    # only entered if the loop doesn't break
    pass

Exception handling:

try:
    raise Exception('boom')
except:
    print 'an exception was raised'

try:
    raise Exception
except Exception, e:
    print 'Exception', e

try:
    raise KeyError('ouch!')
except (IOError, KeyError), e:
    # a bare raise re-raises the last exception
    raise
else:
    # entered if no exception is raised
    pass

try:
    pass
finally:
    print 'a finally block'

Functions

We've already seen the basic syntax for defining functions. An arbitrary number of positional arguments can be collected with the *args syntax (equivalent to the .NET params and collected as a tuple), and an arbitrary number of keyword arguments can be collected as a dictionary with the **kwargs syntax:

def function(*args, **kwargs):
    assert isinstance(args, tuple), 'args is always a tuple'
    assert isinstance(kwargs, dict), 'kwargs is always a dictionary'

assert is a statement - it can be used for runtime design by contract. isinstance is one of Python's built-in functions. If there is no explicit return statement then a function returns None.
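A short sketch of both points, with function names invented for the example: log has no return statement and so returns None, while average uses assert to check a precondition at runtime (note that assert statements are skipped when Python runs with the -O flag):

```python
def log(message):
    # No return statement, so calling this function returns None.
    pass


def average(values):
    # The assertion documents and enforces a precondition at runtime.
    assert len(values) > 0, 'values must not be empty'
    return sum(values) / float(len(values))


nothing = log('hello')     # None
mean = average([1, 2, 3])  # 2.0
```

Calling average([]) raises an AssertionError carrying the message given after the comma.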

*args and **kwargs can be used to call functions with multiple arguments / keyword arguments from a tuple or dictionary:

args = (1, 2, 3)
kwargs = {'one': 1, 'two': 2}  # keyword names must be strings

result = function(*args, **kwargs)

Python also has anonymous functions: lambdas. Lambda functions can take arguments but the body can only be an expression. When the lambda function is called the expression is evaluated and the result returned:

>>> anonymous = lambda arg: arg * 2
>>> anonymous(3)
6

The above lambda function is exactly equivalent to:

def anonymous(arg):
    return arg * 2

Classes

We've also seen the basic syntax for classes. Methods are created using the same syntax as normal function definitions (which is what they are) - but instance methods explicitly take self as the first parameter. self is the equivalent of this in C#: it is the current instance in use, and is passed in automatically as the first parameter:

class ClassName(object):

    def print_self(self):
        print self

>>> first = ClassName()
>>> first.print_self()
<__main__.ClassName object at 0x780d0>
>>> second = ClassName()
>>> second.print_self()
<__main__.ClassName object at 0x780f0>

Python doesn't have explicit access modifiers (no public, protected, etc). You'll be surprised by how little you miss them...

Unlike C#, Python doesn't have method overloading. If you need this you can collect arguments with the * and ** syntax and do dynamic dispatch on the type or number of arguments. There are external libraries that use decorators (explained shortly) to implement generic functions, a more general system of overloading.
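A minimal sketch of this kind of hand-rolled dispatch follows. The describe function is a made-up example, not a standard idiom from any library; it first dispatches on the number of arguments and then on the type of a single argument:

```python
def describe(*args):
    # Dispatch on the number of arguments...
    if len(args) == 0:
        return 'nothing'
    if len(args) > 1:
        return '%d things' % len(args)
    # ...and then on the type of the single argument.
    value = args[0]
    if isinstance(value, str):
        return 'a string'
    if isinstance(value, (int, float)):
        return 'a number'
    return 'something else'


results = [describe(), describe('a'), describe(1), describe(1, 2)]
# results == ['nothing', 'a string', 'a number', '2 things']
```

In practice, default argument values and duck typing remove most of the need for this, so explicit type checks like these tend to be rare in Python code.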

Class bodies can contain assignment statements. These create class attributes that are shared between all instances. Updating a class attribute will make the change visible to all instances:

class ClassName(object):
    attribute = 3

>>> first = ClassName()
>>> first.attribute
3
>>> second = ClassName()
>>> second.attribute
3
>>> ClassName.attribute = 6
>>> first.attribute
6
>>> second.attribute
6

You can even put arbitrary code in the body of the class. This can be useful for providing different implementations of methods on different platforms, but isn't a very common technique in practice:

import sys

class ClassName(object):

    if sys.platform == 'cli':
        def method(self):
            # implementation for IronPython
            pass
    else:
        def method(self):
            # implementation for other platforms
            pass

Inheritance works straightforwardly in Python, until you start using multiple inheritance that is:

class BaseClass(object):

    def method(self):
        print 'method on base'

    def other_method(self):
        print 'other_method on base'

class SomeClass(BaseClass):

    def other_method(self):
        print 'other_method on some class'
        BaseClass.other_method(self)

>>> something = SomeClass()
>>> something.method()
method on base
>>> something.other_method()
other_method on some class
other_method on base

The explicit self parameter makes it very easy for inherited methods to call up to the methods they override on a base class.

Multiple inheritance is perfectly valid in Python, but should not be overused. It is most often used to provide mixin functionality.
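As a sketch of the mixin style (all of the class names here are invented for illustration), a mixin is a small class that contributes one piece of behavior and is combined with a 'real' base class through multiple inheritance:

```python
class DescribeMixin(object):
    # Hypothetical mixin: adds a describe() method to any class
    # that stores its state as instance attributes.
    def describe(self):
        names = sorted(vars(self))
        pairs = ['%s=%r' % (name, getattr(self, name)) for name in names]
        return '%s(%s)' % (type(self).__name__, ', '.join(pairs))


class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y


class DescribedPoint(DescribeMixin, Point):
    # Behavior comes from both base classes; nothing new is needed here.
    pass


text = DescribedPoint(1, 2).describe()
# text == 'DescribedPoint(x=1, y=2)'
```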

Python magic methods

We've seen that the constructor for Python classes is the oddly named __init__ method. Methods that start and end with double-underscores (often shortened to 'dunder-method name' for convenience) are special methods, called the 'magic methods'. These implement Python protocols, roughly the equivalent of interfaces in C#, and are called for you by the interpreter rather than being explicitly called by the programmer (usually anyway).

There are lots of different protocols; you can find a good reference to all the Python magic methods in IronPython in Action or online on the book website. If you come across a protocol method that you don't recognise this is the place to turn.

Interfaces are used in C# to specify behavior of objects. For example, if a class implements the IDisposable interface, then you can provide a Dispose method to release resources used by your objects. .NET has a whole host of interfaces, and you can create new ones. If a class implements an interface, it provides a static target for the compiler to call whenever an operation provided by that interface is used in your code.

In Python, you don’t need to provide static targets for the compiler, and you can use the principle of duck typing. Many operations are provided through a kind-of-soft interface mechanism called protocols. This isn’t to say that formal interface specification is decried in Python—how could you use an API if you didn’t know what interface it exposed?—but, again, Python chooses not to enforce this in the language design.

The mapping and sequence protocols use the __getitem__ and __setitem__ methods:

class DataStore(object):
    def __init__(self):
        self._store = {}

    def __getitem__(self, name):
        return self._store[name]

    def __setitem__(self, name, value):
        self._store[name] = value

>>> store = DataStore()
>>> store['foo'] = 'bar'
>>> store['foo']
'bar'

The consequence of this is that Python programmers are much more interested in the behavior of objects than the type. It is common to see the specification of a function or a method that it takes a mapping type or a sequence type - meaning any object that implements these methods. This is the essence of duck-typing: if you know what methods / properties of objects are used, you can provide an alternative implementation. So long as the object quacks like a duck and walks like a duck Python will treat it like a duck...

There are lots of other standard protocol methods for containers. Here are a few of them:

class DataStore(object):
    def __init__(self):
        self._store = {}

    def __getitem__(self, name):
        return self._store[name]

    def __setitem__(self, name, value):
        self._store[name] = value

    def __len__(self):
        # number of elements
        return len(self._store)

    def __nonzero__(self):
        # boolean value
        return bool(self._store)

    def __iter__(self):
        # iteration
        return iter(self._store)

    def __contains__(self, name):
        # membership test
        return name in self._store

Other protocols include the rich comparison methods (__eq__, __lt__ and friends), the numeric methods (__add__, __sub__ and friends), plus a whole host more. Python supports operator overloading, and implementing protocol methods is how you do it.
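Here is a small sketch of operator overloading through protocol methods. The Vector class is a made-up example that supports + and the equality operators:

```python
class Vector(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Called for: self + other
        return Vector(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        # Called for: self == other
        return (self.x, self.y) == (other.x, other.y)

    def __ne__(self, other):
        # Python 2 does not derive != from __eq__, so define it explicitly.
        return not self == other


total = Vector(1, 2) + Vector(3, 4)
matches = (total == Vector(4, 6))  # True
```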

There are also some special magic methods we can use to customize attribute access, particularly useful for creating fluent interfaces. The three methods are __getattr__ for fetching attributes, __setattr__ for setting them and __delattr__ for deleting them. Here's an example using __getattr__ to build up messages:

class Fluid(object):

    def __init__(self):
        self._message = []

    def __getattr__(self, name):
        self._message.append(name)
        return self

    def __str__(self):
        # using the join method on string
        return ' '.join(self._message).strip()

>>> f = Fluid()
>>> f.hello.everyone.welcome.to.Python
<__main__.Fluid object at 0x782b0>
>>> str(f)
'hello everyone welcome to Python'

You've probably already used an API built in a similar way in JavaScript, where you can traverse the DOM as attributes on document. Creating APIs like this is very easy in dynamic languages. __getattr__ and friends have some complexity, so it is worth reading up on them if you want to use them. They're covered in IronPython in Action.

Although these methods allow us to implement custom behavior for attribute access, they aren't a replacement for properties which we'll look at next.

Properties & decorators

We haven't yet looked at properties in Python. Instead of having first class syntax for properties Python uses the 'descriptor protocol', along with normal Python syntax, to provide class methods, static methods and properties. The descriptor protocol is considered to be fairly deep Python 'magic'. It's actually fairly easy to understand but beyond the scope of this article. Let's look at how we use the built-in classmethod, staticmethod and property descriptors.

The easiest way to use these descriptors is as decorators. Decorators are a way of transforming functions and methods and are nominally similar to .NET attributes or Java annotations. They work due to the way that functions are first class objects in Python and are examples of higher order functions (functions that receive functions as arguments or return functions).

The syntax to create a static method in Python is:

class Static(object):

    @staticmethod
    def static_one():
        return 1

    @staticmethod
    def static_two():
        return 2

>>> Static.static_one()
1
>>> Static.static_two()
2

A class method is a method that receives the class as the first argument instead of the instance. They are often used to create alternative constructors. There isn't much need for static methods in Python.
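A class method used as an alternative constructor might look something like the sketch below. The Point class and its from_string method are invented for illustration:

```python
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @classmethod
    def from_string(cls, text):
        # cls is the class the method was called on, so
        # subclasses get instances of themselves.
        x, y = text.split(',')
        return cls(int(x), int(y))


class Point3D(Point):
    pass


point = Point.from_string('3,4')
other = Point3D.from_string('1,2')  # an instance of Point3D, not Point
```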

staticmethod behaves like a function (it is actually a type) that wraps the method. The @ syntax is pure syntactic sugar. The following two snippets of code are identical:

@decorator
def function():
    pass

def function():
    pass
function = decorator(function)

The decorator is called with the function it wraps as an argument. The function name is bound to whatever the decorator returns. Here's a decorator that checks arguments for null values (None):

def checkarguments(function):
    def decorated(*args):
        if None in args:
            raise TypeError("Invalid Argument")
        return function(*args)
    return decorated

class MyClass(object):

    @checkarguments
    def method(self, arg1, arg2):
        return arg1 + arg2

>>> instance = MyClass()
>>> instance.method(1, 2)
3
>>> instance.method(2, None)
Traceback (most recent call last):
 ...
TypeError: Invalid Argument

The checkarguments decorator takes a function as the argument. It creates a new inner function, which it returns. When this function is called it checks all the arguments and then calls the original function, which it still has a reference to through the closure. It uses the *args syntax to collect all the arguments the method is called with and then call the original method with the same arguments.

Python 2.6 introduces class decorators in addition to function / method decorators. They wrap classes in the same way that function decorators wrap functions, and can be used for many of the same purposes as metaclasses (for checking or transforming classes).
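A minimal sketch of a class decorator, with invented names: add_greeting receives the class object, adds a method to it, and returns it, so the name Person ends up bound to the modified class:

```python
def add_greeting(cls):
    # Receives the class, modifies it and returns it.
    def greet(self):
        return 'Hello from %s' % type(self).__name__
    cls.greet = greet
    return cls


@add_greeting
class Person(object):
    pass


message = Person().greet()
# message == 'Hello from Person'
```

As with function decorators, the @ line is shorthand for Person = add_greeting(Person) after the class body.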

We can use property as a decorator to create get only properties:

class SomeClass(object):

    @property
    def three(self):
        print 'Three fetched'
        return 3

>>> something = SomeClass()
>>> something.three
Three fetched
3

The old way of creating get and set properties (in Python you can also use properties to customize deletion but it is there for symmetry and not used very often) is less attractive. This is one area where the C# syntax is nicer than Python:

class SomeClass(object):

    def get_three(self):
        print 'Three fetched'
        return 3

    def set_three(self, value):
        if value != 3:
            raise ValueError('Three has to be equal to 3!')

    three = property(get_three, set_three)

>>> something = SomeClass()
>>> something.three
Three fetched
3
>>> something.three = 4
Traceback (most recent call last):
 ...
ValueError: Three has to be equal to 3!

Python 2.6 introduces a new technique that is slightly better looking:

class SomeClass(object):

    @property
    def three(self):
        return 3

    @three.setter
    def three(self, value):
        if value != 3:
            raise ValueError('Three has to be equal to 3!')

Still not as nice as C#, but better...

Modules and Packages

The last thing you want when programming is to have all your code contained in a single monolithic file. This makes it almost impossible to find anything. Ideally, you want to break your program down into small files containing only closely related classes or functionality. In Python, these are called modules.

Note

A module is a Python source file (a text file) whose name ends with .py. Objects (names) defined in a module can be imported and used elsewhere. They’re very different from .NET modules, which are partitions of assemblies.

The import statement has several different forms.

import module
from module import name1, name2
from module import name as anotherName
from module import *

Importing a module executes the code it contains and creates a module object. The names you’ve specified are then available from where you imported them.

If you use the first form, you receive a reference to the module object. Needless to say, these are first-class objects that you can pass around and access attributes on (including setting and deleting attributes). If a module defines a class SomeClass, then you can access it using module.SomeClass.

If you need access to only a few objects from the module, you can use the second form. It imports only the names you’ve specified from the module.

If a name you wish to import would clash with a name in your current namespace, you can use the third form. This imports the object you specify, but binds it to an alternative name.

The fourth form is the closest to the C# using directive. It imports all the names (except ones that start with an underscore) from the module into your namespace. In Python, this is generally frowned on. You may import names that clash with other names you’re using without realizing it; when reading your code, it’s not possible to see where names are defined.

Python allows you to group related modules together as a package. The structure of a Python package, with subpackages, is shown in the image below.

Python packages

Note

A package is a directory containing Python files and a file called __init__.py. A package can contain subpackages (directories), which also have an __init__.py. Directories and subdirectories must have names that are valid Python identifiers.

A package is a directory on the Python search path. Importing anything from the package will execute __init__.py and insert the resulting module into sys.modules under the package name. You can use __init__.py to customize what importing the package does, but it’s also common to leave it as an empty file and expose the package functionality via the modules in the package.

You import a module from a package using dot syntax.

import package.module
from package import module

Packages themselves may contain packages; these are subpackages. To access subpackages, you just need to use a few more dots.

import package.subpackage.module
from package.subpackage import module

Python also contains several built-in modules. You still need to import these to have access to them, but no code is executed when you do the import. We mention these because one of them is very important to understanding imports. This is the sys module.

When you import a module, the first thing that Python does is look inside sys.modules to see if the module has already been imported. sys.modules is a dictionary, keyed by module name, containing the module objects. If the module is already in sys.modules, then it will be fetched from there rather than re-executed. Importing a module (or name) from different places will always give you a reference to the same object.
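You can check this caching behavior for yourself. The sketch below uses the standard library os module as an arbitrary example, imports it under two names, and confirms that both names and the sys.modules entry refer to the same module object:

```python
import sys
import os           # imported under its usual name...
import os as again  # ...and again under a different name

same_object = os is again         # True: the second import reuses the module
cached = sys.modules['os'] is os  # True: the module object lives in sys.modules
```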

If the module hasn’t been imported yet, Python searches its path to look for a file named module.py. If it finds a Python file corresponding to the import, Python executes the file and creates the module object. If the module isn’t found, then an ImportError is raised.

As well as searching for a corresponding Python file, IronPython looks for a package directory, a built-in module, or .NET classes. You can even add import hooks to further customize the way imports work.

The list of paths that Python searches is stored in sys.path. This is a list of strings that always includes the directory of the main script that’s running. You can add (or remove) paths from this list if you want.
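To see the search path and package machinery working together, the sketch below builds a throwaway package in a temporary directory, adds that directory to sys.path and imports from it. The package name demo_package, the module greetings and the attribute initialised are all hypothetical names made up for the illustration.

```python
import os
import shutil
import sys
import tempfile

# Build a throwaway package on disk (all names here are hypothetical).
root = tempfile.mkdtemp()
package_dir = os.path.join(root, "demo_package")
os.mkdir(package_dir)

with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
    handle.write("initialised = True\n")
with open(os.path.join(package_dir, "greetings.py"), "w") as handle:
    handle.write("def hello():\n    return 'hello from the package'\n")

# Make the directory that contains the package importable.
sys.path.insert(0, root)

from demo_package import greetings

message = greetings.hello()

# Importing the module executed __init__.py and registered both entries.
assert sys.modules["demo_package"].initialised
assert sys.modules["demo_package.greetings"] is greetings

# A module that cannot be found raises ImportError.
try:
    import demo_package.missing
except ImportError:
    missing_found = False
else:
    missing_found = True

# Tidy up: restore the search path and delete the temporary files.
sys.path.remove(root)
shutil.rmtree(root)
```

Inserting at the front of sys.path means the new directory is searched first; appending would let earlier entries win if they contain a module with the same name.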

Some Python files can be used both as libraries, to be imported from, and as scripts that provide functionality when they’re executed directly. For example, consider a library that provides routines for converting files from one format to another. Programs may wish to import these functions and classes for use within an application, but the library itself might be capable of acting as a command-line utility for converting files.

In this case, the code needs to know whether it’s running as the main script or has been imported from somewhere else. You can do this by checking the value of the variable __name__. This is normally set to the current module name unless the script is running as the main script, in which case its name will be __main__.

def main():
    "docstring"
    # code to execute functionality
    # when run as a script

if __name__ == '__main__':
    main()

This segment of code will only call the function main if run as the main script and not if imported.

Other language features

Python has lots of other language features that make it a pleasure to work with. These features include:

  • tuple unpacking

    a, b = (1, 2)
    a, b = get_tuple()
    
    for a, b, (c, d) in some_iterator:
        pass
    
  • list comprehensions and generator expressions. These allow you to combine a loop and a filter in a single expression (similar to LINQ over objects).

    >>> # list comprehensions are eager
    >>> a = [value ** 2 for value in some_iterator if value > minimum]
    >>> # generator expressions are lazy
    >>> a = (value ** 2 for value in some_iterator if value > minimum)
    >>> a
    <generator object at 0x77be8>
    
  • iterators and generators (iterators are implemented with the __iter__ and next protocol methods, whilst Python's yield is similar to C#'s yield return but with added capabilities).

  • the with statement (similar to the C# using statement but able to detect and optionally handle exceptional exits)

  • ternary expressions (unlike in C#, the condition sits in the middle and is evaluated first. If it is true then the left hand expression is evaluated and returned, otherwise the right hand expression is evaluated and returned.)

    a = 1 if x > 3 else None
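Two of the features in the list above, generators and the with statement, have no code attached, so here is a small sketch of both. The generator countdown and the context manager Suppress are hypothetical names invented for the example. The built-in next function is used to advance the generator because it is spelt the same way in every Python version from 2.6 onwards.

```python
def countdown(start):
    # A generator: each yield hands back one value and pauses the function.
    while start > 0:
        yield start
        start -= 1


class Suppress(object):
    # A hypothetical context manager that swallows one kind of exception.
    def __init__(self, exception_type):
        self.exception_type = exception_type
        self.caught = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # exc_type is None when the block exits normally.
        if exc_type is not None and issubclass(exc_type, self.exception_type):
            self.caught = exc_value
            return True   # a true value tells Python the exception was handled
        return False      # anything else propagates as usual


numbers = countdown(3)
first = next(numbers)        # 3
remaining = list(numbers)    # [2, 1]

with Suppress(ZeroDivisionError) as guard:
    1 / 0                    # raises, but __exit__ handles it
```

After the with block, guard.caught holds the ZeroDivisionError instance; this is the 'detect and optionally handle exceptional exits' capability that the C# using statement lacks.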
    

Programming Paradigms

Scripting

Python is sometimes categorized as a 'scripting language', implying it is only suitable for scripting tasks. Whilst that certainly isn't true, Python does make an excellent scripting language.

If you want to write a script to automate a regular task, you aren’t forced to write an object-oriented application; you aren’t even forced to write functions if the task at hand doesn’t call for them. This next listing is a script for a typical admin task of clearing out the temp folder of files that haven’t been modified for more than seven days.

import os, stat
from datetime import datetime, timedelta

tempdir = os.environ["TEMP"]
max_age = datetime.now() - timedelta(7)

for filename in os.listdir(tempdir):
    path = os.path.join(tempdir, filename)
    if os.path.isdir(path):
        continue
    date_stamp = os.stat(path).st_mtime
    mtime = datetime.fromtimestamp(date_stamp)
    if mtime < max_age:
        mode = os.stat(path).st_mode
        os.chmod(path, mode | stat.S_IWRITE)
        os.remove(path)

Python has a rich tradition of being used for shell scripting, particularly on the Linux platform.

Procedural

The code above works fine, but it runs whenever the script is executed and so isn't reusable. We can make it more useful by refactoring it into functions. If we have an if __name__ == '__main__' block then the Python file retains the same behavior when executed as a script but also behaves as a module that can be imported.

import os, stat
from datetime import datetime, timedelta

tempdir = os.environ["TEMP"]
max_age = datetime.now() - timedelta(7)


def delete_old_files_in_directory(directory):
    for filename in os.listdir(directory):
        path = os.path.join(directory, filename)
        if os.path.isdir(path):
            continue
        delete_old_file(path)

def delete_old_file(path, max_age=max_age):
    date_stamp = os.stat(path).st_mtime
    mtime = datetime.fromtimestamp(date_stamp)
    if mtime < max_age:
        mode = os.stat(path).st_mode
        os.chmod(path, mode | stat.S_IWRITE)
        os.remove(path)


if __name__ == '__main__':
    delete_old_files_in_directory(tempdir)
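One payoff of the refactoring is that a single function can be exercised on its own, without touching the real temp folder. The sketch below is illustrative only: it copies delete_old_file rather than importing it, passes the cutoff explicitly, and uses made-up file names in a temporary directory.

```python
import os
import stat
import tempfile
import time
from datetime import datetime, timedelta


def delete_old_file(path, max_age):
    # Same logic as the listing above, with the cutoff passed in explicitly.
    mtime = datetime.fromtimestamp(os.stat(path).st_mtime)
    if mtime < max_age:
        mode = os.stat(path).st_mode
        os.chmod(path, mode | stat.S_IWRITE)
        os.remove(path)


# Create two empty files in a scratch directory (hypothetical names).
directory = tempfile.mkdtemp()
old_path = os.path.join(directory, "old.tmp")
new_path = os.path.join(directory, "new.tmp")
for path in (old_path, new_path):
    open(path, "w").close()

# Pretend one of them was last modified ten days ago.
ten_days_ago = time.time() - 10 * 24 * 60 * 60
os.utime(old_path, (ten_days_ago, ten_days_ago))

cutoff = datetime.now() - timedelta(7)
for path in (old_path, new_path):
    delete_old_file(path, cutoff)

old_exists = os.path.exists(old_path)   # the stale file has been removed
new_exists = os.path.exists(new_path)   # the fresh file is untouched

# Tidy up the scratch directory.
os.remove(new_path)
os.rmdir(directory)
```

Note that in the listing above the default value of max_age is computed once, when the module is first imported. A long-running program that imports the module may prefer to pass a fresh cutoff on each call, as this sketch does.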

Functional

Functions are first class objects. First class functions, in combination with closures, make higher-order functions (functions that take or return functions) common in Python.
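As a small illustration of functions being passed around like any other object, here is a sketch with one hand-written higher-order function and one from the built-ins. The names apply_twice and increment are invented for the example.

```python
def apply_twice(func, value):
    # A higher-order function: it receives another function as an argument.
    return func(func(value))


def increment(number):
    return number + 1


result = apply_twice(increment, 5)    # 7

# Built-ins accept functions too: sorted uses len to compute each sort key.
words = ["pear", "fig", "banana"]
by_length = sorted(words, key=len)    # ['fig', 'pear', 'banana']
```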

Closures are a fundamental concept in functional programming. A closure is a function together with the scope it was defined in. Functions have access to all the variables in their enclosing scope, and when you create a function it is said to 'close over' the variables from that scope that it uses. Here's a simple closure:

>>> def f():
...     a = 1
...     def inner():
...         print a
...     return inner
...
>>> function = f()
>>> function()
1

The inner function has access to ('closes over') the variable a defined in its enclosing scope. When f is called it returns the inner function. When the inner function is called it prints the value of the variable a from the scope it was defined in.

Parameters that are passed into a function become local variables within the scope of the function. We can use this to create function factories based on the parameters we pass in. Every time a function is called a new scope is created, so our factories can be called multiple times with different values.

def makeAdder(x):
    def adder(y):
        return x + y
    return adder

>>> add3 = makeAdder(3)
>>> add3(5)
8
>>> add2 = makeAdder(2)
>>> add2(2)
4

In makeAdder we bind the inner function to the argument (x) passed in. makeAdder returns a new function that takes a single argument (y). When this new function is called it returns the result of adding the new argument to the original value of x when it was created.

The generalisation of this is called partial application. In the next example the function partial takes two arguments: a function of two arguments (func) and a value (x) for that function's first argument. partial returns a new function with the first argument already bound to x.

The returned function (called inner) takes one argument (y), and when called it calls the original function with x and y. This sounds more complicated than it is. We can rewrite makeAdder to use it:

def add(x, y):
    return x + y

def partial(func, x):
    def inner(y):
        return func(x, y)
    return inner

>>> add2 = partial(add, 2)
>>> add2(3)
5
>>> add1 = partial(add, 1)
>>> add1(1)
2

These are examples of a common pattern, the 'factory function': a function that returns a function specialised on the input parameters passed to it.
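The hand-written partial above only handles two-argument functions. The standard library provides a general version, functools.partial, which can bind any number of positional or keyword arguments. A minimal sketch (the power function is just an illustration):

```python
from functools import partial


def add(x, y):
    return x + y


def power(base, exponent):
    return base ** exponent


# Bind the first positional argument, as in the hand-written version.
add2 = partial(add, 2)
five = add2(3)

# Keyword arguments can be bound as well.
square = partial(power, exponent=2)
nine = square(3)
```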

An extension of this is the 'class factory', which is a function that returns a class specialized on its input. We saw an example of this from the Python standard library in the Introduction to IronPython article - in the form of namedtuple.
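As a quick reminder of what a class factory looks like in use, the sketch below calls namedtuple and then defines a small hand-written factory alongside it. The names Point, make_scaled_class and Scaled are made up for the illustration.

```python
from collections import namedtuple

# namedtuple returns a brand new class built from the field names we pass in.
Point = namedtuple("Point", "x y")
corner = Point(x=3, y=4)


def make_scaled_class(factor):
    # A hypothetical class factory: the returned class closes over factor.
    class Scaled(object):
        def __init__(self, value):
            self.value = value * factor
    return Scaled


Doubled = make_scaled_class(2)
ten = Doubled(5).value
```

Instances of Point behave like ordinary tuples (corner == (3, 4)) but also expose their fields by name (corner.x).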

Metaprogramming

The most basic form of metaprogramming is code-generation and execution at runtime, which in Python means exec and eval. eval is for evaluating expressions and returning a result:

>>> a = eval("1 + 2")
>>> a
3

For executing statements we can use the exec statement. We can use a dictionary as the namespace the code is executed in. If the code creates variables, functions or classes then they will be accessible in the dictionary after execing:

>>> namespace = {}
>>> code = """
... def function():
...     print 'w00t!'
... a = 3"""
>>>
>>> exec code in namespace
>>> namespace['a']
3
>>> function = namespace['function']
>>> function()
w00t!
>>>

Code generation is a bit of a blunt instrument when it comes to metaprogramming. A more common way of metaprogramming in Python is with metaclasses.

Metaclasses are a particularly interesting feature of Python. Classes are first class objects in Python, and like all objects they have a type. Classes are instances of their metaclass, which defines some of the ways they behave. Metaclasses are seen as advanced Python, but although the use cases for implementing them yourself are rare, the mechanisms involved are easy to understand. For a good introduction to metaclasses read: Metaclasses in five minutes.
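The claim that classes are instances of their metaclass can be checked directly. In the sketch below, Registry is a hypothetical metaclass that records the name of every class it creates. The class is built by calling the metaclass rather than with a class statement, because the class-statement spelling differs between versions (a __metaclass__ attribute in the class body in Python 2, a metaclass keyword in Python 3) whereas calling the metaclass works in both.

```python
class Registry(type):
    # A hypothetical metaclass that records every class it creates.
    created = []

    def __new__(meta, name, bases, namespace):
        cls = super(Registry, meta).__new__(meta, name, bases, namespace)
        meta.created.append(name)
        return cls


class Plain(object):
    pass


# Ordinary classes are instances of the built-in metaclass, type.
assert type(Plain) is type

# Calling a metaclass with (name, bases, namespace) builds a new class.
Widget = Registry("Widget", (object,), {"size": 3})

assert type(Widget) is Registry
assert isinstance(Widget, type)   # still a class like any other
widget_size = Widget().size
```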

What Next?

Python is a full programming language, and although it is very easy to learn the basics, it can take time to become an idiomatic Python programmer. Fortunately there are many free online resources to help. Here are a few of the best ones:

The next article in this series is about choosing an IDE or editor for developing with IronPython and some of the standard tools (like debuggers, code quality checkers and so on) that are available to you:

For tutorials and examples of working with IronPython, try these resources:


Note

The early part of this tutorial draws from Pumping Iron, a presentation on IronPython originally by Harry Pierson and expanded on by Jonathan Hartley.

Last edited Fri Nov 27 18:32:35 2009.
