Introduction to unittest

Starting Testing with Python

Note

In Python 2.7 and 3.2 a whole bunch of improvements to unittest will arrive. The major changes include new assert methods, clean up functions, assertRaises as a context manager, new command line features, test discovery and the load_tests protocol. unittest2 is a backport of the new features (and tests) to work with Python 2.4, 2.5 & 2.6. See: unittest2: improvements to the unittest module

Introduction

As a dynamic language Python is substantially easier to test than other languages, which means that there is absolutely no excuse for not having good tests for your Python projects.

This article is about testing in Python, mainly using the Python standard library testing framework unittest. Testing is an important subject in the Python world and there are a huge number of different testing libraries and tools available. You can find a good collection of some of the popular libraries in the Python Testing Tool Taxonomy.

Testing

There are many different ways of categorizing tests, and many names for subtly different styles of testing. Broadly speaking, the three categories of tests are as follows:

  • Unit tests
  • Functional tests (black box tests, acceptance tests, integration tests)
  • Regression tests

Unit tests are for testing components of your code, usually individual classes or functions. The elements under test should be testable in isolation from other parts, which means eliminating dependencies. It isn't always obvious how to do this, but there are ways of handling these dependencies within your tests. Dependencies that particularly need to be managed include cases where your tests need to access external resources like databases or the filesystem. As well as looking at the basics of setting up a test framework we'll also be looking at some of the techniques you can use to control dependencies (mock objects).

Functional tests are higher-level tests that drive your application from the outside. This can be done with automation tools, by your test framework, or by providing hooks within your application. Functional tests mimic user actions and test that specific input produces the right output. As well as testing the individual units of code that the tests exercise, they also check that all the parts are wired together correctly - something that unit testing alone doesn’t achieve. For some ideas about functional testing techniques with Python see my article Functional Testing of GUI Applications.

Regression testing checks that bugs you’ve fixed don’t recur. Regression tests are basically unit tests, but your motivation for writing them is different. Once you’ve identified and fixed a bug, the regression test guarantees that it doesn’t come back.

The easiest way to start with testing in Python is to use the standard library module unittest.

The unittest Module

unittest has its origins in the Java testing framework JUnit, which itself was a port of the Smalltalk testing framework SUnit created by Kent Beck. As it is part of the xUnit family unittest is sometimes known as pyUnit.

unittest is an object oriented framework based around test fixtures. In unittest the test fixture is the TestCase class, and its basic usage is very simple:

import unittest

class MyTest(unittest.TestCase):

    def testMethod(self):
        self.assertEqual(1 + 2, 3, "1 + 2 not equal to 3")


if __name__ == '__main__':
    unittest.main()

You create a new test fixture by subclassing TestCase and defining test methods whose names start with test. The test methods perform actions with your production classes and call assert methods to verify the expected behavior and results.

In large test frameworks it is common to subclass TestCase and provide methods useful for testing your specific project. Your test modules will then subclass your custom test fixture rather than directly inheriting from unittest.TestCase.

The block at the end of the example above calls unittest.main() when the test module is executed directly from the command line:

python test_something.py

This executes all the tests in the test module reporting failures and errors. Test passes are shown as a '.', failures with an 'F' and errors with an 'E'.

The output of running unittests from the command line

To run a test suite consisting of several test modules they must be collected before they can be run.

Loaders, runners and all that stuff

There are a whole bunch of other classes in unittest; test runners, test loaders, test suites, test results and all that stuff. My book IronPython in Action has a more detailed description of how they wire together.

Code to execute tests in multiple test modules might look like this:

import unittest

import test_something
import test_something2
import test_something3

loader = unittest.TestLoader()

suite = loader.loadTestsFromModule(test_something)
suite.addTests(loader.loadTestsFromModule(test_something2))
suite.addTests(loader.loadTestsFromModule(test_something3))

runner = unittest.TextTestRunner(verbosity=2)
result = runner.run(suite)

The code above imports all the test modules separately (test_something, test_something2...) and turns them into a test suite before executing them with a runner. The image below shows the interactions between the classes.

Class relationship diagram from IronPython in Action

All of these classes can be subclassed to customize their behavior. Fortunately there is also a simpler way of collecting and running all the tests in a project.

Automatic test discovery

An easier way of running all the tests in a project is to use automatic test discovery. This is a feature that has been in alternative Python testing frameworks, such as nose and py.test, for a long time. Test discovery has finally been added to unittest in what will become Python 2.7 and Python 3.2. The test discovery has been backported as a separate module that can be used with Python 2.4 or more recent, including IronPython.

When you run discover.py from the command line it searches from the current directory, recursing into Python packages that it finds, running all the test modules that it is able to import. The basic way of running test discovery is from the command line with the current directory at the top level of the project:

python discover.py

Or if the discover module is on your default module path (either in a directory pointed to by the PYTHONPATH environment variable or a directory added to the path by site.py) then you can execute it with:

python -m discover

discover identifies test modules as importable files (inside a Python package) matching the pattern 'test*.py'. You can configure the pattern, and options like the directory discover starts its search in, with command line parameters:

> python -m discover -h
Usage: discover.py [options]

Options:
  -h, --help            show this help message and exit
  -v, --verbose         Verbose output
  -s START, --start-directory=START
                        Directory to start discovery ('.' default)
  -p PATTERN, --pattern=PATTERN
                        Pattern to match tests ('test*.py' default)
  -t TOP, --top-level-directory=TOP
                        Top level directory of project (defaults to start
                        directory)

If you want to build a more complex test framework, perhaps with a custom test runner that pushes results to a database, you can still use discovery by importing and using the DiscoveringTestLoader which is a subclass of the standard unittest TestLoader.
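The API of the backported DiscoveringTestLoader isn't shown in this article, so the sketch below uses the discover method that the standard TestLoader has in Python 2.7 / 3.2 and later. It assumes Python 3, and the temporary directory and the test_discovered_sample.py module are invented purely so that the example is self-contained.

```python
import io
import os
import tempfile
import unittest

# Build a throwaway project directory containing one test module.
project_dir = tempfile.mkdtemp()
with open(os.path.join(project_dir, "test_discovered_sample.py"), "w") as handle:
    handle.write(
        "import unittest\n"
        "\n"
        "class SampleTest(unittest.TestCase):\n"
        "    def testAddition(self):\n"
        "        self.assertEqual(1 + 2, 3)\n"
    )

# discover() collects every module matching the pattern into one suite.
loader = unittest.TestLoader()
suite = loader.discover(project_dir, pattern="test*.py")

# Any runner can execute the suite; a custom runner could store
# the results somewhere else, such as a database.
runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=2)
result = runner.run(suite)
print(result.testsRun, result.wasSuccessful())
```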

The assert methods

We've looked at some of the plumbing behind creating a test framework for a project. Now let's look at the different assert methods available on the TestCase class. These methods allow you to make different kinds of assertions about the behaviour of your objects.

The four most common assert methods come in two pairs, with a positive and a negative variant: assertTrue and assertFalse, assertEqual and assertNotEqual.

import unittest

from mymodule import MyClass

class MyTest(unittest.TestCase):

    def testTrue(self):
        myclass = MyClass()

        try:
            result = myclass.method()
            self.assertTrue(result)

        finally:
            myclass.close()


    def testFalse(self):
        myclass = MyClass()

        try:
            result = myclass.anotherMethod()
            self.assertFalse(result)

        finally:
            myclass.close()


    def testEqual(self):
        myclass = MyClass()

        try:
            first = myclass.methodOne()
            second = myclass.methodTwo()

            self.assertEqual(first, second)

        finally:
            myclass.close()


    def testNotEqual(self):
        myclass = MyClass()

        try:
            first = myclass.methodOne()
            third = myclass.methodThree()

            self.assertNotEqual(first, third)

        finally:
            myclass.close()

These methods should all be self-explanatory. They provide the basic building blocks for your test infrastructure. In addition to these four asserts there are three more. The first two of these are assertAlmostEqual and assertNotAlmostEqual for comparing floats. You specify the number of decimal places to compare them to:

first = 3.1
second = 3.2

# this will pass
self.assertAlmostEqual(first, second, 0)

# this will fail
self.assertAlmostEqual(first, second, 1)

In practice I don't find the number of decimal places to be a fine enough granularity for comparison. Inevitably when comparing floats I actually compare against a delta:

first = 3.1
second = 3.2
delta = 0.2

difference = abs(first - second)
self.assertTrue(difference < delta,
                "difference: %s is not less than %s" % (difference, delta))

The final assert method is a bit more useful. Often when testing an API you need to test how it behaves under error conditions. For example, you may want to test that a method raises a specific type of exception when given invalid input. The assertRaises method is how we test for this. It takes an exception type as the first argument, followed by a callable (usually a function) along with any arguments it takes. The assert method calls the function and the assert passes if an exception of the correct type is raised. If no exception is raised the assert fails, and if a non-matching exception is raised then it is not caught and the test fails with an error.

def adder(a, b):
    return a + b

self.assertRaises(TypeError, adder, 33, 'a string')

In the version of unittest that will be in Python 2.7 / 3.2 several new and useful assert methods have been added. The Python documentation for these versions has the details.
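As a taste of those additions, here is a short sketch using a few of them: assertRaises as a context manager, the delta keyword of assertAlmostEqual (which replaces the manual delta comparison), assertIn and assertIsNone. It assumes Python 3; the class name NewAssertsTest is made up for illustration, and the suite is run programmatically so that the example is self-contained.

```python
import io
import unittest


def adder(a, b):
    return a + b


class NewAssertsTest(unittest.TestCase):

    def testRaisesAsContextManager(self):
        # The assert passes only if the block raises TypeError.
        with self.assertRaises(TypeError) as context:
            adder(33, 'a string')
        # The caught exception is available for further checks.
        self.assertIsInstance(context.exception, TypeError)

    def testDelta(self):
        # delta replaces the manual abs(first - second) < delta check.
        self.assertAlmostEqual(3.1, 3.2, delta=0.2)

    def testMembership(self):
        self.assertIn(3, [1, 2, 3])
        self.assertIsNone(None)


suite = unittest.TestLoader().loadTestsFromTestCase(NewAssertsTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.testsRun, result.wasSuccessful())
```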

The assert statement

unittest assertions behave like the Python assert statement: a failed assertion raises an AssertionError. The assert statement takes an expression and raises an AssertionError if the expression evaluates to False. The assert in the first example above could be written as:

assert 1 + 2 == 3, "1 + 2 not equal to 3"

There are two reasons we use assert methods rather than plain asserts in our tests. The assert will fail if the expression evaluates to False, but the only error message we get is the message we provide to the assert statement:

>>> assert a == b, "a != b"
Traceback (most recent call last):
  ...
AssertionError: a != b

If we use the assert methods then the failure message can include more useful information, especially about the objects being compared. The second reason to use assert methods is that assert statements are disabled (not executed) when Python is run with the -O or -OO command line arguments (optimized mode). assert statements can be used to verify conditions in your code; runtime design by contract.

setUp and tearDown

I'm sure you noticed that in the earlier examples all the test methods had some code in common. They all instantiated MyClass and closed the instance when the test completed. This not only violates DRY (Don't Repeat Yourself) but is tedious and error prone (bad things may happen if you forget to close the instance). Those of you used to testing frameworks in other languages will not be surprised to hear that unittest has methods called setUp and tearDown to deal with these situations.

If your test cases define a setUp method it will be called before every test. If there is an exception raised in the setUp then the appropriate error or failure will be recorded and the test method will not be run.

setUp is particularly useful for functional / integration tests where setting up fixtures means establishing a lot of state. In the case of our earlier tests we can rewrite using setUp:

import unittest

from mymodule import MyClass

class MyTest(unittest.TestCase):

    def setUp(self):
        unittest.TestCase.setUp(self)

        self.myclass = MyClass()


    def testTrue(self):
        try:
            result = self.myclass.method()

            self.assertTrue(result)
        finally:
            self.myclass.close()

    ...

In setUp here MyClass is instantiated and stored as an instance variable on the TestCase instance. The test can access the instance created by setUp instead of having to create it itself.

The setUp method in TestCase does nothing (at least in the current versions of unittest) so strictly speaking it isn't necessary to call up to the parent class. When you inherit from a custom TestCase you will need to call up to the parent method so it is a good habit to get into.
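Here is a minimal sketch of why calling up matters once you have a custom TestCase. ProjectTestCase, FeatureTest and the assertEventsRecorded helper are hypothetical names invented for this illustration, and the sketch assumes Python 3.

```python
import io
import unittest


class ProjectTestCase(unittest.TestCase):
    """Hypothetical project-wide base class."""

    def setUp(self):
        unittest.TestCase.setUp(self)
        # Shared state that every test in the project relies on.
        self.events = ['base setUp']

    def assertEventsRecorded(self, expected):
        # Project-specific helper built on the standard assert methods.
        self.assertEqual(self.events, expected)


class FeatureTest(ProjectTestCase):

    def setUp(self):
        # Call up first, otherwise self.events would not exist.
        ProjectTestCase.setUp(self)
        self.events.append('feature setUp')

    def testEvents(self):
        self.assertEventsRecorded(['base setUp', 'feature setUp'])


suite = unittest.TestLoader().loadTestsFromTestCase(FeatureTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.testsRun, result.wasSuccessful())
```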

setUp has a corresponding method that is called after the test has run, tearDown. Like setUp, if an exception is raised in tearDown then the appropriate error or failure will be recorded for the test. Currently our tests still have to close the MyClass instance after use. We can use tearDown to fix this:

import unittest

from mymodule import MyClass

class MyTest(unittest.TestCase):

    def setUp(self):
        unittest.TestCase.setUp(self)

        self.myclass = MyClass()

    def tearDown(self):
        unittest.TestCase.tearDown(self)

        self.myclass.close()


    def testTrue(self):
        result = self.myclass.method()
        self.assertTrue(result)

    ...

See how using tearDown also simplifies our test. As tearDown is executed even if the test fails or an error occurs we no longer need to use a try: ... finally: to ensure the MyClass instance is closed.

We've now covered the major points of working with unittest itself. Now let's look at some general Python testing techniques.

Duck typing and mock objects

One of the reasons that Python is so much easier to test than statically typed languages is that because of the wonders of duck typing we can substitute any object at runtime for another object that supports the same operations. This means we can swap out production classes with mock objects that record how they are used.

Let's look at how we might test this code in Python:

class MyClass(object):
    def __init__(self):
        self.data = None

    def readData(self, source):
        self.data = source.read()
        source.close()

MyClass has a readData method that takes a data source, reads from it and then closes it. A real data source may be expensive (slow) to create, and in any case we want to test MyClass in isolation.

We can create a mock data source that has the methods MyClass uses. Our tests can use the mock data source so that we can check MyClass uses it as it should. This is especially useful when using a real data source may be slow or uses an external resource like a database:

class MockDataSource(object):
    def __init__(self):
        self.readFrom = False
        self.closed = False

    def read(self):
        self.readFrom = True
        return 'some data'

    def close(self):
        self.closed = True

The read method of our mock data source returns some known data when called. It also records that it has been read from, by setting readFrom to True, and when close has been called.

Using the MockDataSource to test MyClass:

import unittest

from mymodule import MyClass

# MockDataSource is the class defined in the previous listing

class TestMyClass(unittest.TestCase):

    def testConstructor(self):
        "Test the default state"
        myclass = MyClass()

        self.assertEqual(myclass.data, None)

    def testReadData(self):
        myclass = MyClass()

        source = MockDataSource()

        myclass.readData(source)

        self.assertEqual(myclass.data, 'some data')
        self.assertTrue(source.readFrom)
        self.assertTrue(source.closed)

Constructing custom mocks for all the production classes you need to test can be time consuming and painful. Fortunately we can make this easier by using one of the many Python mocking libraries that are available. My favourite is mock, which by coincidence I wrote and is particularly suited for use with unittest.

The main class in the mock library is Mock. Mock automatically creates methods and attributes as they are accessed and records how they are used. Mock instances have several useful methods and attributes to control their behavior and make assertions about how they have been used. We can rewrite our test above to use Mock:

import unittest
from mock import Mock

from mymodule import MyClass

class TestMyClass(unittest.TestCase):

    def testConstructor(self):
        "Test the default state"
        myclass = MyClass()

        self.assertEqual(myclass.data, None)

    def testReadData(self):
        myclass = MyClass()

        source = Mock()
        source.read.return_value = 'some data'

        myclass.readData(source)

        self.assertEqual(myclass.data, 'some data')
        self.assertTrue(source.read.called)
        self.assertTrue(source.close.called)

The line source.read.return_value = 'some data' automatically creates the read method on our mock data source merely by accessing it and then sets the return_value to be 'some data'. The newly created read method is actually a new Mock instance. As Mock instances are callable they can behave just like methods of objects. Setting the return_value controls what is returned when the mock is called.

If the mock is called with arguments you can use the assert_called_with method to verify that it has been called with the expected arguments. A quick run down of some of the useful members on Mock objects:

>>> from mock import Mock
>>> mock = Mock()
>>> mock.method.return_value = 'foo'
>>>
>>> mock.method(1, 2, 3, 4)
'foo'
>>> mock.method.called
True
>>> mock.method.assert_called_with(8, 6)
Traceback (most recent call last):
  ...
AssertionError: Expected: ((8, 6), {})
Called with: ((1, 2, 3, 4), {})
>>>

Mock objects can even raise exceptions or have other side effects when called:

>>> mock = Mock()
>>> mock.side_effect = Exception('Boom!')
>>> mock()
Traceback (most recent call last):
  ...
Exception: Boom!

>>> results = [1, 2, 3]
>>> def side_effect(*args, **kwargs):
...     return results.pop()
...
>>> mock.side_effect = side_effect
>>> mock(), mock(), mock()
(3, 2, 1)

There's lots more to mock so it is worth perusing the documentation. As well as the Mock class it has useful decorators for automatic monkey patching, which is another powerful testing technique.

Monkey patching

Monkey patching is a term that originated in the Python community to describe runtime modification (patching) of live objects. This can include replacing methods with a completely new implementation. This is generally regarded as being a bad thing to do in production code but is very useful for testing.

Note

Monkey patching is a term that started with the Python community but is now widely used (especially within the Ruby community). It seems to have originated with Zope programmers, who referred to guerilla patching. This became gorilla patching and then monkey patching.

We can illustrate this with some new methods on MyClass.

class MyClass(object):
    def __init__(self):
        self.data = None

    def readData(self, source):
        self.data = source.read()
        source.close()

    def synchronise(self):
        source = self.getDataSource()
        self.readData(source)
        self.store()

The new method synchronise on MyClass fetches a data source, reads the data and then stores it. synchronise calls readData, which we have already worked with, and getDataSource and store which for convenience aren't shown. We can test the synchronise method in isolation by monkey patching the three methods that it uses.

Note

In Python we can patch classes as well as instances. If we patch an instance then the changes only affect that instance but changes to classes persist. If you patch a class in a test then you have to be very careful to restore the class to its original state or your changes will 'leak' and could affect future tests.
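The difference can be sketched as follows. Greeter is a hypothetical class used only for this illustration. The sketch assumes Python 3.3 or later, where the mock library is available in the standard library as unittest.mock; with the standalone package the import would be from mock import patch.

```python
from unittest.mock import patch


class Greeter(object):
    """Hypothetical class used only for this illustration."""

    def greet(self):
        return 'hello'


# Patching an instance only affects that instance.
patched = Greeter()
patched.greet = lambda: 'patched instance'
untouched = Greeter()
print(patched.greet(), untouched.greet())

# Patching the class affects every instance, so restore it afterwards.
original = Greeter.greet
Greeter.greet = lambda self: 'patched class'
try:
    during_manual = untouched.greet()
finally:
    Greeter.greet = original

# patch.object makes the same change and undoes it automatically.
with patch.object(Greeter, 'greet', return_value='mocked'):
    during_patch = untouched.greet()

after = untouched.greet()
print(during_manual, during_patch, after)
```

The try: ... finally: guarantees the class is restored even if the code in between raises, which is exactly the care that patching a class by hand requires.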

We can reuse the Mock class we have been working with to replace the methods that synchronise calls.

import unittest
from mock import Mock, sentinel

from mymodule import MyClass

class TestMyClass(unittest.TestCase):

    def testSynchronise(self):
        myclass = MyClass()

        # put the monkey patching in place
        myclass.getDataSource = Mock()
        myclass.getDataSource.return_value = sentinel.DataSource

        myclass.readData = Mock()
        myclass.store = Mock()

        # make the call
        myclass.synchronise()

        # assertions
        self.assertTrue(myclass.getDataSource.called)
        myclass.readData.assert_called_with(sentinel.DataSource)
        self.assertTrue(myclass.store.called)

As well as Mock this test uses another object provided by the mock module. sentinel is another object that creates attributes on demand. Every time you access the same attribute it returns the same object, so we can use sentinel to provide known values for our tests. It makes for nice readable tests when we need some value that we can test against. In this test we check that readData is called with the return value of getDataSource, sentinel.DataSource.

That method was easy to test, but methods that use external classes can be harder to test. With Python as well as patching instances we can patch objects at the module level. Because name lookup is done at runtime we can replace the implementation of an external class with another mock object. As with directly patching classes any changes you make to modules will persist so you have to be extremely careful about undoing any changes you make. The mock module has decorators for tests that can handle doing the patching and automatically undoing it once the test has completed.

Let's have a look at a potential implementation for the getDataSource method that synchronise calls.

from datasource import DataSource

class MyClass(object):

    def getDataSource(self):
        return DataSource()

It's a trivial piece of code but it could be very hard to test, especially if creating the DataSource is expensive or connects to external resources that may not be available in a test environment. We can test it by patching out the DataSource name in mymodule. When getDataSource is called the DataSource name will be looked up in the module namespace. If we have replaced it with an alternative implementation then that will be used instead. It is important to realise that we are patching the namespace where DataSource is used, which in our case is mymodule, and not patching the place where DataSource is defined.

The mock module provides a patch decorator that will do the patching for us. As an added bonus it patches the name with a Mock object and passes the mock into our test method. Here is a simple test for getDataSource using the patch decorator. The mock created by the patch decorator is the extra parameter (MockDataSource) to the test method:

import unittest
from mock import Mock, patch, sentinel

from mymodule import MyClass

class TestMyClass(unittest.TestCase):

    @patch('mymodule.DataSource')
    def testSynchronise(self, MockDataSource):
        MockDataSource.return_value = sentinel.DataSource

        myclass = MyClass()
        source = myclass.getDataSource()

        self.assertEqual(source, sentinel.DataSource)

Instantiating a class is done by calling it (instantiation is actually done by the __call__ method of the class's metaclass - so instantiation is calling the class). As DataSource is instantiated inside the getDataSource call we control what is returned by setting the return value on the MockDataSource. getDataSource just returns the instance it creates, so we test that the return value of calling this method is the same object we set as the MockDataSource return value.

But there’s a potential problem with overusing monkey patching. Your tests become whitebox tests that know a great deal about the implementation of the objects under test. A pattern that can help reduce this coupling is dependency injection.

Dependency Injection

Note

This example of dependency injection for testing is based on a presentation by Alex Martelli: Python Dependency Injection.

In dependency injection, dependencies are supplied to components rather than being used directly. Dependency injection makes testing easier, because you can supply mocks instead of the real dependencies and test that they’re used as expected. A common way to do dependency injection in Python is to provide dependencies as default arguments in object constructors.

Let’s look at testing a simple scheduler class to see how this works.

import time

class Scheduler(object):
    def __init__(self, tm=time.time, sl=time.sleep):
        self.time = tm
        self.sleep = sl

    def schedule(self, when, function):
        self.sleep(when - self.time())
        return function()

Scheduler has a single method, schedule, that takes a callable and a time for the callable to be fired. The schedule method blocks by sleeping until the correct time (when) using the time.time and time.sleep standard library functions; but, because it obtains them with dependency injection, it’s easy to test. The injection is set up in the Scheduler constructor, so the first thing you need to test is that the default constructor does the right thing. Setting up the default dependency in the constructor is the extra layer that dependency injection introduces into your code.

import time
from unittest import TestCase
from dependency_injection import Scheduler

class DependencyInjectionTest(TestCase):
    def testConstructor(self):
        scheduler = Scheduler()
        self.assertEqual(scheduler.time, time.time,
                         "time not initialized correctly")
        self.assertEqual(scheduler.sleep, time.sleep,
                         "sleep not initialized correctly")

Having tested that the dependency injection is properly initialized in the default case, you can use it to test the schedule method. The listing below uses mocks passed into the constructor instead of the defaults. You can then assert that calls are made with the right arguments, and that schedule returns the right result.

# A method of DependencyInjectionTest; it needs "from mock import Mock"
def testSchedule(self):
    mock = Mock()
    mock.time.return_value = 100

    scheduler = Scheduler(mock.time, mock.sleep)

    mock.return_value = 'foo'

    result = scheduler.schedule(300, mock)

    self.assertEqual(result, 'foo',
                     "schedule did not return result of calling function")

    self.assertTrue(mock.time.called)
    self.assertTrue(mock.called)
    mock.sleep.assert_called_with(200)

Because the mock time function is set up to return 100, and schedule is called with 300 as its when argument, sleep should be called with 200. Dependency injection can easily be done using setter properties, or even with simple attributes. Yet another approach is to use factory methods or functions for providing dependencies, which can be needed where fresh instances of dependencies are required for each use. Dependency injection is useful for unit testing, but can also be a useful architecture where subclasses override the behavior of their base classes.
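As a rough sketch of the simple attribute approach, the variant below stores its dependencies as class attributes and the test replaces them on the instance. AttributeScheduler and the clock attribute name are invented for this illustration. The sketch assumes Python 3.3 or later, where the mock library is available as unittest.mock.

```python
import time
from unittest.mock import Mock


class AttributeScheduler(object):
    """Hypothetical variant of Scheduler with dependencies as attributes."""

    # staticmethod guards against the functions being bound as methods.
    clock = staticmethod(time.time)
    sleep = staticmethod(time.sleep)

    def schedule(self, when, function):
        self.sleep(when - self.clock())
        return function()


scheduler = AttributeScheduler()

# Inject the mocks by assigning attributes on the instance.
scheduler.clock = Mock(return_value=100)
scheduler.sleep = Mock()

result = scheduler.schedule(300, lambda: 'foo')

# 300 - 100 means sleep should have been asked to wait 200.
scheduler.sleep.assert_called_with(200)
print(result)
```

Because the mocks are set on the instance, the class itself keeps the real time functions and nothing has to be restored afterwards.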


Last edited Tue Aug 2 00:51:34 2011.
