Chapter 14. Testing, Debugging, and Exceptions

Testing rocks, but debugging? Not so much. The fact that there’s no compiler to analyze your code before Python executes it makes testing a critical part of development. The goal of this chapter is to discuss some common problems related to testing, debugging, and exception handling. It is not meant to be a gentle introduction to test-driven development or the unittest module. Thus, some familiarity with testing concepts is assumed.

patch() works by taking an existing object with the fully qualified name that you provide and replacing it with a new value. The original value is then restored after the completion of the decorated function or context manager. By default, values are replaced with MagicMock instances. For example:

>>> from unittest.mock import patch
>>> x = 42
>>> with patch('__main__.x'):
...     print(x)
...
<MagicMock name='x' id='4314230032'>
>>> x
42
>>>

However, you can actually replace the value with anything that you wish by supplying it as a second argument to patch():

>>> x
42
>>> with patch('__main__.x', 'patched_value'):
...     print(x)
...
patched_value
>>> x
42
>>>

The MagicMock instances that are normally used as replacement values are meant to mimic callables and instances. They record information about usage and allow you to make assertions. For example:

>>> from unittest.mock import MagicMock
>>> m = MagicMock(return_value=10)
>>> m(1, 2, debug=True)
10
>>> m.assert_called_with(1, 2, debug=True)
>>> m.assert_called_with(1, 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../unittest/mock.py", line 726, in assert_called_with
    raise AssertionError(msg)
AssertionError: Expected call: mock(1, 2)
Actual call: mock(1, 2, debug=True)
>>>

>>> m.upper.return_value = 'HELLO'
>>> m.upper('hello')
'HELLO'
>>> assert m.upper.called

>>> m.split.return_value = ['hello', 'world']
>>> m.split('hello world')
['hello', 'world']
>>> m.split.assert_called_with('hello world')
>>>

>>> m['blah']
<MagicMock name='mock.__getitem__()' id='4314412048'>
>>> m.__getitem__.called
True
>>> m.__getitem__.assert_called_with('blah')
>>>

Typically, these kinds of operations are carried out in a unit test. For example, suppose you have some function like this:

# example.py
from urllib.request import urlopen
import csv

def dowprices():
    u = urlopen('http://finance.yahoo.com/d/quotes.csv?s=@^DJI&f=sl1')
    lines = (line.decode('utf-8') for line in u)
    rows = (row for row in csv.reader(lines) if len(row) == 2)
    prices = { name:float(price) for name, price in rows }
    return prices

Normally, this function uses urlopen() to fetch data from the Web and parse it. To unit test it, however, you might want to give it a more predictable dataset of your own creation. Here’s an example using patching:

import unittest
from unittest.mock import patch
import io
import example

sample_data = io.BytesIO(b'''\
"IBM",91.1\r
"AA",13.25\r
"MSFT",27.72\r
\r
''')

class Tests(unittest.TestCase):
    @patch('example.urlopen', return_value=sample_data)
    def test_dowprices(self, mock_urlopen):
        p = example.dowprices()
        self.assertTrue(mock_urlopen.called)
        self.assertEqual(p,
                         {'IBM': 91.1,
                          'AA': 13.25,
                          'MSFT' : 27.72})

if __name__ == '__main__':
    unittest.main()

In this example, the urlopen() function in the example module is replaced with a mock object that returns a BytesIO() containing the sample data.

An important but subtle facet of this test is the patching of example.urlopen instead of urllib.request.urlopen. When you are making patches, you have to use the names as they are used in the code being tested. Since the example code uses from urllib.request import urlopen, the urlopen() function used by dowprices() is actually located in the example module itself.
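
It is also worth knowing that, besides the decorator and context-manager forms shown above, a patch can be applied manually with the patcher’s start() and stop() methods, which is handy in setUp() and tearDown() methods. A minimal sketch, reusing the sample_data from the test above:

p = patch('example.urlopen', return_value=sample_data)
mock_urlopen = p.start()
# ... exercise code that calls example.urlopen() ...
p.stop()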

This recipe has really only given a very small taste of what’s possible with the unittest.mock module. The official documentation is a must-read for more advanced features.

The assertRaises() method provides a convenient way to test for the presence of an exception. A common pitfall is to write tests that manually try to do things with exceptions on their own. For instance:

class TestConversion(unittest.TestCase):
    def test_bad_int(self):
        try:
            r = parse_int('N/A')
        except ValueError as e:
            self.assertEqual(type(e), ValueError)

The problem with such approaches is that it is easy to forget about corner cases, such as the case where no exception is raised at all. To catch that, you need to add an extra check, as shown here:

class TestConversion(unittest.TestCase):
    def test_bad_int(self):
        try:
            r = parse_int('N/A')
        except ValueError as e:
            self.assertEqual(type(e), ValueError)
        else:
            self.fail('ValueError not raised')

The assertRaises() method simply takes care of these details, so you should prefer to use it.
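
For reference, the same test written with assertRaises() reduces to a single line (using the same parse_int() as the surrounding examples):

class TestConversion(unittest.TestCase):
    def test_bad_int(self):
        self.assertRaises(ValueError, parse_int, 'N/A')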

The one limitation of assertRaises() is that it doesn’t provide a means for testing the value of the exception object that’s created. To do that, you have to test it manually. Somewhere in between these two extremes, you might consider using the assertRaisesRegex() method, which allows you to test for an exception and perform a regular expression match against the exception’s string representation at the same time. For example:

class TestConversion(unittest.TestCase):
    def test_bad_int(self):
        self.assertRaisesRegex(ValueError, 'invalid literal .*',
                               parse_int, 'N/A')

A little-known fact about assertRaises() and assertRaisesRegex() is that they can also be used as context managers:

class TestConversion(unittest.TestCase):
    def test_bad_int(self):
        with self.assertRaisesRegex(ValueError, 'invalid literal .*'):
            r = parse_int('N/A')

This form can be useful if your test involves multiple steps (e.g., setup) besides that of simply executing a callable.

The interesting thing about this recipe is not so much the task of getting test results redirected to a file, but the fact that doing so exposes some notable inner workings of the unittest module.

At a basic level, the unittest module works by first assembling a test suite. This test suite consists of the different testing methods you defined. Once the suite has been assembled, the tests it contains are executed.
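
As a minimal sketch of the kind of solution under discussion, redirecting test output to a file might look like this (the testing.out filename is illustrative):

import sys
import unittest

def main(out=sys.stderr, verbosity=2):
    loader = unittest.TestLoader()
    suite = loader.loadTestsFromModule(sys.modules[__name__])
    unittest.TextTestRunner(out, verbosity=verbosity).run(suite)

if __name__ == '__main__':
    with open('testing.out', 'w') as f:
        main(f)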

These two parts of unit testing are separate from each other. The unittest.TestLoader instance created above is used to assemble a test suite. The loadTestsFromModule() method is one of several it defines to gather tests. In this case, it scans a module for TestCase classes and extracts test methods from them. If you want something more fine-grained, the loadTestsFromTestCase() method (not shown) can be used to pull test methods from an individual class that inherits from TestCase.

The TextTestRunner class is an example of a test runner class. The main purpose of this class is to execute the tests contained in a test suite. This class is the same test runner that sits behind the unittest.main() function. However, here we’re giving it a bit of low-level configuration, including an output file and an elevated verbosity level.

Although this recipe only consists of a few lines of code, it gives a hint as to how you might further customize the unittest framework. To customize how test suites are assembled, you would perform various operations using the TestLoader class. To customize how tests execute, you could make custom test runner classes that emulate the functionality of TextTestRunner. Both topics are beyond the scope of what can be covered here. However, documentation for the unittest module has extensive coverage of the underlying protocols.
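
Turning to exception handling: to catch several exception types with a single handler, you specify them as a tuple. A small sketch (text and default are placeholder names):

try:
    value = int(text)
except (TypeError, ValueError):
    value = default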

Although it’s not specific to handling multiple exceptions per se, it’s worth noting that you can get a handle to the thrown exception using the as keyword:

import errno

# 'filename' and 'logger' are assumed to be defined elsewhere
try:
    f = open(filename)
except OSError as e:
    if e.errno == errno.ENOENT:
        logger.error('File not found')
    elif e.errno == errno.EACCES:
        logger.error('Permission denied')
    else:
        logger.error('Unexpected error: %d', e.errno)

In this example, the e variable holds an instance of the raised OSError. This is useful if you need to inspect the exception further, such as processing it based on the value of an additional status code.

Be aware that except clauses are checked in the order listed and that the first match executes. It may be a bit pathological, but you can easily create situations where multiple except clauses might match. For example:

>>> f = open('missing')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'missing'
>>> try:
...     f = open('missing')
... except OSError:
...     print('It failed')
... except FileNotFoundError:
...     print('File not found')
...
It failed
>>>

Here, the except FileNotFoundError clause doesn’t execute because OSError is more general: it matches the FileNotFoundError exception and is listed first.

As a debugging tip, if you’re not entirely sure about the class hierarchy of a particular exception, you can quickly view it by inspecting the exception’s __mro__ attribute. For example:

>>> FileNotFoundError.__mro__
(<class 'FileNotFoundError'>, <class 'OSError'>, <class 'Exception'>,
 <class 'BaseException'>, <class 'object'>)
>>>

Any one of the listed classes up to BaseException can be used with the except statement.

Custom exception classes should almost always inherit from the built-in Exception class, or inherit from some locally defined base exception that itself inherits from Exception. Although all exceptions also derive from BaseException, you should not use this as a base class for new exceptions. BaseException is reserved for system-exiting exceptions, such as KeyboardInterrupt or SystemExit, and other exceptions that should signal the application to exit. Catching these exceptions is therefore not the intended use case. Assuming everyone follows this convention, inheriting from BaseException would cause your custom exceptions never to be caught, signaling an imminent application shutdown instead!
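
For example, a hypothetical hierarchy for a networking library might look like this (the class names are chosen to match the snippets that follow):

# Base class for all exceptions raised by this hypothetical library
class NetworkError(Exception):
    pass

class HostnameError(NetworkError):
    pass

class ProtocolError(NetworkError):
    pass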

Having custom exceptions in your application and using them as shown makes your application code tell a more coherent story to whoever may need to read the code. One design consideration involves the grouping of custom exceptions via inheritance. In complicated applications, it may make sense to introduce further base classes that group different classes of exceptions together. This gives the user a choice of catching a narrowly specified error, such as this:

try:
    s.send(msg)
except ProtocolError:
    ...

It also gives the ability to catch a broad range of errors, such as the following:

try:
    s.send(msg)
except NetworkError:
    ...

If you are going to define a new exception that overrides the __init__() method of Exception, make sure you always call Exception.__init__() with all of the passed arguments. For example:

class CustomError(Exception):
    def __init__(self, message, status):
        super().__init__(message, status)
        self.message = message
        self.status = status

This might look a little weird, but the default behavior of Exception is to accept all arguments passed and to store them in the .args attribute as a tuple. Various other libraries and parts of Python expect all exceptions to have the .args attribute, so if you skip this step, you might find that your new exception doesn’t behave quite right in certain contexts. To illustrate the use of .args, consider this interactive session with the built-in RuntimeError exception, and notice how any number of arguments can be used with the raise statement:

>>> try:
...     raise RuntimeError('It failed')
... except RuntimeError as e:
...     print(e.args)
...
('It failed',)
>>> try:
...     raise RuntimeError('It failed', 42, 'spam')
... except RuntimeError as e:
...     print(e.args)
...
('It failed', 42, 'spam')
>>>
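
Applied to the CustomError class defined above, the same machinery works as expected:

>>> try:
...     raise CustomError('It failed', 42)
... except CustomError as e:
...     print(e.args)
...
('It failed', 42)
>>>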

For more information on creating your own exceptions, see the Python documentation.

14.9. Raising an Exception in Response to Another Exception

To chain exceptions, use the raise from statement instead of a simple raise statement. This will give you information about both errors. For example:

>>> def example():
...     try:
...             int('N/A')
...     except ValueError as e:
...             raise RuntimeError('A parsing error occurred') from e
...
>>> example()
Traceback (most recent call last):
  File "<stdin>", line 3, in example
ValueError: invalid literal for int() with base 10: 'N/A'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in example
RuntimeError: A parsing error occurred
>>>

As you can see in the traceback, both exceptions are captured. To catch such an exception, you would use a normal except statement. However, you can look at the __cause__ attribute of the exception object to follow the exception chain should you wish. For example:

try:
    example()
except RuntimeError as e:
    print("It didn't work:", e)
    if e.__cause__:
        print('Cause:', e.__cause__)

An implicit form of chained exceptions occurs when another exception gets raised inside an except block. For example:

>>> def example2():
...     try:
...             int('N/A')
...     except ValueError as e:
...             print("Couldn't parse:", err)
...
>>>
>>> example2()
Traceback (most recent call last):
  File "<stdin>", line 3, in example2
ValueError: invalid literal for int() with base 10: 'N/A'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in example2
NameError: name 'err' is not defined
>>>

In this example, you get information about both exceptions, but the interpretation is a bit different. In this case, the NameError exception is raised as the result of a programming error, not in direct response to the parsing error. For this case, the __cause__ attribute of an exception is not set. Instead, a __context__ attribute is set to the prior exception.
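
If you want to follow the implicit chain, checking __context__ parallels the earlier __cause__ example. A sketch:

try:
    example2()
except NameError as e:
    print("It didn't work:", e)
    if e.__context__:
        print('While handling:', e.__context__)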

If, for some reason, you want to suppress chaining, use raise from None:

>>> def example3():
...     try:
...             int('N/A')
...     except ValueError:
...             raise RuntimeError('A parsing error occurred') from None
...
>>> example3()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in example3
RuntimeError: A parsing error occurred
>>>

Issuing a warning message is often a useful technique for maintaining software and assisting users with issues that don’t necessarily rise to the level of being a full-fledged exception. For example, if you’re going to change the behavior of a library or framework, you can start issuing warning messages for the parts that you’re going to change while still providing backward compatibility for a time. You can also warn users about problematic usage issues in their code.
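
For example, a library function that is phasing out an argument might issue a warning like this (the function and its logfile argument are hypothetical):

import warnings

def func(x, y, logfile=None, debug=False):
    if logfile is not None:
        warnings.warn('logfile argument deprecated', DeprecationWarning)
    ...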

Warnings are also issued by parts of the standard library itself. As another example, here is the warning message generated by destroying a file without closing it:

>>> import warnings
>>> warnings.simplefilter('always')
>>> f = open('/etc/passwd')
>>> del f
__main__:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='/etc/passwd'
 mode='r' encoding='UTF-8'>
>>>

By default, not all warning messages appear. The -W option to Python can control the output of warning messages: -W all outputs all warning messages, -W ignore ignores all warnings, and -W error turns warnings into exceptions. As an alternative, you can use the warnings.simplefilter() function to control output, as just shown. An argument of always makes all warning messages appear, ignore ignores all warnings, and error turns warnings into exceptions.
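
For instance, with the filter set to error, issuing a warning raises it as an exception:

>>> import warnings
>>> warnings.simplefilter('error')
>>> warnings.warn('logfile argument deprecated', DeprecationWarning)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
DeprecationWarning: logfile argument deprecated
>>>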

For simple cases, this is all you really need to issue warning messages. The warnings module provides a variety of more advanced configuration options related to the filtering and handling of warning messages. See the Python documentation for more information.

If your program is crashing with an exception, running your program as python3 -i someprogram.py can be a useful tool for simply looking around. The -i option starts an interactive shell as soon as a program terminates. From there, you can explore the environment. For example, suppose you have this code:

# sample.py

def func(n):
    return n + 10

func('Hello')

Running python3 -i produces the following:

bash % python3 -i sample.py
Traceback (most recent call last):
  File "sample.py", line 6, in <module>
    func('Hello')
  File "sample.py", line 4, in func
    return n + 10
TypeError: can only concatenate str (not "int") to str
>>> func(10)
20
>>>

If you don’t see anything obvious, a further step is to launch the Python debugger after a crash. For example:

>>> import pdb
>>> pdb.pm()
> sample.py(4)func()
-> return n + 10
(Pdb) w
  sample.py(6)<module>()
-> func('Hello')
> sample.py(4)func()
-> return n + 10
(Pdb) p n
'Hello'
(Pdb) q
>>>

If your code is deeply buried in an environment where it is difficult to obtain an interactive shell (e.g., in a server), you can often catch errors and produce tracebacks yourself. For example:

import traceback
import sys

try:
    func(arg)
except:
    print('**** AN ERROR OCCURRED ****')
    traceback.print_exc(file=sys.stderr)

If your program isn’t crashing, but it’s producing wrong answers or you’re mystified by how it works, there is often nothing wrong with injecting a few print() calls in places of interest. However, if you’re going to do that, there are a few related techniques worth knowing. First, the traceback.print_stack() function will create a stack trace of your program immediately at that point. For example:

>>> import traceback
>>> import sys
>>> def sample(n):
...     if n > 0:
...             sample(n-1)
...     else:
...             traceback.print_stack(file=sys.stderr)
...
>>> sample(5)
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in sample
  File "<stdin>", line 3, in sample
  File "<stdin>", line 3, in sample
  File "<stdin>", line 3, in sample
  File "<stdin>", line 3, in sample
  File "<stdin>", line 5, in sample
>>>

Alternatively, you can also manually launch the debugger at any point in your program using pdb.set_trace() like this:

import pdb

def func(arg):
    ...
    pdb.set_trace()
    ...

This can be a useful technique for poking around in the internals of a large program and answering questions about the control flow or arguments to functions. For instance, once the debugger starts, you can inspect variables using the p (print) command or type a command such as w to get the stack traceback.

If you simply want to time your whole program, it’s usually easy enough to use something like the Unix time command. For example:

bash % time python3 someprogram.py
real 0m13.937s
user 0m12.162s
sys  0m0.098s
bash %

On the other extreme, if you want a detailed report showing what your program is doing, you can use the cProfile module:

bash % python3 -m cProfile someprogram.py
         859647 function calls in 16.016 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   263169    0.080    0.000    0.080    0.000 someprogram.py:16(frange)
      513    0.001    0.000    0.002    0.000 someprogram.py:30(generate_mandel)
   262656    0.194    0.000   15.295    0.000 someprogram.py:32(<genexpr>)
        1    0.036    0.036   16.077   16.077 someprogram.py:4(<module>)
   262144   15.021    0.000   15.021    0.000 someprogram.py:4(in_mandelbrot)
        1    0.000    0.000    0.000    0.000 os.py:746(urandom)
        1    0.000    0.000    0.000    0.000 png.py:1056(_readable)
        1    0.000    0.000    0.000    0.000 png.py:1073(Reader)
        1    0.227    0.227    0.438    0.438 png.py:163(<module>)
      512    0.010    0.000    0.010    0.000 png.py:200(group)
    ...
bash %

More often than not, profiling your code lies somewhere in between these two extremes. For example, you may already know that your code spends most of its time in a few selected functions. To profile just those functions, a short decorator can be useful. For example:

# timethis.py

import time
from functools import wraps

def timethis(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        r = func(*args, **kwargs)
        end = time.perf_counter()
        print('{}.{} : {}'.format(func.__module__, func.__name__, end - start))
        return r
    return wrapper

To use this decorator, you simply place it in front of a function definition to get timings from it. For example:

>>> @timethis
... def countdown(n):
...     while n > 0:
...             n -= 1
...
>>> countdown(10000000)
__main__.countdown : 0.803001880645752
>>>

To time a block of statements, you can define a context manager. For example:

import time
from contextlib import contextmanager

@contextmanager
def timeblock(label):
    start = time.perf_counter()
    try:
        yield
    finally:
        end = time.perf_counter()
        print('{} : {}'.format(label, end - start))

Here is an example of how the context manager works:

>>> with timeblock('counting'):
...     n = 10000000
...     while n > 0:
...             n -= 1
...
counting : 1.5551159381866455
>>>

For studying the performance of small code fragments, the timeit module can be useful. For example:

>>> from timeit import timeit
>>> timeit('math.sqrt(2)', 'import math')
0.1432319980012835
>>> timeit('sqrt(2)', 'from math import sqrt')
0.10836604500218527
>>>

timeit works by executing the statement specified in the first argument a million times and measuring the time. The second argument is a setup string that is executed to set up the environment prior to running the test. If you need to change the number of iterations, supply a number argument like this:

>>> timeit('math.sqrt(2)', 'import math', number=10000000)
1.434852126003534
>>> timeit('sqrt(2)', 'from math import sqrt', number=10000000)
1.0270336690009572
>>>

14.14. Making Your Programs Run Faster

While the first rule of optimization might be to “not do it,” the second rule is almost certainly “don’t optimize the unimportant.” To that end, if your program is running slowly, you might start by profiling your code as discussed in Recipe 14.13.

More often than not, you’ll find that your program spends its time in a few hotspots, such as inner data processing loops. Once you’ve identified those locations, you can use the no-nonsense techniques presented in the following sections to make your program run faster.

Use functions

A lot of programmers start using Python as a language for writing simple scripts. When writing scripts, it is easy to fall into a practice of simply writing code with very little structure. For example:

# somescript.py

import sys
import csv

with open(sys.argv[1]) as f:
    for row in csv.reader(f):
        # Some kind of processing
        ...

A little-known fact is that code defined in the global scope like this runs slower than code defined in a function. The speed difference has to do with the implementation of local versus global variables (operations involving locals are faster). So, if you want to make the program run faster, simply put the scripting statements in a function:

# somescript.py
import sys
import csv

def main(filename):
    with open(filename) as f:
        for row in csv.reader(f):
            # Some kind of processing
            ...

main(sys.argv[1])

The speed difference depends heavily on the processing being performed, but in our experience, speedups of 15-30% are not uncommon.

Before optimizing, it’s usually worthwhile to study the algorithms that you’re using. You’ll get a much bigger speedup by switching to an O(n log n) algorithm than by trying to tweak the implementation of an O(n**2) algorithm.
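
To make that concrete, here is a sketch of the same duplicate-detection problem solved both ways:

# O(n**2): a linear scan of the list for each element
def has_duplicates_slow(items):
    for i, x in enumerate(items):
        if x in items[i+1:]:
            return True
    return False

# O(n log n): sort first, then compare neighbors
def has_duplicates_fast(items):
    items = sorted(items)
    return any(a == b for a, b in zip(items, items[1:]))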

If you’ve decided that you still must optimize, it pays to consider the big picture. As a general rule, you don’t want to apply optimizations to every part of your program, because such changes are going to make the code hard to read and understand. Instead, focus only on known performance bottlenecks, such as inner loops.

You need to be especially wary when interpreting the results of micro-optimizations. For example, consider these two techniques for creating a dictionary:

a = {
    'name' : 'AAPL',
    'shares' : 100,
    'price' : 534.22
}

b = dict(name='AAPL', shares=100, price=534.22)

The latter choice has the benefit of less typing (you don’t need to quote the key names). However, if you put the two code fragments in a head-to-head performance battle, you’ll find that using dict() runs three times slower! With this knowledge, you might be inclined to scan your code and replace every use of dict() with its more verbose alternative. However, a smart programmer will only focus on parts of a program where it might actually matter, such as an inner loop. In other places, the speed difference just isn’t going to matter at all.
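
If you want to measure the difference yourself, a quick comparison with timeit might look like this (the absolute numbers will vary by machine and Python version):

from timeit import timeit

t_literal = timeit("{'name': 'AAPL', 'shares': 100, 'price': 534.22}")
t_call = timeit("dict(name='AAPL', shares=100, price=534.22)")
print('literal:', t_literal)
print('dict() :', t_call)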

If, on the other hand, your performance needs go far beyond the simple techniques in this recipe, you might investigate the use of tools based on just-in-time (JIT) compilation techniques. For example, the PyPy project is an alternate implementation of the Python interpreter that analyzes the execution of your program and generates native machine code for frequently executed parts. It can sometimes make Python programs run an order of magnitude faster, often approaching (or even exceeding) the speed of code written in C. Unfortunately, as of this writing, PyPy does not yet fully support Python 3. So, that is something to look for in the future. You might also consider the Numba project. Numba is a dynamic compiler where you annotate selected Python functions that you want to optimize with a decorator. Those functions are then compiled into native machine code through the use of LLVM. It too can produce significant performance gains. However, like PyPy, support for Python 3 should be viewed as somewhat experimental.

Last, but not least, the words of John Ousterhout come to mind: “The best performance improvement is the transition from the nonworking to the working state.” Don’t worry about optimization until you need to. Making sure your program works correctly is usually more important than making it run fast (at least initially).