Writing Python Regression Tests ------------------------------- Skip Montanaro (skip@mojam.com) Introduction If you add a new module to Python or modify the functionality of an existing module, you should write one or more test cases to exercise that new functionality. The mechanics of how the test system operates are fairly straightforward. When a test case is run, the output is compared with the expected output that is stored in .../Lib/test/output. If the test runs to completion and the actual and expected outputs match, the test succeeds, if not, it fails. If an ImportError or test_support.TestSkipped error is raised, the test is not run. You will be writing unit tests (isolated tests of functions and objects defined by the module) using white box techniques. Unlike black box testing, where you only have the external interfaces to guide your test case writing, in white box testing you can see the code being tested and tailor your test cases to exercise it more completely. In particular, you will be able to refer to the C and Python code in the CVS repository when writing your regression test cases. Executing Test Cases If you are writing test cases for module spam, you need to create a file in .../Lib/test named test_spam.py and an expected output file in .../Lib/test/output named test_spam ("..." represents the top-level directory in the Python source tree, the directory containing the configure script). From the top-level directory, generate the initial version of the test output file by executing: ./python Lib/test/regrtest.py -g test_spam.py Any time you modify test_spam.py you need to generate a new expected output file. Don't forget to desk check the generated output to make sure it's really what you expected to find! To run a single test after modifying a module, simply run regrtest.py without the -g flag: ./python Lib/test/regrtest.py test_spam.py While debugging a regression test, you can of course execute it independently of the regression testing framework and see what it prints: ./python Lib/test/test_spam.py To run the entire test suite, make the "test" target at the top level: make test On non-Unix platforms where make may not be available, you can simply execute the two runs of regrtest (optimized and non-optimized) directly: ./python Lib/test/regrtest.py ./python -O Lib/test/regrtest.py Test cases generate output based upon values computed by the test code. When executed, regrtest.py compares the actual output generated by executing the test case with the expected output and reports success or failure. It stands to reason that if the actual and expected outputs are to match, they must not contain any machine dependencies. This means your test cases should not print out absolute machine addresses (e.g. the return value of the id() builtin function) or floating point numbers with large numbers of significant digits (unless you understand what you are doing!). Test Case Writing Tips Writing good test cases is a skilled task and is too complex to discuss in detail in this short document. Many books have been written on the subject. I'll show my age by suggesting that Glenford Myers' "The Art of Software Testing", published in 1979, is still the best introduction to the subject available. It is short (177 pages), easy to read, and discusses the major elements of software testing, though its publication predates the object-oriented software revolution, so doesn't cover that subject at all. Unfortunately, it is very expensive (about $100 new). If you can borrow it or find it used (around $20), I strongly urge you to pick up a copy. The most important goal when writing test cases is to break things. A test case that doesn't uncover a bug is much less valuable than one that does. In designing test cases you should pay attention to the following: * Your test cases should exercise all the functions and objects defined in the module, not just the ones meant to be called by users of your module. This may require you to write test code that uses the module in ways you don't expect (explicitly calling internal functions, for example - see test_atexit.py). * You should consider any boundary values that may tickle exceptional conditions (e.g. if you were writing regression tests for division, you might well want to generate tests with numerators and denominators at the limits of floating point and integer numbers on the machine performing the tests as well as a denominator of zero). * You should exercise as many paths through the code as possible. This may not always be possible, but is a goal to strive for. In particular, when considering if statements (or their equivalent), you want to create test cases that exercise both the true and false branches. For loops, you should create test cases that exercise the loop zero, one and multiple times. * You should test with obviously invalid input. If you know that a function requires an integer input, try calling it with other types of objects to see how it responds. * You should test with obviously out-of-range input. If the domain of a function is only defined for positive integers, try calling it with a negative integer. * If you are going to fix a bug that wasn't uncovered by an existing test, try to write a test case that exposes the bug (preferably before fixing it). * If you need to create a temporary file, you can use the filename in test_support.TESTFN to do so. It is important to remove the file when done; other tests should be able to use the name without cleaning up after your test. Regression Test Writing Rules Each test case is different. There is no "standard" form for a Python regression test case, though there are some general rules: * If your test case detects a failure, raise TestFailed (found in test_support). * Import everything you'll need as early as possible. * If you'll be importing objects from a module that is at least partially platform-dependent, only import those objects you need for the current test case to avoid spurious ImportError exceptions that prevent the test from running to completion. * Print all your test case results using the print statement. For non-fatal errors, print an error message (or omit a successful completion print) to indicate the failure, but proceed instead of raising TestFailed. * Use "assert" sparingly, if at all. It's usually better to just print what you got, and rely on regrtest's got-vs-expected comparison to catch deviations from what you expect. assert statements aren't executed at all when regrtest is run in -O mode; and, because they cause the test to stop immediately, can lead to a long & tedious test-fix, test-fix, test-fix, ... cycle when things are badly broken (and note that "badly broken" often includes running the test suite for the first time on new platforms or under new implementations of the language). Miscellaneous There is a test_support module you can import from your test case. It provides the following useful objects: * TestFailed - raise this exception when your regression test detects a failure. * TestSkipped - raise this if the test could not be run because the platform doesn't offer all the required facilities (like large file support), even if all the required modules are available. * findfile(file) - you can call this function to locate a file somewhere along sys.path or in the Lib/test tree - see test_linuxaudiodev.py for an example of its use. * verbose - you can use this variable to control print output. Many modules use it. Search for "verbose" in the test_*.py files to see lots of examples. * use_large_resources - true iff tests requiring large time or space should be run. * fcmp(x,y) - you can call this function to compare two floating point numbers when you expect them to only be approximately equal withing a fuzz factor (test_support.FUZZ, which defaults to 1e-6). NOTE: Always import something from test_support like so: from test_support import verbose or like so: import test_support ... use test_support.verbose in the code ... Never import anything from test_support like this: from test.test_support import verbose "test" is a package already, so can refer to modules it contains without "test." qualification. If you do an explicit "test.xxx" qualification, that can fool Python into believing test.xxx is a module distinct from the xxx in the current package, and you can end up importing two distinct copies of xxx. This is especially bad if xxx=test_support, as regrtest.py can (and routinely does) overwrite its "verbose" and "use_large_resources" attributes: if you get a second copy of test_support loaded, it may not have the same values for those as regrtest intended. Python and C statement coverage results are currently available at http://www.musi-cal.com/~skip/python/Python/dist/src/ As of this writing (July, 2000) these results are being generated nightly. You can refer to the summaries and the test coverage output files to see where coverage is adequate or lacking and write test cases to beef up the coverage.