+++++++++++++++++++++++++++++++
Writing Python Regression Tests
+++++++++++++++++++++++++++++++

:Author: Skip Montanaro
:Contact: skip@mojam.com

Introduction
============

If you add a new module to Python or modify the functionality of an existing
module, you should write one or more test cases to exercise that new
functionality.  There are different ways to do this within the regression
testing facility provided with Python; any particular test should use only
one of these options.  Each option requires writing a test module using the
conventions of the selected option:

    - PyUnit_ based tests
    - doctest_ based tests
    - "traditional" Python test modules

Regardless of the mechanics of the testing approach you choose,
you will be writing unit tests (isolated tests of functions and objects
defined by the module) using white box techniques.  Unlike black box
testing, where you only have the external interfaces to guide your test case
writing, in white box testing you can see the code being tested and tailor
your test cases to exercise it more completely.  In particular, you will be
able to refer to the C and Python code in the CVS repository when writing
your regression test cases.

.. _PyUnit:
.. _unittest: http://www.python.org/doc/current/lib/module-unittest.html
.. _doctest: http://www.python.org/doc/current/lib/module-doctest.html

PyUnit based tests
------------------
The PyUnit_ framework is based on the ideas of unit testing as espoused
by Kent Beck and the `Extreme Programming`_ (XP) movement.  The specific
interface provided by the framework is tightly based on the JUnit_
Java implementation of Beck's original SmallTalk test framework.  Please
see the documentation of the unittest_ module for detailed information on
the interface and general guidelines on writing PyUnit based tests.

The test_support helper module provides two functions for use by
PyUnit based tests in the Python regression testing framework:

- ``run_unittest()`` takes a ``unittest.TestCase`` derived class as a
  parameter and runs the tests defined in that class
   
- ``run_suite()`` takes a populated ``TestSuite`` instance and runs the
  tests
   
``run_suite()`` is preferred because unittest files typically grow multiple
test classes, and you might as well be prepared.

All test methods in the Python regression framework have names that
start with "``test_``" and use lower-case names with words separated with
underscores.

Test methods should *not* have docstrings!  The unittest module prints
the docstring if there is one, but otherwise prints the function name
and the full class name.  When there's a problem with a test, the
latter information makes it easier to find the source for the test
than the docstring.

All PyUnit-based tests in the Python test suite use boilerplate that
looks like this (with minor variations)::

    import unittest
    from test import test_support

    class MyTestCase1(unittest.TestCase):

        # Define setUp and tearDown only if needed

        def setUp(self):
            unittest.TestCase.setUp(self)
            ... additional initialization...

        def tearDown(self):
            ... additional finalization...
            unittest.TestCase.tearDown(self)

        def test_feature_one(self):
            # Testing feature one
            ...unit test for feature one...

        def test_feature_two(self):
            # Testing feature two
            ...unit test for feature two...

        ...etc...

    class MyTestCase2(unittest.TestCase):
        ...same structure as MyTestCase1...

    ...etc...

    def test_main():
        suite = unittest.TestSuite()
        suite.addTest(unittest.makeSuite(MyTestCase1))
        suite.addTest(unittest.makeSuite(MyTestCase2))
        ...add more suites...
        test_support.run_suite(suite)

    if __name__ == "__main__":
        test_main()

This has the advantage that it allows the unittest module to be used
as a script to run individual tests as well as working well with the
regrtest framework.

.. _Extreme Programming: http://www.extremeprogramming.org/
.. _JUnit: http://www.junit.org/

doctest based tests
-------------------
Tests written to use doctest_ are actually part of the docstrings for
the module being tested.  Each test is written as a display of an
interactive session, including the Python prompts, statements that would
be typed by the user, and the output of those statements (including
tracebacks, although only the exception msg needs to be retained then).
The module in the test package is simply a wrapper that causes doctest
to run over the tests in the module.  The test for the difflib module
provides a convenient example::

    import difflib
    from test import test_support
    test_support.run_doctest(difflib)

If the test is successful, nothing is written to stdout (so you should not
create a corresponding output/test_difflib file), but running regrtest
with -v will give a detailed report, the same as if passing -v to doctest.

A second argument can be passed to run_doctest to tell doctest to search
``sys.argv`` for -v instead of using test_support's idea of verbosity.  This
is useful for writing doctest-based tests that aren't simply running a
doctest'ed Lib module, but contain the doctests themselves.  Then at
times you may want to run such a test directly as a doctest, independent
of the regrtest framework.  The tail end of test_descrtut.py is a good
example::

    def test_main(verbose=None):
        from test import test_support, test_descrtut
        test_support.run_doctest(test_descrtut, verbose)

    if __name__ == "__main__":
        test_main(1)

If run via regrtest, ``test_main()`` is called (by regrtest) without
specifying verbose, and then test_support's idea of verbosity is used.  But
when run directly, ``test_main(1)`` is called, and then doctest's idea of
verbosity is used.

See the documentation for the doctest module for information on
writing tests using the doctest framework.

"traditional" Python test modules
---------------------------------
The mechanics of how the "traditional" test system operates are fairly
straightforward.  When a test case is run, the output is compared with the
expected output that is stored in .../Lib/test/output.  If the test runs to
completion and the actual and expected outputs match, the test succeeds, if
not, it fails.  If an ``ImportError`` or ``test_support.TestSkipped`` error
is raised, the test is not run.

Executing Test Cases
====================
If you are writing test cases for module spam, you need to create a file
in .../Lib/test named test_spam.py.  In addition, if the tests are expected
to write to stdout during a successful run, you also need to create an
expected output file in .../Lib/test/output named test_spam ("..."
represents the top-level directory in the Python source tree, the directory
containing the configure script).  If needed, generate the initial version
of the test output file by executing::

    ./python Lib/test/regrtest.py -g test_spam.py

from the top-level directory.

Any time you modify test_spam.py you need to generate a new expected
output file.  Don't forget to desk check the generated output to make sure
it's really what you expected to find!  All in all it's usually better
not to have an expected-out file (note that doctest- and unittest-based
tests do not).

To run a single test after modifying a module, simply run regrtest.py
without the -g flag::

    ./python Lib/test/regrtest.py test_spam.py

While debugging a regression test, you can of course execute it
independently of the regression testing framework and see what it prints::

    ./python Lib/test/test_spam.py

To run the entire test suite:

- [UNIX, + other platforms where "make" works] Make the "test" target at the
  top level::

    make test

- [WINDOWS] Run rt.bat from your PCBuild directory.  Read the comments at
  the top of rt.bat for the use of special -d, -O and -q options processed
  by rt.bat.

- [OTHER] You can simply execute the two runs of regrtest (optimized and
  non-optimized) directly::

    ./python Lib/test/regrtest.py
    ./python -O Lib/test/regrtest.py

But note that this way picks up whatever .pyc and .pyo files happen to be
around.  The makefile and rt.bat ways run the tests twice, the first time
removing all .pyc and .pyo files from the subtree rooted at Lib/.

Test cases generate output based upon values computed by the test code.
When executed, regrtest.py compares the actual output generated by executing
the test case with the expected output and reports success or failure.  It
stands to reason that if the actual and expected outputs are to match, they
must not contain any machine dependencies.  This means your test cases
should not print out absolute machine addresses (e.g. the return value of
the id() builtin function) or floating point numbers with large numbers of
significant digits (unless you understand what you are doing!).


Test Case Writing Tips
======================
Writing good test cases is a skilled task and is too complex to discuss in
detail in this short document.  Many books have been written on the subject.
I'll show my age by suggesting that Glenford Myers' `"The Art of Software
Testing"`_, published in 1979, is still the best introduction to the subject
available.  It is short (177 pages), easy to read, and discusses the major
elements of software testing, though its publication predates the
object-oriented software revolution, so doesn't cover that subject at all.
Unfortunately, it is very expensive (about $100 new).  If you can borrow it
or find it used (around $20), I strongly urge you to pick up a copy.

The most important goal when writing test cases is to break things.  A test
case that doesn't uncover a bug is much less valuable than one that does.
In designing test cases you should pay attention to the following:

    * Your test cases should exercise all the functions and objects defined
      in the module, not just the ones meant to be called by users of your
      module.  This may require you to write test code that uses the module
      in ways you don't expect (explicitly calling internal functions, for
      example - see test_atexit.py).

    * You should consider any boundary values that may tickle exceptional
      conditions (e.g. if you were writing regression tests for division,
      you might well want to generate tests with numerators and denominators
      at the limits of floating point and integer numbers on the machine
      performing the tests as well as a denominator of zero).

    * You should exercise as many paths through the code as possible.  This
      may not always be possible, but is a goal to strive for.  In
      particular, when considering if statements (or their equivalent), you
      want to create test cases that exercise both the true and false
      branches.  For loops, you should create test cases that exercise the
      loop zero, one and multiple times.

    * You should test with obviously invalid input.  If you know that a
      function requires an integer input, try calling it with other types of
      objects to see how it responds.

    * You should test with obviously out-of-range input.  If the domain of a
      function is only defined for positive integers, try calling it with a
      negative integer.

    * If you are going to fix a bug that wasn't uncovered by an existing
      test, try to write a test case that exposes the bug (preferably before
      fixing it).

    * If you need to create a temporary file, you can use the filename in
      ``test_support.TESTFN`` to do so.  It is important to remove the file
      when done; other tests should be able to use the name without cleaning
      up after your test.

.. _"The Art of Software Testing": 
        http://www.amazon.com/exec/obidos/ISBN=0471043281

Regression Test Writing Rules
=============================
Each test case is different.  There is no "standard" form for a Python
regression test case, though there are some general rules (note that
these mostly apply only to the "classic" tests; unittest_- and doctest_-
based tests should follow the conventions natural to those frameworks)::

    * If your test case detects a failure, raise ``TestFailed`` (found in
      ``test.test_support``).

    * Import everything you'll need as early as possible.

    * If you'll be importing objects from a module that is at least
      partially platform-dependent, only import those objects you need for
      the current test case to avoid spurious ``ImportError`` exceptions
      that prevent the test from running to completion.

    * Print all your test case results using the ``print`` statement.  For
      non-fatal errors, print an error message (or omit a successful
      completion print) to indicate the failure, but proceed instead of
      raising ``TestFailed``.

    * Use ``assert`` sparingly, if at all.  It's usually better to just print
      what you got, and rely on regrtest's got-vs-expected comparison to
      catch deviations from what you expect.  ``assert`` statements aren't
      executed at all when regrtest is run in -O mode; and, because they
      cause the test to stop immediately, can lead to a long & tedious
      test-fix, test-fix, test-fix, ... cycle when things are badly broken
      (and note that "badly broken" often includes running the test suite
      for the first time on new platforms or under new implementations of
      the language).

Miscellaneous
=============
There is a test_support module in the test package you can import for
your test case.  Import this module using either::

    import test.test_support

or::

    from test import test_support

test_support provides the following useful objects:

    * ``TestFailed`` - raise this exception when your regression test detects
      a failure.

    * ``TestSkipped`` - raise this if the test could not be run because the
      platform doesn't offer all the required facilities (like large
      file support), even if all the required modules are available.

    * ``ResourceDenied`` - this is raised when a test requires a resource that
      is not available.  Primarily used by 'requires'.

    * ``verbose`` - you can use this variable to control print output.  Many
      modules use it.  Search for "verbose" in the test_*.py files to see
      lots of examples.

    * ``forget(module_name)`` - attempts to cause Python to "forget" that it
      loaded a module and erase any PYC files.

    * ``is_resource_enabled(resource)`` - Returns a boolean based on whether
      the resource is enabled or not.

    * ``requires(resource [, msg])`` - if the required resource is not
      available the ResourceDenied exception is raised.
    
    * ``verify(condition, reason='test failed')``.  Use this instead of::

          assert condition[, reason]

      ``verify()`` has two advantages over ``assert``:  it works even in -O
      mode, and it raises ``TestFailed`` on failure instead of
      ``AssertionError``.

    * ``have_unicode`` - true if Unicode is available, false otherwise.

    * ``is_jython`` - true if the interpreter is Jython, false otherwise.

    * ``TESTFN`` - a string that should always be used as the filename when
      you need to create a temp file.  Also use ``try``/``finally`` to
      ensure that your temp files are deleted before your test completes.
      Note that you cannot unlink an open file on all operating systems, so
      also be sure to close temp files before trying to unlink them.

    * ``sortdict(dict)`` - acts like ``repr(dict.items())``, but sorts the
      items first.  This is important when printing a dict value, because
      the order of items produced by ``dict.items()`` is not defined by the
      language.

    * ``findfile(file)`` - you can call this function to locate a file
      somewhere along sys.path or in the Lib/test tree - see
      test_linuxaudiodev.py for an example of its use.

    * ``fcmp(x,y)`` - you can call this function to compare two floating
      point numbers when you expect them to only be approximately equal
      withing a fuzz factor (``test_support.FUZZ``, which defaults to 1e-6).

    * ``check_syntax(statement)`` - make sure that the statement is *not*
      correct Python syntax.


Python and C statement coverage results are currently available at

    http://www.musi-cal.com/~skip/python/Python/dist/src/

As of this writing (July, 2000) these results are being generated nightly.
You can refer to the summaries and the test coverage output files to see
where coverage is adequate or lacking and write test cases to beef up the
coverage.

Some Non-Obvious regrtest Features
==================================
    * Automagic test detection:  When you create a new test file
      test_spam.py, you do not need to modify regrtest (or anything else)
      to advertise its existence.  regrtest searches for and runs all
      modules in the test directory with names of the form test_xxx.py.

    * Miranda output:  If, when running test_spam.py, regrtest does not
      find an expected-output file test/output/test_spam, regrtest
      pretends that it did find one, containing the single line

      test_spam

      This allows new tests that don't expect to print anything to stdout
      to not bother creating expected-output files.

    * Two-stage testing:  To run test_spam.py, regrtest imports test_spam
      as a module.  Most tests run to completion as a side-effect of
      getting imported.  After importing test_spam, regrtest also executes
      ``test_spam.test_main()``, if test_spam has a ``test_main`` attribute.
      This is rarely required with the "traditional" Python tests, and
      you shouldn't create a module global with name test_main unless
      you're specifically exploiting this gimmick.  This usage does
      prove useful with PyUnit-based tests as well, however; defining
      a ``test_main()`` which is run by regrtest and a script-stub in the
      test module ("``if __name__ == '__main__': test_main()``") allows
      the test to be used like any other Python test and also work
      with the unittest.py-as-a-script approach, allowing a developer
      to run specific tests from the command line.