[TIP] Column questions.

Andrew Dalke dalke at dalkescientific.com
Thu Jul 24 04:20:01 PDT 2008


Tim Head wrote:
> How to test code that involves random numbers?  So far I have not found
> a good "howto" for this yet and am curious to hear people's thoughts on
> it.

Make sure you can specify the random number seed.  This helps
reduce or eliminate variability across tests.  All this tests
is reproducibility, not correctness.
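
For example, here's a minimal sketch of that; estimate_pi is just a
made-up Monte Carlo routine, not anything from your code:

    import random
    import unittest

    def estimate_pi(n_samples, rng):
        """Hypothetical Monte Carlo routine: estimate pi by sampling
        points in the unit square."""
        hits = sum(1 for _ in range(n_samples)
                   if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
        return 4.0 * hits / n_samples

    class TestReproducibility(unittest.TestCase):
        def test_same_seed_gives_same_result(self):
            # Two runs with the same seed must agree exactly.
            a = estimate_pi(10000, random.Random(42))
            b = estimate_pi(10000, random.Random(42))
            self.assertEqual(a, b)

    if __name__ == "__main__":
        unittest.main()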

If your Monte Carlo code uses floats, you'll find other problems.
Testing is hard.  The order of operations affects the results
(most often only in the last digit) and can change with different
versions of the compiler.  If you are doing distributed computing,
then it also can depend on the message order between the processors.
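
Here's a small illustration of that order-of-operations effect
(nothing here is specific to any particular simulation):

    import random

    rng = random.Random(0)
    values = [rng.uniform(-1.0, 1.0) for _ in range(100000)]

    forward = sum(values)
    backward = sum(reversed(values))

    # The two sums typically differ in the last few bits, even though
    # mathematically they are the same number.
    print(forward, backward, forward - backward)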

Some people use a rule-of-thumb like "reproducible to 0.001%" or
"only changes the last digit."  I've done that, but run into
problems because the methods I'm using are only good to 1%, so
when the algorithms change I have to track down if it's a bug,
an improvement/degradation, or simply a difference in opinion.
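
One way to make such a rule of thumb explicit is a relative-tolerance
comparison; the numbers below are invented just to show the shape of
the check:

    expected = 3.14159265   # value from the old algorithm
    actual = 3.14160012     # value from a hypothetically changed algorithm

    rel_err = abs((actual - expected) / expected)

    # "reproducible to 0.001%" expressed as a relative tolerance
    assert rel_err < 1e-5

    # but if the method itself is only good to ~1%, this looser bound is
    # all the comparison can honestly promise
    assert rel_err < 1e-2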

It's sometimes possible to do numerical analysis to figure out
the range of error.  I've not been in that situation myself.

For correctness you can look at macro properties, like checking
that energy/momentum are conserved over time.
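
As a sketch of that kind of macro-property test, here is a toy 1-D
harmonic oscillator integrated with leapfrog; the system and the
tolerance are invented for illustration:

    def leapfrog_oscillator(x0, v0, dt, n_steps, k=1.0, m=1.0):
        """Toy leapfrog (kick-drift-kick) integration of a 1-D harmonic
        oscillator, returning the total energy after each step."""
        x, v = x0, v0
        energies = []
        for _ in range(n_steps):
            v += 0.5 * dt * (-k * x / m)   # half kick
            x += dt * v                    # drift
            v += 0.5 * dt * (-k * x / m)   # half kick
            energies.append(0.5 * m * v * v + 0.5 * k * x * x)
        return energies

    def test_energy_is_conserved():
        energies = leapfrog_oscillator(x0=1.0, v0=0.0, dt=0.01, n_steps=10000)
        e0 = energies[0]
        worst = max(abs(e - e0) / e0 for e in energies)
        # Leapfrog keeps the energy error small and bounded for this
        # system; the 1e-4 tolerance is an assumption for this toy case.
        assert worst < 1e-4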

Many dynamical systems are chaotic, so once you get a difference
that difference can grow.  I've heard of people doing things
like checking if the difference in phase space is described
by a Lyapunov exponent.  If it isn't, then the differences
might be a coding error.
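
A toy version of that idea, using the logistic map at r=4 whose
Lyapunov exponent is known to be ln 2 (everything below is
illustrative, not anyone's production check):

    import math

    def logistic(x, r=4.0):
        return r * x * (1.0 - x)

    def divergence_rate(x0=0.2, eps=1e-6, n_steps=5000):
        """Benettin-style estimate: grow a tiny perturbation, renormalize
        it each step, and average the log growth of the separation."""
        a, b = x0, x0 + eps
        total = 0.0
        for _ in range(n_steps):
            a, b = logistic(a), logistic(b)
            d = abs(b - a)
            total += math.log(d / eps)
            b = a + (eps if b >= a else -eps)   # renormalize separation
        return total / n_steps

    def test_divergence_matches_known_lyapunov_exponent():
        # For the logistic map at r=4 the Lyapunov exponent is ln 2.
        # A measured rate far from that would point at a coding error
        # rather than ordinary chaotic divergence.
        assert abs(divergence_rate() - math.log(2)) < 0.1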

Don't forget all of the other tools for finding errors.  Just
because part of the code uses random numbers doesn't mean the
entire code base is untestable.
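
One practical way to keep the rest testable is to inject the random
number generator, so the deterministic pieces can be exercised on
their own and the plumbing can be checked with a stub; every name
below is made up for illustration:

    import unittest

    def mean_energy(samples):
        """Deterministic aggregation: testable without any randomness."""
        return sum(samples) / len(samples)

    def sample_energies(n, rng):
        """The random part is isolated behind an injected RNG; in real
        runs this would get random.Random(some_seed)."""
        return [rng.gauss(0.0, 1.0) for _ in range(n)]

    class FakeRNG:
        """Stub standing in for random.Random when testing the plumbing."""
        def __init__(self, values):
            self._values = iter(values)
        def gauss(self, mu, sigma):
            return next(self._values)

    class TestDeterministicParts(unittest.TestCase):
        def test_mean_energy(self):
            self.assertAlmostEqual(mean_energy([1.0, 2.0, 3.0]), 2.0)

        def test_sampling_uses_injected_rng(self):
            samples = sample_energies(3, FakeRNG([0.1, 0.2, 0.3]))
            self.assertEqual(samples, [0.1, 0.2, 0.3])

    if __name__ == "__main__":
        unittest.main()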



Greg Wilson gave a presentation recently titled

    "High-Performance Computing Considered Harmful"

The slides are at
   http://www.cs.toronto.edu/~gvwilson/articles/hpc-considered-harmful-2008.pdf

The relevant one for this discussion is:

  o We don’t know how to test floating-point code
      * In a way that a grad student in civil engineering can
        understand and would willingly do
      * [Einarsson 2005] describes the problem several
        times, but doesn’t offer solutions
      * |(actual - expected)/expected| < 10^-6 is superstition,
        not science
   o The real grand challenge in scientific computing
      * At least, the one we ought to solve first

He also mentions a bit of this problem in this podcast interview,
which I found quite interesting:

    http://itc.conversationsnetwork.org/shows/detail3682.html

				Andrew
				dalke at dalkescientific.com
