[TIP] Randomizing test order with nose

Thu Apr 23 08:32:33 PDT 2009

Douglas Philips wrote:
 > On or about 2009 Apr 22, at 6:36 PM, Scott David Daniels indited:
 >> There, the simple expedient of subclassing Random by attaching a 
 >> counter to the root pseudo-random number generator meant that you
 >> could characterize your position in the random number sequence as
 >> (_seed, _calls).
 > ... in previous versions of our attempts to make the PRNG reproducible
 > we used a counter (and manually called random() to catch up). It was
 > very tricky to make sure everyone had pulled in the right version of
 > the instrumented random.
 > Had we planned for that up front we could have forced the bottleneck
 > necessary. Instead we've found it was just much simpler to seed the
 > PRNG to a captured and known value, when needed....

OK, I agree the choke point is a good idea.  However, I ran off
and read the random module, and if you can accept a large state,
you can certainly use .getstate() and .setstate().  The states are
large, so you really want to store them somewhere, rather than retype
them (the advantage of the counter -- it can actually be retyped).
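
A minimal sketch of that save-and-restore round trip, assuming Python 3's
standard random module.  The seed 12345, the variable names, and the use of
pickle for storage are arbitrary choices for illustration:

```python
import pickle
import random

rng = random.Random(12345)      # arbitrary seed, for illustration only
rng.random()                    # advance the generator a little

state = rng.getstate()          # a large tuple; not something to retype
saved = pickle.dumps(state)     # one way to store it (a file would also do)

expected = [rng.random() for _ in range(3)]

rng.setstate(pickle.loads(saved))   # rewind to the captured point
replayed = [rng.random() for _ in range(3)]

assert replayed == expected     # the same values come out again
```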

We might ask for a couple of methods on Random: _ss_ = .get_small_state()
and .set_small_state(_ss_), or even .setstate(_ss_), where the consideration
is not speed of extraction/setting, but compactness of state.  Such a method
might work as the counter does, but would allow the module to do other
things, like a mixed counter system which counts block-recalcs and
advances-within-block.
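
For concreteness, a rough sketch of the counter flavor of this idea in
Python 3.  The class CountingRandom and both method names are hypothetical
(nothing like them exists in the random module), and the sketch assumes
every draw goes through random(); anything that bypasses it, such as a
direct getrandbits() call, is not counted:

```python
import random

class CountingRandom(random.Random):
    """Hypothetical helper: position is summarized as (seed, calls)."""

    def __init__(self, seed_value):
        self._seed = seed_value
        self._calls = 0
        super().__init__(seed_value)

    def random(self):
        # Count every draw that passes through random().
        self._calls += 1
        return super().random()

    def get_small_state(self):
        # Small enough to write down or retype.
        return (self._seed, self._calls)

    def set_small_state(self, small_state):
        seed_value, calls = small_state
        self.seed(seed_value)
        # Catch up by discarding the draws already consumed (uncounted).
        for _ in range(calls):
            super().random()
        self._seed, self._calls = seed_value, calls

rng = CountingRandom(2009)
for _ in range(5):
    rng.random()
small = rng.get_small_state()

expected = [rng.random() for _ in range(3)]

other = CountingRandom(0)
other.set_small_state(small)
replayed = [other.random() for _ in range(3)]
assert replayed == expected
```

Restoring costs time proportional to the count, which is the trade being
made here: compactness of state over speed of setting it.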

I am thinking this use case has not been considered for the random module;
perhaps I should bring it up on python-dev.  If anyone else thinks so,
let me know and I'll bring it up there.  Otherwise I'll figure (as I have
before) that it is just my itch, and I might as well scratch it myself.

 > Test reproducibility is crucial when debugging large systems. I'm not 
 > sure it is of much, if any, use for strict unit-test scope testing.

As long as your unit tests do not randomly sample an input universe, you
are correct.  My former lead and I would argue about this for fun.  Of
course, you and he are both wrong, and I am right :-).  I know, for example,
that Xerox distributes its employee directory under control of a random
number generator, and, by that expedient, manages to keep the data more
effectively up to date.  Similarly, Ethernet cannot work effectively
without randomization.
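
To make the sampling case concrete, here is a small sketch of a unit-scope
test that draws its input from a seeded generator, so a failure can be
reproduced from the seed alone.  The function name, the seed, and the
property being checked are all made up for illustration:

```python
import random

def check_sort_on_random_input(seed, size=50):
    """Hypothetical test helper: samples an input list from a seeded PRNG."""
    rng = random.Random(seed)
    data = [rng.randint(-1000, 1000) for _ in range(size)]
    result = sorted(data)
    # Put the seed in the failure message so the input can be regenerated.
    assert all(a <= b for a, b in zip(result, result[1:])), "seed=%r" % (seed,)
    return data

# The same seed yields the same sampled input on every run.
first = check_sort_on_random_input(20090423)
second = check_sort_on_random_input(20090423)
assert first == second
```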

--Scott David Daniels
Scott.Daniels at Acm.Org