[TIP] Testing a daemon with coverage (nose)

Scott David Daniels Scott.Daniels at Acm.Org
Sat May 9 06:15:39 PDT 2009


Laura Creighton wrote:
>> Marius Gedminas <marius at gedmin.as> writes:
>>
>>     
>>> I've got a 135000-line application (not counting size of 3rd-party
>>> libraries we rely on) with 2753 unit tests. The tests use stubs where
>>> convenient and don't rely on external resources. The unit test suite
>>> takes 8 and a half minutes to run, not counting module imports
>>> (additional 40 seconds) from cold cache on a 1.8 GHz Core 2 Duo CPU.
>>> When everything's in disk cache the import time drops to 6 seconds,
>>> but the test run is still over 8 minutes.
>>>       
>> That's quite a large amount of code to be treating all as a single
>> homogeneous code base. Attempting to treat it all as a big lump called
>> “the application” seems like a design smell that needs addressing.
>>
>> Such a large amount would prompt me to try to find natural
>> coarse-grained sections, to be de-coupled into libraries from the rest of
>> the system and have well-defined interfaces. That would mean that at any
>> time you're developing on a significantly smaller code base, with a
>> correspondingly smaller set of interdependent code and smaller suite of
>> unit tests.
>>
>> Ben Finney
>>     
>
> Remember that this subthread started when you asked why anybody would
> be interested in py.test --looponfailing behaviour. Py.test was made
> in conjunction with py.py to test our python compiler, which was, in
> its very early days, about 10,000 times slower than CPython.  We have
> lots of ways to specify 'run a subset of all the tests', but no
> particularly good way to make our test running faster. After all, that
> is the whole purpose of the project -- to produce a faster running
> python.  We had to start with a painfully slow one.  And there is no
> particular use in testing half of a garbage collector.  We are limited
> in how we can break up what we test.
>
> It is interesting that you want to find out if a new piece of code
> broke an existing test as soon as possible.  I don't.  Because in the
> world that I work in, when I run a new test run, and some old test
> that I wasn't expecting to break, breaks, the most common reason is --
> somebody else made a checkin which broke that test.  So I shouldn't
> waste any time seeing if I am the one that broke that test until I
> am done fixing the failing tests which I am sure are my fault.  The
> cost to me of finding out whether I broke the test (by just reading the
> code and the error) is high relative to me just expecting the failure
> to go away without any action on my part.
>
> Sometimes I would have saved myself some time if I had discovered and
> fixed code that broke an old test before I went and fixed the tests
> that I was expecting to fail, and 'knew' were mine ... but mostly it
> doesn't make a difference if you fix the ones you know are yours first.
>   
It seems to me there is a synthesis to be had here.  If we had ways to:
  (A) Finish a test run gracefully after the current test.
  (B) Re-order tests placing previous failures at the front.
  (C) Signal when a run ordered by (B) passes into the "formerly good" tests.
It would address all of these uses.

(A) is not a "big red button" shutdown.  It is a smooth stop so that we
get a full report issued on the (possibly partial) run.  (B)
could be part of a suite of test-ordering possibilities (random,
source order, failure probability, ...).  (C) could be accomplished
by inserting a pseudo-test into the stream that did the signaling.
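
To make (A)-(C) a bit more concrete, here is a minimal sketch using nothing
but the standard-library unittest (not nose or py.test, and not anything I
have actually shipped; the "tests" directory, the cache file name, and the
helper names are all made up for illustration).  It front-loads the previous
run's failures (B), inserts a pseudo-test that announces when the run crosses
into the formerly-good region (C), and uses a result class that finishes the
current test and still emits a normal report when SIGTERM arrives (A):

import json
import os
import signal
import unittest

FAILURE_CACHE = ".previous_failures.json"   # hypothetical cache file name


def load_previous_failures():
    """Return the set of test ids that failed on the previous run."""
    if os.path.exists(FAILURE_CACHE):
        with open(FAILURE_CACHE) as f:
            return set(json.load(f))
    return set()


def flatten(suite):
    """Yield individual TestCase instances from a (possibly nested) suite."""
    for item in suite:
        if isinstance(item, unittest.TestSuite):
            for test in flatten(item):
                yield test
        else:
            yield item


class BoundaryMarker(unittest.TestCase):
    """(C): a pseudo-test that signals the crossing into formerly-good tests."""
    def test_entering_formerly_good_tests(self):
        print("\n--- previously failing tests done; entering formerly-good tests ---")


def reorder(suite, previous_failures):
    """(B): previously failing tests first, then the marker, then the rest."""
    tests = list(flatten(suite))
    failed_last_time = [t for t in tests if t.id() in previous_failures]
    formerly_good = [t for t in tests if t.id() not in previous_failures]
    marker = BoundaryMarker("test_entering_formerly_good_tests")
    return unittest.TestSuite(failed_last_time + [marker] + formerly_good)


class GracefulResult(unittest.TextTestResult):
    """(A): once a stop is requested, finish the current test and then stop,
    so the (possibly partial) run still gets a normal report."""
    stop_requested = False

    def stopTest(self, test):
        super(GracefulResult, self).stopTest(test)
        if GracefulResult.stop_requested:
            self.stop()      # the runner breaks out before the next test


def request_stop(signum, frame):
    GracefulResult.stop_requested = True


if __name__ == "__main__":
    signal.signal(signal.SIGTERM, request_stop)   # smooth stop, not a big red button

    loader = unittest.TestLoader()
    suite = reorder(loader.discover("tests"), load_previous_failures())

    runner = unittest.TextTestRunner(resultclass=GracefulResult, verbosity=2)
    result = runner.run(suite)

    # Remember this run's failures so the next run can front-load them.
    with open(FAILURE_CACHE, "w") as f:
        json.dump([t.id() for t, _ in result.failures + result.errors], f)

A real framework plugin would hook into the runner's discovery and reporting
machinery rather than wrapping it like this, but the failure cache and the
boundary pseudo-test capture the essential mechanics.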

--Scott David Daniels
Scott.Daniels at Acm.Org
