[TIP] Test isolation // Detection of offending test

Andres Riancho andres.riancho at gmail.com
Thu Dec 6 04:23:42 PST 2012


Rob,

On Wed, Dec 5, 2012 at 9:55 PM, Robert Collins
<robertc at robertcollins.net> wrote:
> On Thu, Dec 6, 2012 at 9:55 AM, Benji York <benji at benjiyork.com> wrote:
>> On Wed, Dec 5, 2012 at 4:50 PM, Andres Riancho <andres.riancho at gmail.com> wrote:
>>> Lists,
>>>
>>>     I've got a project with 500+ tests, and during the last month or
>>> so I started to notice that some of my tests run perfectly if I run
>>> them directly (nosetests --config=nose.cfg
>>> core/data/url/tests/test_xurllib.py) but fail when I run them together
>>> with all the other tests (nosetests --config=nose.cfg core/).
>> [snip]
>>>     I suspect that this is a common issue when testing, how do you
>>> guys solve this?
>>
>> The way I have handled it is to use a test runner that can randomize
>> test order and use that option with a continuous integration system like
>> Buildbot or Jenkins.  It is especially nice if the test runner tells you
>> the seed it used for the random number generator so you can replicate
>> the test order yourself.
>
> Oh hai :).
>
> You can also (for a specific case) bisect, if you have a consistent
> run order: run the last 1/2 of the tests leading up to the failure
> (plus the failing test itself), then the last 1/4, and so on. Whenever
> the failure stops reproducing, the culprit is in the half you just
> dropped, so move back to it and keep halving until a single test is
> left.
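
On the random-order idea: even without a plugin you can get a
reproducible shuffle by choosing and printing the seed yourself. A
minimal sketch (the test ids are placeholders):

    import random
    import sys
    import time

    # Take a seed from the command line to replay an old order,
    # otherwise pick a fresh one and print it for later replay.
    seed = int(sys.argv[1]) if len(sys.argv) > 1 else int(time.time())
    print('Shuffling tests with seed %d' % seed)

    tests = ['core.data.url.tests.test_xurllib',
             'core.tests.test_misc']  # placeholder ids
    random.Random(seed).shuffle(tests)

    # Feed the shuffled ids to the runner, e.g. nosetests <id> <id> ...
    print(' '.join(tests))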

After sending the email to the mailing lists I continued to look into
the issue, and someone on IRC recommended nose-bisect [0], which does
basically what you described, but in an automated manner. I've found
some bugs in the nose plugin though :S

[0] https://github.com/dcramer/nose-bisect
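
In case nose-bisect misbehaves for you too, the manual bisection Rob
describes is only a few lines to script by hand. A rough sketch,
assuming a deterministic run order, a single offending test, and a
run_tests() helper of my own that returns True on a green run:

    import subprocess

    def run_tests(test_ids):
        # Helper (my own, not part of nose): True when nosetests
        # exits cleanly for exactly these test ids.
        return subprocess.call(['nosetests'] + list(test_ids)) == 0

    def find_culprit(prefix, failing_test):
        # prefix: the tests that normally run before failing_test, in
        # their usual order. Invariant: the culprit is in prefix[lo:hi].
        lo, hi = 0, len(prefix)
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if run_tests(prefix[lo:mid] + [failing_test]):
                lo = mid  # first half is innocent, look in the second
            else:
                hi = mid  # failure reproduced, culprit is in the first half
        return prefix[lo]

Each round halves the candidates, so 500 tests need about nine
nosetests runs instead of one per test.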

>>>     for test in all_tests:
>>>         result = run_nosetests(test, test_that_fails)
>>>         if not result:
>>>             print test, 'breaks', test_that_fails
>>>
>>>     Is that a good idea?
>>
>> This is certainly a reasonable way to find the current bad actors, but
>> you will need something to keep the pressure on so new ones aren't
>> created.
>
> Yeah. OpenStack is running through this pain at the moment, as they
> bring in testr-based parallelisation - lots of brain-damaged cruft
> turning up and getting fixed.
>
> Andres - if your tests don't use nose-specific features, you could use
> testtools/subunit to do automation around this - "python -m
> testtools.run discover . --load-list FOO" will load a list of test
> ids from FOO and run them.
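
For reference, the FOO file there is just plain text with one test id
per line, along these lines (ids made up for illustration):

    core.data.url.tests.test_xurllib.TestXUrllib.test_basic
    core.data.url.tests.test_xurllib.TestXUrllib.test_timeout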

I'm using most of nose's features, so this is out of the picture for now.
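
In the meantime the brute-force pairwise loop from my original mail is
easy to drive with plain subprocess - one fresh nosetests process per
candidate, so it is slow, but completely mechanical. A rough sketch
(it assumes nosetests runs the two ids in the order given):

    import subprocess

    def passes_together(test_id, failing_test_id):
        # True when the candidate and the failing test pass in one run.
        return subprocess.call(
            ['nosetests', test_id, failing_test_id]) == 0

    def find_bad_actors(all_tests, failing_test):
        return [test for test in all_tests
                if test != failing_test
                and not passes_together(test, failing_test)]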

> -Rob
>
> --
> Robert Collins <rbtcollins at hp.com>
> Distinguished Technologist
> HP Cloud Services



--
Andrés Riancho
Project Leader at w3af - http://w3af.org/
Web Application Attack and Audit Framework
Twitter: @w3af
GPG: 0x93C344F3


