[TIP] parallel testing plugin for nose ... sprint at PyCon?

Kumar McMillan kumar.mcmillan at gmail.com
Tue Jan 8 13:26:55 PST 2008


Hello test-enthusiasts, QA Engineers, and the like ...
[sorry for the cross-list post]

I mentioned once before in a conversation with Noah Gift on TIP that I
might consider leading a sprint at PyCon to build a parallel testing
plugin for nose.  I still think this would be a fun sprint, but I've
decided that I can't commit to organizing one because I am giving a
talk [1], co-presenting a tutorial [2], and working closely with my
company to do some recruiting at PyCon.  Phew!  I also may not be able
to commit to all 4 days of sprinting due to work.

Is anyone interested in leading this sprint?  If so, I would be your
champion sprinter and would certainly share my ideas for the
implementation :)  As sprint projects are beginning to pop up and
PyCon is poised to give much better exposure to sprints this year, now
is the time:
http://us.pycon.org/2008/sprints/projects/

So why do we need a mechanism to run nosetests in parallel?  Jason
Pellerin and I work at a company where we now have at least 2
functional test suites that take over 3 hours each to run.  This is
because they run Selenium tests in a real browser; they would probably
benefit from optimization, but building a plugin to, say, run those
tests across 3 machines simultaneously would be a HUGE timesaver and
probably a better payoff than low-level optimization.  I have no
doubt that there are other people in similar situations -- that is,
with test suites already running in nose that could benefit from
parallelization.  There seems to be keen interest on the nose list.

Here is the best idea I have so far for an implementation that has a
chance of fitting into a sprint (rough sketches follow the list):

- we need a supervisor, e.g. nosetests --run-as-supervisor
- we need worker daemons that wait for commands, e.g. nosetests
--run-as-worker --bind 192.168.50.1:9001, etc.
- we need a mechanism for the supervisor to analyze a test suite and
split it into chunks for workers to run.
- we need a way for workers to post test results back to the
supervisor and for the supervisor to report on progress

  - my only idea for accomplishing this while still maintaining
setup/teardown calls at all levels (packages, modules, classes) is to
hook into nose's collector so that each test can be assigned an id.
For a test suite of 300 tests and a pool of 3 workers, each worker
would be told to run:

    @worker1: nosetests 1:100
    @worker2: nosetests 101:200
    @worker3: nosetests 201:300

- finally, we need an implementation!  we need tests!  we need
benchmarks!  we need docs!  et cetera
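
To make the first two bullets concrete, here is roughly how the
proposed flags could hang off nose's standard Plugin interface
(options/configure).  None of this exists yet; the class and option
names are just the ones proposed above:

    from nose.plugins import Plugin

    class ParallelTesting(Plugin):
        # hypothetical plugin skeleton -- nothing here is implemented
        name = 'parallel'

        def options(self, parser, env):
            Plugin.options(self, parser, env)
            parser.add_option('--run-as-supervisor', action='store_true',
                              dest='supervisor', default=False,
                              help='split the suite into chunks and '
                                   'farm them out to workers')
            parser.add_option('--run-as-worker', action='store_true',
                              dest='worker', default=False,
                              help='wait for a chunk of test ids and '
                                   'run it')
            parser.add_option('--bind', dest='bind',
                              help='host:port for a worker to listen '
                                   'on, e.g. 192.168.50.1:9001')

        def configure(self, options, conf):
            Plugin.configure(self, options, conf)
            self.supervisor = options.supervisor
            self.worker = options.worker
            if options.bind:
                # split "192.168.50.1:9001" into a (host, port) pair
                host, port = options.bind.split(':')
                self.bind_addr = (host, int(port))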
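
The chunk assignment itself is just arithmetic once the collector has
numbered the tests.  A small sketch (assign_chunks is a made-up name;
ids are 1-based to match the example above):

    def assign_chunks(num_tests, num_workers):
        # deal test ids 1..num_tests out to workers as contiguous
        # (start, stop) ranges, as evenly as possible
        base, extra = divmod(num_tests, num_workers)
        chunks = []
        start = 1
        for i in range(num_workers):
            size = base + (1 if i < extra else 0)
            chunks.append((start, start + size - 1))
            start += size
        return chunks

    # assign_chunks(300, 3) -> [(1, 100), (101, 200), (201, 300)],
    # i.e. exactly the worker1/worker2/worker3 split shown above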
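
For posting results back to the supervisor, the standard library might
already be enough for a first pass -- plain XML-RPC, say.  Another
hypothetical sketch (record_result and friends are made-up names,
using Python 2's xmlrpclib/SimpleXMLRPCServer):

    # worker side: push one test result to the supervisor
    import xmlrpclib

    def post_result(supervisor_url, test_id, outcome, output=''):
        proxy = xmlrpclib.ServerProxy(supervisor_url)
        proxy.record_result(test_id, outcome, output)

    # supervisor side: collect results as workers report them
    from SimpleXMLRPCServer import SimpleXMLRPCServer

    def serve_results(host='', port=9000):
        results = []
        server = SimpleXMLRPCServer((host, port), logRequests=False)
        def record_result(test_id, outcome, output):
            results.append((test_id, outcome, output))
            return True
        server.register_function(record_result)
        server.serve_forever()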

(For a long, rambling trail of notes about the above, plus what to
consider when writing the plugin, see:
http://code.google.com/p/python-nose/issues/detail?id=93&q=parallel )

Now, here's the part where I introduce a CRAZY idea.  Nose's humble
Godfather, py.test, seems to be hard at work on distributed testing
already -- https://codespeak.net/py/dist/test.html#automated-distributed-testing
.  I also noticed in the py.test talk description that they plan on
presenting details on using this feature.  If we have the same goals,
it might just be possible for us nosers to help out with the py.test
implementation by making the nose plugin hook into its components.
But sometimes code is better rewritten than shared, so this is just a
thought.  At the end of the day, many people have large codebases that
depend on nose and all the plugins it supports, so I think porting
those codebases to py.test would not be a realistic way to accomplish
testing in parallel.

[1] Unicode in Python, Demystified
[2] Secrets of the Framework Creators

- Kumar


