[TIP] massive parallel testing in python

Paul Dubois pfdubois at gmail.com
Fri Dec 7 11:47:44 PST 2007


Just to be clear, ATS is not a test harness that does things like
setup/teardown. It is a test suite manager: it dispatches tests onto available
resources, using some intelligence about getting the 'bottleneck' tests done
first, filling up any parallel processor nodes, etc. It simply runs the job
you tell it to and gets back a report of whether or not it worked. It also
knows how to start parallel jobs (if this is a machine it knows about) or
distributed jobs.
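
To make the 'bottleneck first' idea concrete, here is a minimal sketch of
greedy longest-first dispatch. This is not ATS code; the function name, the
duration estimates, and the worker model are all assumptions made up for
illustration.

```python
import heapq


def schedule_longest_first(durations, n_workers):
    """Hypothetical sketch: hand out tests, longest first, to whichever
    worker becomes free soonest. `durations` maps test name -> estimated
    run time. Returns (plan, makespan)."""
    # Heap of (time_when_free, worker_id); the soonest-free worker is on top.
    workers = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(workers)
    plan = {w: [] for w in range(n_workers)}

    # Longest tests first, so the bottleneck starts as early as possible.
    # Ties are broken by name to keep the result deterministic.
    ordered = sorted(durations.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, duration in ordered:
        free_at, worker = heapq.heappop(workers)
        plan[worker].append(name)
        heapq.heappush(workers, (free_at + duration, worker))

    makespan = max(free_at for free_at, _ in workers)
    return plan, makespan


if __name__ == "__main__":
    estimates = {"big": 10, "a": 3, "b": 3, "c": 4}
    plan, makespan = schedule_longest_first(estimates, n_workers=2)
    print(plan, makespan)
```

With the long test started first, the short ones fill the other worker while
it runs, which is the effect described above.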

We used it extensively at LLNL so I'm pretty sure the quality is good; what
is not good is the packaging, which makes it hard to add new machines cleanly. Also,
it knows how to run the LLNL batch system for tests that need more resources
(such as time) than it has -- but it won't know how to run YOUR batch system
(do you? (:->))

Creating this tool involved pain, from things you never think of offhand. For
example, if one test depends on another, we knew to run the parent before
the child. Most of our file space was shared. But then we discovered some
people creating files on private, not shared, file systems and expecting the
child to pick up those files as input. Oops. Back we went, to make sure
dependent jobs were run in the same place as their parent. We also adopted
some defensive measures to keep multiple variants of the same test from
stomping on the same files at the same time. Yeah, we could blame the test, but
we thought we ought to help out.
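
A rough sketch of those two measures, to show the shape of the problem. Again
this is not how ATS does it; the function names, the round-robin placement,
and the use of temporary directories are assumptions for illustration only.

```python
import tempfile


def assign_nodes(tests, parents, nodes):
    """Hypothetical sketch: place each test on a node, pinning a dependent
    test to the node its parent ran on so it can see the parent's files.

    tests   -- test names, ordered so that parents come before children
    parents -- dict mapping child name -> parent name
    nodes   -- list of node names
    """
    placement = {}
    next_node = 0
    for name in tests:
        parent = parents.get(name)
        if parent is not None:
            # Raises KeyError if the parent was not placed first.
            placement[name] = placement[parent]
        else:
            # Independent tests are simply spread round-robin.
            placement[name] = nodes[next_node % len(nodes)]
            next_node += 1
    return placement


def private_workdir(test_name, variant, root=None):
    """Create a fresh directory for one variant of a test, so that two
    variants running at the same time do not write to the same files."""
    return tempfile.mkdtemp(prefix=f"{test_name}-{variant}-", dir=root)


if __name__ == "__main__":
    print(assign_nodes(["build", "run", "other"], {"run": "build"}, ["n1", "n2"]))
    print(private_workdir("run", "opt"))
```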

So, if you do decide to work on it, make sure you don't take things out
just because you don't know why they are there.

Also, write to me and I'll tell you who to contact about getting any later
sources; I retired six months ago and there is always the possibility that
they have fixed something since. They might be able to participate in a
sprint, but I'm unsure about that. I am definitely not available.

On Dec 7, 2007 11:05 AM, Kumar McMillan <kumar.mcmillan at gmail.com> wrote:

> On Dec 5, 2007 10:44 AM, Noah Gift <noah.gift at gmail.com> wrote:
> > >
> > >
> > > The good news is it definitely shortens the execution wall time,
> > > depending
> > > on how many resources you let it have.
> >
> > Paul,
> >
> > ATS looks like very interesting stuff.  I will take a look at it this
> > week.  Thanks to everyone else, for the great suggestions.  On a
> > related note, I would love to be involved in a testing related Sprint
> > at PyCon.  I think they, the organizers, are still looking for Sprint
> > leaders, so if someone is interested in doing more work on parallel
> > testing in Python, I will show up.
>
> this is a great idea for a sprint.  I've never led a sprint but I am
> interested in doing so.  For it to work we would need a good backlog
> of task tickets relating to the problem.  Off the top of my head, some
> tasks:
>
>  - benchmark worker-supervisor communication process (that which sends
> test results over the wire)
>  - benchmark speed of test runs in parallel, then benchmark speed of
> same tests run in series, for comparison.  Do this for local and
> remote workers.
>  - add plenty of unit tests for all the various components involved
> (this task needs to be more specific)
>  - add plenty of functional tests for running tests in parallel on the
> same machine, on remote machines, etc
>  - add tests for many combinations of setup/teardown that could be
> supported (at the package level, at the module level, etc)
>  - perhaps implement XML output in nose core to address
> worker-supervisor communication (see
> http://code.google.com/p/python-nose/issues/detail?id=140 )
>  - create capistrano (http://www.capify.org/) recipes that install
> nose, plugins, and test suite to run on remote "worker" servers (or
> implement in python, but this would be harder)
>
> ... most of these tasks assume we will have a proof of concept
> implementation to start with.
>
> -Kumar
>
> > I suppose, I like the idea of
> > having the ability to scale an arbitrary number of virtual machines as
> > test nodes, and having the ability to control them to netboot a
> > "clean" test operating system, build a database, do test, and then
> > rebuild themselves.  That would be one heck of a Sprint.  What is
> > great, is that if you have a firewire, or USB 2 drive, you can
> > simulate something like this on your laptop, quite easily, and then
> > roll it out into production.
> >
> >
> >
> > Noah
> >
>