[bip] Reproducible research

James Casbon casbon at gmail.com
Tue Mar 3 12:03:03 PST 2009


Hi Bip,

I've been thinking about reproducible computational research in
biology recently and I thought I'd drop it your way.  There seem to be
several components of this, some already recognised and some not.

Database and software tools are already known to be badly maintained:
http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000136
But that problem is very difficult.  What I am more interested in here
is the day-to-day work of making an analysis run again and again, and
applying it to other data.

Makefiles are the obvious way of doing this, and there has been some
work around this:
http://skam.sourceforge.net/
http://biowiki.org/MakefileManifesto
AFAIK python + make = scons
http://www.scons.org/
And these guys are doing interesting stuff with scons:
http://reproducibility.org/ (but their tools are a bit domain specific
for what I want)
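
To make the make-style idea concrete, here is a minimal sketch in plain
Python of the rule these tools share: rebuild a target only when it is
missing or older than one of its inputs.  The names needs_rebuild, build
and uppercase_copy are made up for illustration and are not taken from
make, scons or any of the projects above.

```python
import os


def needs_rebuild(target, sources):
    """Return True if target is missing or older than any source file."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in sources)


def build(target, sources, action):
    """Run action(target, sources) only when target is out of date.

    Returns True if the action ran, False if the target was up to date.
    """
    if needs_rebuild(target, sources):
        action(target, sources)
        return True
    return False


def uppercase_copy(target, sources):
    """Hypothetical analysis step: concatenate the sources in upper case."""
    with open(target, "w") as out:
        for src in sources:
            with open(src) as handle:
                out.write(handle.read().upper())
```

Calling build("out.txt", ["in.txt"], uppercase_copy) twice in a row should
only do the work the first time, unless in.txt changes in between.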

Then there are the workflow engines, of which taverna seems the most
enterprisey (grid!, Web services!):
http://taverna.sourceforge.net/
Galaxy's workflows have been coming along a bit as well:
http://galaxy.psu.edu/
But you can't run them from the command line (and looking at the code,
the controller and the view are so coupled you won't be able to).  And
you can't parametrize them.

So how is BIP doing this?  I really want something simple, that can be
used at a command line or the web, and preferably in python.
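
As a rough sketch of the kind of thing I mean (not how BIP or any of the
tools above do it): keep each analysis step as an ordinary function with
explicit parameters, and put a thin argparse wrapper on top.  A web
front end could then call the same function directly.  The step
count_long_lines and the --min-length option are invented for the
example.

```python
import argparse


def count_long_lines(lines, min_length=10):
    """Hypothetical analysis step: count lines of at least min_length chars."""
    return sum(1 for line in lines if len(line.rstrip("\n")) >= min_length)


def main(argv=None):
    """Command-line wrapper; argv=None means argparse reads sys.argv."""
    parser = argparse.ArgumentParser(description="Toy parametrized analysis")
    parser.add_argument("input", help="path to a text file")
    parser.add_argument("--min-length", type=int, default=10,
                        help="minimum line length to count")
    args = parser.parse_args(argv)
    with open(args.input) as handle:
        result = count_long_lines(handle, args.min_length)
    print(result)
    return result
```

From a script you would call main() under the usual __name__ guard; from
other code, call count_long_lines directly with whatever parameters you
like.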

James


