[bip] Announcing "Ruffus": Easy Computational Pipelines in Bioinformatics

leo.goodstadt at dpag.ox.ac.uk
Mon Jun 1 08:00:10 PDT 2009


======================================================
Ruffus is a lightweight python module supporting Bioinformatics pipelines.

http://ruffus.googlecode.com/svn/trunk/doc/html/index.html
======================================================

Many of us would like to run our own analyses reproducibly but are put off
using tools like scons and make by their indigestible syntax.

Ruffus helps you to manage the stages or tasks in your analyses. However,
because it is designed from scratch for bioinformatics pipelines (rather
than for compiling programmes), no ugly syntax / hacking is required.

Ruffus lets you do this in Python in the least intrusive manner possible,
without any special (non-Python) syntax or magic.


Automatic support is provided for

   * Managing dependencies
   * Running tasks in parallel
   * Re-starting from arbitrary points, especially after errors
   * Display of the pipeline as a flowchart
   * Reporting
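
To make the first and third items above concrete, here is a minimal sketch in
plain Python of what dependency management and re-starting can mean in
practice. It does not use or reproduce the Ruffus API: the names task,
needs_update and run, and the example file names, are hypothetical and exist
only for this illustration. It assumes that each stage reads at most one file
and writes one file, and that a stage is out of date when its output is
missing or older than its input.

```python
import os
import tempfile

# Registry of stages: name -> (function, input path, output path, upstream names).
# Hypothetical helper for illustration only; this is not the Ruffus API.
_TASKS = {}


def task(infile, outfile, follows=()):
    """Register a function as a pipeline stage that turns infile into outfile."""
    def register(func):
        upstream = tuple(f.__name__ for f in follows)
        _TASKS[func.__name__] = (func, infile, outfile, upstream)
        return func
    return register


def needs_update(infile, outfile):
    """A stage is stale if its output is missing or older than its input."""
    if not os.path.exists(outfile):
        return True
    if infile is None:
        return False
    return os.path.getmtime(outfile) < os.path.getmtime(infile)


def run(target, done=None):
    """Run upstream stages first, then target; return the names that actually ran."""
    done = [] if done is None else done
    func, infile, outfile, upstream = _TASKS[target.__name__]
    for name in upstream:
        run(_TASKS[name][0], done)
    if needs_update(infile, outfile):
        func(infile, outfile)
        done.append(target.__name__)
    return done


# Example two-stage pipeline working in a scratch directory.
workdir = tempfile.mkdtemp()
raw = os.path.join(workdir, "reads.txt")
counts = os.path.join(workdir, "reads.count")


@task(None, raw)
def make_reads(infile, outfile):
    # First stage: no input file, just creates some toy data.
    with open(outfile, "w") as handle:
        handle.write("ACGT\nTTGA\n")


@task(raw, counts, follows=[make_reads])
def count_reads(infile, outfile):
    # Second stage: counts the lines produced by the first stage.
    with open(infile) as src, open(outfile, "w") as dst:
        dst.write(str(sum(1 for _ in src)))


first = run(count_reads)   # both stages run on a fresh directory
second = run(count_reads)  # outputs are up to date, so nothing runs
```

If the final output is deleted (for example after a failed run), calling
run(count_reads) again repeats only the last stage, which is the sense of
"re-starting from arbitrary points" meant here. A real tool also has to handle
multiple inputs and outputs, parallel execution and error reporting, none of
which this sketch attempts.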



We have tried to keep the feature set focused so that it is maximally useful
but doesn't have too steep a learning curve.

So far it seems to do the simple things relatively simply while still
supporting the sort of pipelines which cause make / scons to go all
cross-eyed and recursive
(e.g. multiply forking and rejoining dependencies, an indeterminate number of
intermediate files, using sub-directories based on regular expression
matches, etc.).


It is an unambitiously lightweight library which tries to do one small thing
well.


We have been using Ruffus in-house but we would like to see what other
scientists think of the design before we bring it to the attention of a wider
audience. (We intend to publish this as an application note in a
bioinformatics journal.)

Please help us by sending feedback to:
   ruffus_lib at llew <dot> org  <dot> uk


Extensive documentation, download details and two tutorials are available
from http://ruffus.googlecode.com/svn/trunk/doc/html/index.html.


Leo Goodstadt
MRC Functional Genomics Unit
University of Oxford



P.S. This can be seen as a very belated response to the discussion in April
on reproducible research:
http://lists.idyll.org/pipermail/biology-in-python/2009-March/000422.html






