[bip] Future of bioinformatics in python..?

Peter Clarke resurgo at gmail.com
Wed Aug 8 08:26:44 PDT 2007


I agree completely that large monolithic projects don't work well.
What I was suggesting was quite the opposite, something similar to Sage
(www.sagemath.org), which acts as a graphical/web front end, but with
a lot of smaller packages behind it.

See http://www.sagemath.org/packages.html for the packages that Sage
provides - even Biopython is in the optional section.

I was thinking more along the lines of a front end that supports
graphical sequence/alignment manipulation (bx/pygr backend),
phylogenies (rpy/bioconductor/phylip/paup backend), network analysis
(networkx/rpy etc.), microarray data analysis (rpy/bioconductor etc.),
and biochemical network modelling and systems biology integration
(sloppycell). It would include biopython, scipy, numpy etc., plus
ipython and ipython1 (for clusters), all provided as a single
downloadable working system: a single graphical interface with the
ability to call on powerful external packages.

Modules for conversion between the different data formats could be
written to integrate things. With further iterations common data
standards could emerge out of the project that newer versions of the
component packages could agree to support.
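
As a rough illustration of what such conversion modules might look
like, here is a minimal sketch of a converter registry. Everything in
it is hypothetical: the names (CONVERTERS, register, convert), the
"tab" format (one name<TAB>sequence line per record) and the
simplified FASTA handling are assumptions made for the example, not
the API of any of the packages mentioned above.

```python
"""Sketch of a format-converter registry. All names are hypothetical."""

# Maps (source_format, target_format) -> converter function.
CONVERTERS = {}


def register(source, target):
    """Decorator that records a converter for one (source, target) pair."""
    def decorator(func):
        CONVERTERS[(source, target)] = func
        return func
    return decorator


def parse_fasta(text):
    """Parse FASTA text into a list of (name, sequence) tuples."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


@register("fasta", "tab")
def fasta_to_tab(text):
    """Convert FASTA text to one 'name<TAB>sequence' line per record."""
    return "\n".join(f"{name}\t{seq}" for name, seq in parse_fasta(text))


@register("tab", "fasta")
def tab_to_fasta(text):
    """Convert 'name<TAB>sequence' lines back to FASTA text."""
    lines = []
    for row in text.splitlines():
        if not row.strip():
            continue
        name, seq = row.split("\t", 1)
        lines.append(f">{name}\n{seq}")
    return "\n".join(lines)


def convert(text, source, target):
    """Look up and apply the converter registered for (source, target)."""
    try:
        converter = CONVERTERS[(source, target)]
    except KeyError:
        raise ValueError(f"no converter from {source!r} to {target!r}")
    return converter(text)
```

With something like this, the front end would only need to call
convert(data, "fasta", "tab"), and each component package could
register its own formats without the others having to know about them.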

I think enough of the people behind Python in biology are here to
drive this. Given the will, and people being able to work together,
it could lead to Python becoming the dominant programming language for
biology.

Sage works well on the combination of web collaboration and sprints,
and there are, by now, enough people who understand enough of both
biology and programming to develop the APIs, data standards etc.

Biology is progressing at a phenomenal rate, which is all the more
reason to have a platform where people can quickly implement new
methods using powerful mathematical/bioinformatic components and be
able to deliver these tools to people who are more comfortable with a
mouse than with the command line. There will always be sequences,
networks, array-type data and the rest.

The problems of scalability are being addressed by the SciPy people.
IPython1 is a useful tool for running work on large clusters, and these
tools are only going to become more powerful.

-Peter


