[bip] Future of bioinformatics in python..?

James Taylor james at jamestaylor.org
Fri Aug 3 08:13:18 PDT 2007


I think a big monolithic project like biopython is inherently
difficult to maintain. All of the bio* projects have this problem:
they grow to the point where they can't get organized enough to cut
reasonably regular releases, and so they stagnate.

Rather than assembling a single "project", I think it is better to
have a community of small projects. Even my own package ("bx-python")
is too big, I think. Small projects are much easier to maintain, and
easier for users to adopt.

And fortunately for us, Python now has the perfect infrastructure for  
allowing projects to depend on lots of other small packages.  
Setuptools / pkg_resources make installing, updating, and using  
packages so much easier.
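
As a rough illustration of the "lots of small packages" idea, here is a
minimal sketch of a script checking which of its small dependencies are
installed. It is an assumption-laden stand-in, not the pkg_resources API
itself: it uses the standard-library importlib.metadata module, which
appeared later and covers similar lookups, and the helper names and the
second package name are hypothetical.

```python
# Sketch only: importlib.metadata is a later standard-library stand-in
# for pkg_resources-style lookups; helper names are hypothetical.
from importlib.metadata import PackageNotFoundError, version


def installed_version(name):
    """Return the installed version string for `name`, or None if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def missing_packages(names):
    """Return the subset of `names` that is not installed, in order."""
    return [name for name in names if installed_version(name) is None]


if __name__ == "__main__":
    # Hypothetical list of small packages a script might depend on.
    wanted = ["bx-python", "hypothetical-bio-intervals"]
    for name in wanted:
        print(name, installed_version(name) or "not installed")
```

A project built from many small packages could run a check like this at
startup and tell the user exactly which pieces still need installing.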

On Aug 3, 2007, at 11:02 AM, Andrew Dalke wrote:

> On Aug 2, 2007, at 12:24 PM, Peter Clarke wrote:
>> I think that the python bioinformatics community needs to come
>> together to form a project that can deliver something directly to the
>> end users (that caters for everyone from computationally naive
>> biologists to experienced programmers).
>
> That's what Biopython was supposed to do.  Well, it depends
> on what you call "everyone."
>
>> I think this would only really work (and could work really well) if we
>> could make it a completely collaborative effort with everyone involved
>> in Python bioinformatics.
>
> Why didn't Biopython become that project?
>
> Doing what you outline is very hard.  Designing reusable systems
> is (roughly) three times harder than designing single use systems.
> Who has the resources to devote to that?
>
> Developing for a diverse set of skills is also hard.  One technique
> in HCI is persona development, and in business there's the related
> idea of market segmentation.  Different people have different needs,
> and it's very hard to be everything to everyone.  (Neat example
> of that in the video of Malcolm Gladwell at
>    http://www.ted.com/index.php/talks/view/id/20 )
>
> One limitation in Biopython is that no one really has the
> ability to say "please change your code to make it fit in
> better with the rest of Biopython."  Partially because no one
> wants that job enough, and partially because there's no existing
> sense of unity, and partially because it would/may/might reduce
> yet further the number of people willing to contribute code.
>
> Another is that there weren't enough people involved in Biopython,
> and I don't think any were working on the same research project.
> I previously compared Biopython to Bioperl on this list; I think
> Bioperl became much better because of the EBI/Sanger efforts
> in the 1990s using bioperl.  There were many people, in the
> same spot, using bioperl for different but related work.
>
> I think that needs to happen for this sort of infrastructure
> Python project.  Or at least a large number of sprints where
> everyone got together at the same physical location.
>
>
> On the plus side, this field is small enough that if there
> were a dozen or so good programmers who worked together on
> a project then it could do amazing things.
>
> I just don't know what that project is.  Of the ones I
> thought up, I couldn't get sales numbers to work out right.
> And I'm biased enough that I want my end-users to be the
> ones paying me directly, and not indirectly through grants.
>
>
>
> 				Andrew
> 				dalke at dalkescientific.com