[bip] agile software development

Andrew Dalke dalke at dalkescientific.com
Mon Jul 30 18:50:09 PDT 2007


On Jul 30, 2007, at 5:18 PM, James Taylor wrote:
> I think this list is a great idea, and I'm interested in discussing
> how to share "infrastructure" in the biology-in-python community.
> Particularly, the merits of and problems with monolithic packages
> like biopython, and alternative models (e.g. how can we be more agile
> while still sharing common interfaces where sensible). But that is a
> discussion for another day...

It's another day now. :)

James mentioned agile development.  For details see the Wikipedia
page at http://en.wikipedia.org/wiki/Agile_software_development

Can most biology-oriented software development ever be called
agile?  I don't think so.

A big thing in agile is:
   * Customer satisfaction by rapid, continuous delivery of
       useful software

Who is the customer for most biology software?  In many cases,
the customer is the programmer.  This is great, so long as
that remains true, which won't be the case for infrastructure
projects.

But then what happens if/when the software is released?  There's
a new type of customer, who had no influence on the project.
I've heard arguments that "I'm designing it for myself, and
I'm a biologist."  My response is "but if the people using it
were like you they would write the code themselves."

Indeed, who is "the customer" of an open source biology program?
The user?  (And which kind of user?)  The PI?  The funding
agency?  A user with a problem the developers find interesting?
Agile assumes that the user is the customer, who is also
the person paying the money, but is that often the case for
software in this field?


Another aspect of agile is:
   * Continuous attention to technical excellence and good design

Most people in this field are trained as scientists, and
rarely as programmers.  How do you learn what is excellent?
How do you learn good design?  How do you justify spending
the 2x or 3x more time needed to make a reusable application,
compared to a single purpose application?

And how do you do all of this when your primary job (for
grad students and research scientists) is doing science, not
software?

One approach from extreme programming is to engage in
pair programming.  But I've talked to a lot of people doing
Python who do development by themselves.  There's no one
to pair with.  Even when there are several people in the
world interested in a project, the communication is often
only through the net, which is a low-bandwidth environment
for this sort of learning.

Bioperl worked out well, I think, in large part because it
was being used at EBI/Sanger. There were many people working
together on the same project in the same geographic location,
and with the goal of supporting other people.

But that's a rare case.



				Andrew
				dalke at dalkescientific.com




