[bip] Bioinformatics software design

Titus Brown titus at caltech.edu
Tue Feb 19 21:51:27 PST 2008


Hi, Nathan,

you've already got a bunch of good responses, but I thought I'd chime
in.

On Sun, Feb 17, 2008 at 10:32:52PM +0000, Nathan Harmston wrote:
-> Hi,
-> For one of my projects I/a group of us will be developing some Systems
-> Biology modelling software by integrating Python, C/C++ using Ctypes and
-> maybe doing some R integration. And although it hasn't started I was
-> thinking about the design of the "API" and the methodology to be used as
-> well. So I thought I'd ask a few questions:
-> 
-> 1. What would you consider to be a good "API"?
->           well-documented, intuitive, best guess works, pythonic, do people
-> have any examples of what they consider a good API/project?

I like twill (of course ;) and Quixote, but neither may be particularly
appropriate for your needs.  twill exposes a procedural interface as
well as an OO interface, making it easy to get started and then progress
to a more complicated set of interactions.  Quixote is small, compact,
and simple; you can grasp all of Quixote in an afternoon.

matplotlib might be worth taking a look at, actually; it also has a
simple (but inflexible) procedural interface coupled to a more complex
and flexible OO interface.
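
To make the "procedural on top of OO" layering concrete, here is a rough
sketch.  It is not twill's or matplotlib's actual code; the Model class
and the step() function are hypothetical names invented for
illustration.  The module-level function just delegates to a shared
default object, so beginners can call step() directly and later move to
creating their own Model instances.

```python
class Model:
    """OO interface: explicit state, any number of independent instances.

    Hypothetical example class, not part of any real package.
    """

    def __init__(self, rate=1.0):
        self.rate = rate
        self.steps_run = 0

    def step(self, n=1):
        """Advance the model by n steps and return the running total."""
        self.steps_run += n
        return self.steps_run


# Procedural interface: one hidden default instance plus thin wrappers.
_default_model = Model()


def step(n=1):
    """Advance the shared default model; same behavior as Model.step."""
    return _default_model.step(n)
```

The trade-off is the one described above: the wrapper is easy to start
with but hides global state, while the class costs a line of setup and
lets you run several models side by side.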

My biggest piece of advice would be to NOT FIX YOUR API until you've
gone through a bunch of actual use cases and external user interactions.
The earlier you set something in stone, the more you're going to have to
sweat when you discover that you were wrong and need to change it...

-> 2. Testing?
->           although I believe testing is very important I have not really
-> gone for a hardcore TDD approach before and am thinking I should do it on
-> this project. What frameworks do people suggest are useful, and how would
-> you test a function whose output was random/stochastic modelling, since it
-> is obviously random?

nose or py.test over unittest, IMO.

I don't know how to test stochastic packages for correctness.  But I would
probably start by creating a bunch of regression tests based on fixed
random number generator seeds, if only so that you know when your code
changes result in behavior changes.
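
A minimal sketch of that seeded-regression idea, using only the standard
library.  The toy random walk, the function names, and the JSON baseline
file are all assumptions made for illustration; substitute your own
model and storage.  The first run records the result for a fixed seed,
and later runs report whether the same seed still gives the same answer.

```python
import json
import os
import random


def simulate_walk(n_steps, rng):
    """Toy stochastic model: 1-D random walk, returns the final position."""
    position = 0
    for _ in range(n_steps):
        position += rng.choice((-1, 1))
    return position


def check_regression(baseline_path, seed=42, n_steps=1000):
    """Compare a seeded run against a stored baseline.

    If no baseline file exists yet, create it from the current code and
    return True.  Otherwise return True only if the seeded result matches
    the stored one.
    """
    # A private Random instance keeps the test independent of global state.
    result = simulate_walk(n_steps, random.Random(seed))
    if not os.path.exists(baseline_path):
        with open(baseline_path, "w") as fh:
            json.dump({"seed": seed, "n_steps": n_steps, "result": result}, fh)
        return True
    with open(baseline_path) as fh:
        baseline = json.load(fh)
    return baseline["result"] == result
```

Note that this only detects *changes* in behavior, not incorrect
behavior, and a baseline may need regenerating if the underlying random
number generator implementation changes between library versions.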

-> 3. Would anyone like to suggest any problems they've found in developing
-> software for the Bioinformatics/Systems Biology user? I don't like pretty
-> interfaces and prefer to keep it simple and powerful and unfortunately
-> biologists like pretty things.

Statistically speaking, it is unlikely that anyone other than you or
your collaborators will ever use your software.  I think you can
increase your chances of producing something *used* by providing decent
"getting started" docs, simple install instructions, and a bunch of
good, maintained example scripts/projects.

Pretty interfaces may be a necessary evil...

cheers,
--titus


