[bip] Blog post on bioinformatics and Python

Ryan Raaum ryan.raaum at gmail.com
Wed Sep 17 13:49:51 PDT 2008


> Are you arguing for requiring everything that has a C-implementation
> to also have a (slower) Python implementation? I don't think I'd agree
> with that having to be a mandate.

I think there are three categories of modules here:
1. Things that need to be written in C.
2. Things that don't need to be written in C, but are.
3. Things that could work well either way, depending on your problem.

First, there are things that NEED to be in C to be worth using, so by
all means, write a C extension.

Second, there are also modules that do not need to be in C, but are.
Unfortunately, I don't have a great example at my fingertips. There
was a series of blog posts on motif matching in Python, along with
similar examples, showing that for at least some problems (1) a poorly
written Python implementation sucks, (2) C is fast, but (3) a properly
written Python implementation can be almost as fast as C, or just as
fast. This is clearly not true for all problems, but I think, since
we're trying to work in Python in the first place for all the
advantages that a higher-level language has, that there should be a
STRONG push against C extensions. There are probably many Python C
extension modules that don't really need to be in C, but the person
who wrote them has it stuck in their mind that Python can't be fast,
and since their quick test implementation in Python was slow, they
decided it had to go to C.
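
To make the "poorly written vs. properly written" point a bit more
concrete, here is a minimal sketch. It is not taken from the blog
posts mentioned above; the function names and the exact-match motif
search are made up for illustration. Both functions return the same
hits, but the second hands the inner scan to the built-in str.find,
which runs in C inside the interpreter, so it is usually much faster
than the hand-written loop. How close it gets to a dedicated C
extension depends on the problem.

```python
def find_motif_naive(seq, motif):
    """Character-by-character scan using nested Python loops."""
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        match = True
        for j in range(len(motif)):
            if seq[i + j] != motif[j]:
                match = False
                break
        if match:
            hits.append(i)
    return hits


def find_motif_builtin(seq, motif):
    """Same result, but each scan is done by the built-in str.find."""
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i)
        # Advance by one position so overlapping hits are kept.
        i = seq.find(motif, i + 1)
    return hits
```

For example, both find_motif_naive("ATATAT", "ATA") and
find_motif_builtin("ATATAT", "ATA") return [0, 2].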

Third, and finally, there are some problems where the Python
implementation may never be super fast, but is not so slow that it
causes problems when you're not doing a lot of it. So, if there is
some task that is a little slow in the pure Python version, I'd rather
just use that if I need to do that task 5 or 10 times. If I need to do
the task 50,000 times, then I need a faster version, which may need to
be written in C.
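
One rough way to decide which side of that line a task falls on is to
time a single call and multiply by the number of calls you expect to
make. The sketch below assumes the standard-library timeit module;
slow_task and the helper names are hypothetical stand-ins, not part of
any existing package.

```python
import timeit


def seconds_per_call(func, repeat=5):
    """Best-of-`repeat` wall-clock time for a single call to func()."""
    return min(timeit.repeat(func, number=1, repeat=repeat))


def projected_seconds(per_call, n_calls):
    """Total time if the task is run n_calls times back to back."""
    return per_call * n_calls


def slow_task():
    # Hypothetical stand-in for a "little slow" pure Python task.
    return sum(i * i for i in range(100_000))


if __name__ == "__main__":
    per_call = seconds_per_call(slow_task)
    for n in (10, 50_000):
        total = projected_seconds(per_call, n)
        print(f"{n:>6} calls: about {total:.1f} s")
```

If the projection for 10 calls is a fraction of a second, the pure
Python version is fine; if the projection for 50,000 calls runs to
minutes or hours, that is the point where a faster version starts to
pay for itself.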

-Ryan


