[bip] Bioinformatics Programming Language Shootout, Python performance poopoo'd

Cory Tobin ctobin at caltech.edu
Tue Feb 5 14:36:33 PST 2008


> i just had a brief look, are the authors python users?

It doesn't look like it.  The first author Mathieu Fourment has a
LinkedIn page ( http://www.linkedin.com/in/froggy ).  It says he knows
Java, C/C++ and Perl.  The other author is a professor.  I doubt he
wrote any code.

The methodology of this paper was a complete disgrace and lacked any
scientific objectivity.  If they actually wanted to be somewhat
objective, they should have found people who are adept in each of those
languages and told them to write the fastest code they could.

A lot of the amateur mistakes they make
(such as using  seq += line  instead of  lines.append(line)  followed by  "".join(lines) )
are due to the fact that they are not career Python jockeys.  But I
found many mistakes that are not Python-specific and are just plain
bad programming.  One example: on line 54 of parse.py he compiles a
regular expression inside of a loop.  Placing the re.compile() before
the loop could save them plenty of CPU cycles.
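
To make those two points concrete, here is a rough sketch of what I mean.
It is not their parse.py: the FASTA-style input, the header pattern, and
the names HEADER_RE and read_sequences are all made up for illustration.

```python
import re

# Hypothetical pattern: the regex actually used in parse.py is not
# reproduced here. The point is that it is compiled once, before the loop.
HEADER_RE = re.compile(r"^>(\S+)")


def read_sequences(lines):
    """Return {name: sequence} from an iterable of FASTA-style lines."""
    records = {}
    name = None
    chunks = []
    for line in lines:
        line = line.rstrip("\n")
        match = HEADER_RE.match(line)
        if match:
            # A new header closes the previous record, if there was one.
            if name is not None:
                records[name] = "".join(chunks)
            name = match.group(1)
            chunks = []
        elif name is not None:
            # Collect pieces in a list instead of doing seq += line.
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records
```

Calling read_sequences(open(path)) on a file, or on any list of lines,
joins each sequence once per record instead of rebuilding the string on
every line.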

I got a good chuckle out of line 45 of parse.py: they use the deprecated
string module to remove the newline character rather than the strip()
string method.
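
For comparison, a minimal sketch of the string-method version.  The
helper name chomp is made up, and I am assuming the goal is only to drop
the trailing newline; the module-level functions in the string module
were deprecated in Python 2 and are gone in Python 3.

```python
def chomp(line):
    """Remove trailing newline characters using the str method."""
    # rstrip("\n") touches only newlines at the end of the line;
    # plain strip() would also remove leading and trailing spaces and tabs.
    return line.rstrip("\n")
```

Use line.strip() instead if stripping all surrounding whitespace is
what you actually want.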

Cory

-- 
Cory Tobin
ctobin at caltech.edu


