[bip] Bioinformatics Programming Language Shootout, Python performance poopoo'd

Tue Feb 5 22:24:44 PST 2008

This is my last posting on details.  :)

Annoyances:

> Python and Perl are often called script languages and when  
> executed, are compiled in an intermediate representation without  
> creating an intermediate file (syntax tree in Perl and byte code in  
> Python) and then interpreted.

> Java and C# are semi-compiled languages using automatic memory  
> management. A Java program is compiled in an intermediate-level  
> code or bytecode then it is run by either an interpreter or  
> compiler at runtime, in this case, the Java Virtual Machine (JVM).

So the difference between a script language and a semi-compiled  
language is that the byte code for the main program isn't saved to an  
intermediate file.  Not a useful distinction.
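To make the point concrete, here is a minimal sketch (written for a current Python 3 interpreter, not the 2008 one) showing that Python compiles source to an in-memory byte code object before running it.  The source string and the names "source", "code" and "namespace" are made up for illustration.

```python
import dis

# A tiny "main program" held as a string (illustrative only).
source = "total = sum(range(5))"

# compile() produces a code object holding byte code; nothing is
# written to disk for it.
code = compile(source, "<example>", "exec")

# Run the compiled byte code in a fresh namespace.
namespace = {}
exec(code, namespace)

# The byte code can be inspected as a sequence of instructions.
instructions = list(dis.get_instructions(code))
print(namespace["total"], len(instructions))
```

CPython does cache the byte code of imported modules in .pyc files, so the "no intermediate file" part only applies to the main script.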

> whereas objects in Python are implemented as hash tables.

This is ... wrong?  A PyObject can have a __dict__ which stores  
instance variables.  But a class can also use __slots__, and doing  
that can save quite a bit of memory and time.  See

http://www.dalkescientific.com/writings/diary/archive/2006/03/19/class_instantiation_performance.html

and

http://www.dalkescientific.com/writings/useful_and_new_python_modules.pdf
   (search for "__slots__")

If someone wants to time the NJ.py file, it'll probably be faster and  
take less memory as

class Node(object):
    __slots__ = "name right left right_length left_length divergence".split()
    def __init__(self, n=""):
        self.name = n
        self.right = None
        self.left = None
        self.right_length = 0.0
        self.left_length = 0.0
        self.divergence = 0.0
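As a rough illustration of what __slots__ changes, the sketch below compares a plain class with a slotted one.  DictNode and SlotsNode are hypothetical names, not classes from NJ.py, and the code targets a current Python 3 interpreter.

```python
class DictNode(object):
    """Ordinary class: instance variables live in a per-instance __dict__."""
    def __init__(self, n=""):
        self.name = n
        self.left = None
        self.right = None


class SlotsNode(object):
    """Slotted class: fixed attribute storage, no per-instance __dict__."""
    __slots__ = ("name", "left", "right")

    def __init__(self, n=""):
        self.name = n
        self.left = None
        self.right = None


d = DictNode("a")
s = SlotsNode("a")

# Only the ordinary instance carries a dictionary.
print(hasattr(d, "__dict__"), hasattr(s, "__dict__"))

# A slotted instance rejects attributes that are not listed in __slots__.
try:
    s.colour = "red"
    rejected = False
except AttributeError:
    rejected = True
print(rejected)
```

The memory saving comes from dropping the per-instance dictionary, which matters when a program creates many small objects such as tree nodes.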

Bruce:
> Looking at the code for C (and pretending to understand Perl), I  
> don't think that the same algorithm is being implemented in each  
> language!

For example, the Perl FASTA reader reads line-by-line while the  
Python one reads the entire file into memory.

The Perl and Python readers do

     $rec->{'name'}=substr($_,1,20);
and
                 header=line[0:21]
(which should be line[1:21], to skip the leading '>').

while the C reader copies the entire line into 'name' and doesn't  
restrict itself to the first 20 characters (why is this truncation  
there?)

             header = (char*)malloc(sizeof(char) * size_line);
             strcpy(header, line+1); // +1 to avoid > at the beginning
             header=replace(header, ',', ' ');
             size_header=size_line-1;

and the C++ code does

           s.name=header;

You also pointed out:
> For example, the readFasta.py involves a list of classes that I don't
> think is done in the other languages

Perl uses a hash (not a blessed one)
C uses a typedef (in reader.h)
Java uses a 'Sequence' class
C# also uses a 'Sequence' class

BTW, using __slots__ on the Python Sequence class gives a 10%  
performance speedup.

The paper talks about the memory usage for the alignment code.  By  
changing

     F = [[0.0 for x in xrange(m+1)] for y in xrange(n+1)]
to
     import array
     F = [array.array("l", [0]*(m+1)) for y in xrange(n+1)]

I take a roughly 10% performance hit (20% if I use the correct "0"  
instead of the given "0.0") for about a 40% memory savings:

  PID COMMAND  %CPU   TIME     #TH #PRTS #MREGS RPRVT  RSHRD  RSIZE  VSIZE
26636 Python   99.2%  0:17.73    1    14    190 164M+  5.20M  166M+  209M+
26638 Python   98.6%  0:21.77    1    14    185 87.9M  5.27M  90.0M  129M

(I checked and nothing else uses an int '0' instead of a float '0.0'  
like the benchmark uses for Python.  That saved about 10% in my  
earlier tests.)
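A small sketch of the two matrix layouts, using tiny dimensions and a current Python 3 interpreter (range instead of xrange).  The helper names are made up for illustration, and the per-cell figures are only a rough guide to where the saving comes from.

```python
import array
import sys


def make_list_matrix(n, m):
    """List of lists of Python floats, as in the benchmark code."""
    return [[0.0 for _ in range(m + 1)] for _ in range(n + 1)]


def make_array_matrix(n, m):
    """List of typed rows; typecode "l" stores raw C longs, not objects."""
    return [array.array("l", [0] * (m + 1)) for _ in range(n + 1)]


F_list = make_list_matrix(3, 4)
F_arr = make_array_matrix(3, 4)

# Indexing works the same way for both layouts.
F_list[1][2] = 7.0
F_arr[1][2] = 7

# An array cell costs itemsize bytes.  A list cell costs a pointer plus,
# once cells hold distinct values, a separate float object each.
per_cell_array = F_arr[0].itemsize
per_float_object = sys.getsizeof(0.0)
print(per_cell_array, per_float_object)

# The "l" typecode only accepts integers, so the initial value has to be
# 0 rather than 0.0.
try:
    F_arr[0][0] = 0.5
    accepts_float = True
except TypeError:
    accepts_float = False
print(accepts_float)
```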

				Andrew
				dalke at dalkescientific.com