[bip] Blog post on bioinformatics and Python

C. Titus Brown ctb at msu.edu
Thu Sep 18 09:27:30 PDT 2008


On Thu, Sep 18, 2008 at 10:55:05AM -0500, Bruce Southey wrote:
-> >> Use of iterators and what/how to get specific information out of
-> >> BioPython objects.
-> >
-> > Could you clarify these points please?  Are you in favour of Biopython
-> > using python iterators (e.g. via generator functions)?  And what
-> > Biopython objects in particular were you trying to extract data from?
-> >   
-> Part of it is a lack of understanding but I have not bothered to go 
-> back. So what I say is probably wrong and out of date. I do not really 
-> understand Python iterators and generators as my knowledge is still 
-> mainly Python 2.0 and have not bothered that much with the new language 
-> features. For what I wanted using .next() really was not an option 
-> because I thought that I would need to get specific entries not proceed 
-> in ordered approach. Now I definitely need to access specific entries to 
-> match across files or databases. Today I looked at the BioPython 
-> tutorial Chapter 4 and saw SeqIO.to_dict which would have helped in that 
-> regard.

Why not:

   data = list(data)

?  That will take any iterator/generator and turn it into a list.
There's no real penalty for doing this beyond holding every record in
memory (if you need a random access list, then you need to fully parse
the file anyway!), and you can convert it into a dictionary pretty
easily, too.
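
A minimal sketch of both steps. Note that parse_records and the
"name sequence" line format are made up for illustration; they stand in
for whatever parser is actually handing back an iterator:

```python
def parse_records(lines):
    """Hypothetical stand-in for a parser: lazily yields (name, sequence) pairs."""
    for line in lines:
        name, seq = line.split()
        yield name, seq


# Made-up input; a real parser would be reading from a file instead.
raw = ["seq1 ACGT", "seq2 GGCC", "seq3 TTAA"]

# Materialize the generator so entries can be indexed in any order.
records = list(parse_records(raw))
third = records[2]

# A list of (key, value) pairs converts straight into a dictionary.
by_name = dict(records)
wanted = by_name["seq2"]
```

After this, records[2] and by_name["seq2"] both work without calling
.next() at all. The SeqIO.to_dict function mentioned above does much the
same job for Biopython records.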

OTOH, frustration with the opaqueness of BioPython BLAST objects (and
the use of crmpd 'sbjct' vrbl nms) led me to write my own blastparser
with objects that had a useful 'dir', were debuggable by me, had lots of
automated tests, etc.  You could take a look at that, although it is
sloooooooooooooooooooooow.

	http://darcs.idyll.org/~t/projects/blastparser/
	http://darcs.idyll.org/~t/projects/blastparser-latest.tar.gz

cheers,
--titus



More information about the biology-in-python mailing list