[pygr-notify] Issue 44 in pygr: tblastn and blastx support

codesite-noreply at google.com codesite-noreply at google.com
Tue Jan 6 20:36:29 PST 2009


Comment #2 on issue 44 by cjlee112: tblastn and blastx support
http://code.google.com/p/pygr/issues/detail?id=44

The tblastn case is simplest:
- A protein query is searched against a nucleotide db, and the results are
  returned as protein-protein alignments.
- For each hit, just consolidate all the subject intervals into a single ORF
  annotation (TranslationAnnot), and save the alignment of the query protein
  to slices of the TranslationAnnot.
- The user accesses the alignment results as usual, but gets one added
  feature: the aligned protein intervals have a sequence attribute that
  yields the associated nucleotide sequence interval from the nucleotide
  database.

Simple, clean.
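
To make the tblastn recipe concrete, here is a rough sketch of the coordinate
bookkeeping only. It does not use pygr: Hsp, consolidate, annot_slice and
nucleotide_interval are hypothetical names invented for illustration, and it
assumes every subject interval of a hit lies on the forward strand in the same
reading frame, with 0-based, end-exclusive coordinates. A real TranslationAnnot
would also have to handle frame and strand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hsp:
    """One tblastn HSP (hypothetical container, not a pygr class)."""
    query_start: int  # protein coordinates on the query
    query_end: int
    subj_start: int   # nucleotide coordinates on the subject
    subj_end: int


def consolidate(hsps):
    """Return the nucleotide span covering all subject intervals of a hit."""
    return min(h.subj_start for h in hsps), max(h.subj_end for h in hsps)


def annot_slice(hsp, orf_start):
    """Express an HSP's subject interval in amino-acid units within the ORF.

    Assumes hsp.subj_start is in frame with orf_start (forward strand).
    """
    return (hsp.subj_start - orf_start) // 3, (hsp.subj_end - orf_start) // 3


def nucleotide_interval(aa_slice, orf_start):
    """Map an ORF-relative protein slice back to subject nucleotide coordinates."""
    return orf_start + 3 * aa_slice[0], orf_start + 3 * aa_slice[1]


# Two HSPs of the same hit, both in the frame that starts at nucleotide 300.
hsps = [Hsp(0, 10, 300, 330), Hsp(12, 20, 336, 360)]
orf_start, orf_end = consolidate(hsps)
second = annot_slice(hsps[1], orf_start)
```

The round trip through nucleotide_interval is what the proposed sequence
attribute would expose: a protein slice of the annotation can always be mapped
back to the nucleotide interval it was translated from.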

By contrast, blastx poses some hard problems, because we no longer have *one*
query sequence, as our alignment interface naturally assumes.  The query
sequence might be represented by any number of different ORFs, some of which
might show up in many individual blast hits.  It's not clear whether users
will want these to be reported as one sequence or as different sequences (this
is especially tricky if you have non-overlapping intervals of a common ORF).
Maybe I should just give blastx a different interface, one that simply follows
BLAST's own interface: a list of hits.  That could be quite simple: for each
hit, follow the same recipe as above, consolidating the intervals into one
annotation and reporting aligned slices of it vs. slices of the subject
database sequence.  Actually, tblastx could also be done as a hybrid of the
blastx + tblastn solutions I proposed above.
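
As a rough illustration of the list-of-hits idea for blastx, the sketch below
groups HSPs by subject and, for each hit, consolidates the query nucleotide
intervals into one span and pairs each annotation slice with its subject
slice. Everything here is hypothetical (BlastxHsp, hits_from_hsps, the dict
layout); it is not pygr code, and it assumes forward-strand, same-frame query
intervals within a hit and 0-based, end-exclusive coordinates.

```python
from collections import defaultdict, namedtuple

# Hypothetical HSP record: query is nucleotide, subject is protein.
BlastxHsp = namedtuple(
    "BlastxHsp", "subject_id query_start query_end subj_start subj_end"
)


def hits_from_hsps(hsps):
    """Return one entry per subject, in order of first appearance."""
    by_subject = defaultdict(list)
    for h in hsps:
        by_subject[h.subject_id].append(h)

    hits = []
    for subject_id, group in by_subject.items():
        # Consolidate the query intervals of this hit into a single span.
        orf_start = min(h.query_start for h in group)
        orf_end = max(h.query_end for h in group)
        # Pair each ORF-relative protein slice with its subject protein slice.
        aligned = [
            (
                ((h.query_start - orf_start) // 3, (h.query_end - orf_start) // 3),
                (h.subj_start, h.subj_end),
            )
            for h in group
        ]
        hits.append({"subject_id": subject_id,
                     "orf": (orf_start, orf_end),
                     "aligned": aligned})
    return hits


hsps = [
    BlastxHsp("protA", 30, 90, 5, 25),
    BlastxHsp("protB", 31, 61, 0, 10),
    BlastxHsp("protA", 120, 150, 35, 45),
]
hits = hits_from_hsps(hsps)
```

Note that this sidesteps the question of whether ORFs shared between hits
should be reported as one sequence: each hit simply gets its own annotation.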
