[pygr-notify] Issue 44 in pygr: tblastn and blastx support
codesite-noreply at google.com
Tue Jan 6 20:36:29 PST 2009
Comment #2 on issue 44 by cjlee112: tblastn and blastx support
http://code.google.com/p/pygr/issues/detail?id=44
tblastn case is simplest:
- protein query vs. nucleotide db, returned as protein-protein alignments
- for each hit, just consolidate all the subject intervals into a single ORF
annotation (TranslationAnnot), and save the alignment of the query protein
to slices of the TranslationAnnot.
- user accesses the alignment results as usual, but gets one added feature:
the aligned protein intervals have a sequence attribute that yields the
associated nucleotide sequence interval from the nucleotide database.
Simple, clean.
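A rough sketch of that recipe, in plain Python rather than pygr itself. The
names Hsp, OrfAnnotation and consolidate_hit are hypothetical stand-ins (not
pygr classes), and it assumes 0-based half-open coordinates with every HSP of
the hit on the forward strand and in the same reading frame:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hsp:
    """One tblastn HSP (hypothetical record, not a pygr class)."""
    q_start: int  # query protein coordinates (amino acids)
    q_end: int
    s_start: int  # subject nucleotide coordinates, forward strand
    s_end: int


@dataclass(frozen=True)
class OrfAnnotation:
    """Stand-in for the single ORF annotation built for one hit."""
    seq_id: str
    start: int  # nucleotide coordinates on the subject sequence
    end: int

    def nucleotide_interval(self, aa_start, aa_end):
        # Map a slice of the translated ORF back to subject nucleotides.
        return (self.start + 3 * aa_start, self.start + 3 * aa_end)


def consolidate_hit(seq_id, hsps):
    """Merge all subject intervals of one hit into one ORF annotation and
    express each HSP as (query protein slice, ORF protein slice)."""
    orf = OrfAnnotation(seq_id,
                        min(h.s_start for h in hsps),
                        max(h.s_end for h in hsps))
    alignment = []
    for h in sorted(hsps, key=lambda h: h.s_start):
        # Assumption: same frame as the ORF; refuse anything else.
        if (h.s_start - orf.start) % 3 or (h.s_end - h.s_start) % 3:
            raise ValueError("HSP is not in the ORF's reading frame")
        aa_slice = ((h.s_start - orf.start) // 3,
                    (h.s_end - orf.start) // 3)
        alignment.append(((h.q_start, h.q_end), aa_slice))
    return orf, alignment


orf, alignment = consolidate_hit(
    "contig1", [Hsp(0, 10, 300, 330), Hsp(12, 20, 336, 360)])
```

Here each aligned ORF slice can be passed to orf.nucleotide_interval() to
recover the matching nucleotide interval, which is the role the sequence
attribute plays in the description above.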
By contrast, blastx poses some hard problems, because we no longer have
*one* query sequence as our alignment interface naturally assumes. The query
sequence might be represented by any number of different ORFs, some of which
might show up in many individual blast hits. It's not clear whether users
will want these to be reported as one sequence or different sequences (this
is especially tricky if you have non-overlapping intervals of a common ORF).
Maybe I should just give blastx a different interface, one that just follows
BLAST's interface: a list of hits. That could be quite simple: for each hit
just follow the same recipe above of consolidating into one annotation and
reporting aligned slices vs. subject database sequence slices. Actually
tblastx could also be done as a hybrid of the blastx + tblastn solutions I
proposed above.
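One possible shape for that list-of-hits interface, again as a plain-Python
sketch and not pygr code. BlastxHsp and hits_from_hsps are hypothetical names,
and it assumes 0-based half-open coordinates, forward strand only, and that
all HSPs against one subject share a reading frame:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BlastxHsp:
    """One blastx HSP (hypothetical record, not a pygr class)."""
    subject_id: str
    q_start: int  # query nucleotide coordinates, forward strand
    q_end: int
    s_start: int  # subject protein coordinates (amino acids)
    s_end: int


def hits_from_hsps(hsps):
    """Return one entry per subject, in first-seen order:
    (subject_id, (orf_start, orf_end), [(orf_aa_slice, subject_slice), ...])
    """
    by_subject = defaultdict(list)
    for h in hsps:
        by_subject[h.subject_id].append(h)

    hits = []
    for subject_id, group in by_subject.items():
        # Consolidate the query intervals of this hit into one ORF.
        orf_start = min(h.q_start for h in group)
        orf_end = max(h.q_end for h in group)
        pairs = []
        for h in sorted(group, key=lambda h: h.q_start):
            # Assumption: same frame as the ORF; refuse anything else.
            if (h.q_start - orf_start) % 3 or (h.q_end - h.q_start) % 3:
                raise ValueError("HSP is not in the ORF's reading frame")
            aa_slice = ((h.q_start - orf_start) // 3,
                        (h.q_end - orf_start) // 3)
            pairs.append((aa_slice, (h.s_start, h.s_end)))
        hits.append((subject_id, (orf_start, orf_end), pairs))
    return hits


hits = hits_from_hsps([
    BlastxHsp("P1", 30, 60, 5, 15),
    BlastxHsp("P2", 31, 91, 0, 20),
    BlastxHsp("P1", 90, 120, 25, 35),
])
```

Because each hit carries its own ORF, the question of whether two hits share
"the same" query sequence is left to the caller instead of being decided by
the interface.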