[protocols] Eel-pond/annotation questions

Jessica Perry Hekman hekman2 at illinois.edu
Tue Sep 30 13:25:31 PDT 2014


Titus -- thanks for your answers! They were very helpful. Onward:

On 9/29/14 6:02 AM, C. Titus Brown wrote:

> The older BLAST is functionally equivalent to the later BLAST, and I'm
> pretty sure all of our code works for the older BLAST.  If and when we
> update we'll have to put some effort into validation.  So... short answer
> is that being conservative isn't always bad ;).

Installing the older BLAST was pretty straightforward (phew). Finding it 
wasn't immediately obvious, but I did dig it up here (URL provided in 
case you want to add it to the documentation):

ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST/blast-2.2.26-x64-linux.tar.gz

I modified the commands to formatdb to build both databases as 
nucleotide not protein, since (as mentioned in my previous email) I'm 
using dog transcriptome instead of mouse proteome:

/usr/local/khmer/blast*/bin/formatdb -i mouse.protein.faa -o T -p F
/usr/local/khmer/blast*/bin/formatdb -i hyp-trans-gg-grouped.fasta -o T -p F

(So -p F instead of -p T in the first command.)

However, when I then ran blastall, it exited immediately without an 
error message:

/usr/local/khmer/blast-2.2.26/bin/blastall -i mouse.protein.faa \
    -d ~/transcriptome/hypothalamus/references/hyp-trans-gg.fasta \
    -e 1e-3 -p blastx -o dog.x.gg -a 8 -v 4 -b 4

Some debugging determined that it does this silent exit when it can't 
find the database to query (i.e. if I give it "-d foo" it behaves the 
same way). Moreover, when I build the mouse.protein.faa database a) with 
formatdb -p T or b) without specifying -p at all, then the subsequent 
blastall command DOES run.
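
For anyone hitting the same silent exit, here is a minimal pre-flight 
sketch that reports which index files exist for a database path before 
blastall is run. It assumes formatdb writes .nin/.nhr/.nsq files for a 
nucleotide build (-p F) and .pin/.phr/.psq files for a protein build 
(-p T). The function name blast_db_type is made up for illustration, and 
multi-volume databases are only roughly handled via their .nal/.pal 
alias files.

```shell
#!/bin/sh
# Report which kind of formatdb index exists for a database base path.
# blast_db_type is a hypothetical helper name, not part of BLAST.
blast_db_type() {
    db=$1
    has_nucl=no
    has_prot=no
    # A nucleotide build (-p F) leaves <db>.nin, or a <db>.nal alias file.
    if [ -e "$db.nin" ] || [ -e "$db.nal" ]; then has_nucl=yes; fi
    # A protein build (-p T) leaves <db>.pin, or a <db>.pal alias file.
    if [ -e "$db.pin" ] || [ -e "$db.pal" ]; then has_prot=yes; fi
    case "$has_nucl/$has_prot" in
        yes/yes) echo both ;;
        yes/no)  echo nucleotide ;;
        no/yes)  echo protein ;;
        *)       echo missing ;;
    esac
}
```

Calling it with the same path that is passed to "-d" prints one of 
both, nucleotide, protein, or missing, which at least makes the failure 
visible instead of silent.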

Running formatdb with no -p option seems like an acceptable workaround, 
but I'm a little concerned that "formatdb -p F" doesn't work! Maybe this 
is just a blastall bug and the workaround is fine, though.
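
One possible explanation, offered as a guess rather than a diagnosis: 
each blastall program expects a particular database type, and blastx 
translates a nucleotide query and searches it against a protein 
database, so a database built only with -p F may simply not be visible 
to it. The sketch below just encodes that mapping; required_db_type is 
a hypothetical helper name, not something that ships with BLAST.

```shell
#!/bin/sh
# Print the database type that a given blastall -p program searches.
# required_db_type is a hypothetical helper name used for illustration.
required_db_type() {
    case "$1" in
        # These programs search a nucleotide database.
        blastn|tblastn|tblastx) echo nucleotide ;;
        # These programs search a protein database.
        blastp|blastx)          echo protein ;;
        *) echo "unknown program: $1" >&2; return 1 ;;
    esac
}
```

By this mapping, comparing a nucleotide query against a nucleotide 
database built with -p F would call for blastn or tblastx rather than 
blastx. Whether that is really what is going on here is untested.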

Your insights welcomed, but if you have none, then hopefully this is 
still useful as documentation for the next person to try this approach. 
When blastall finishes running we'll see if the output is as expected...


Jessica


-- 
Jessica P. Hekman, DVM, MS
PhD student, University of Illinois, Urbana-Champaign
Animal Sciences / Genetics, Genomics, and Bioinformatics
https://www.impactstory.org/jphekman


