[protocols] Eel-pond/annotation questions

C. Titus Brown ctb at msu.edu
Mon Sep 29 04:02:56 PDT 2014


On Sun, Sep 28, 2014 at 06:34:01PM -0500, Jessica Perry Hekman wrote:
> Hi, Titus. I've been struggling with a de novo Trinity assembly for  
> quite a few months now, and this weekend discovered your blog and the  
> khmer protocols. I'm trying to apply them to my own process, and I have  
> a fair amount of computing power locally, so I'm not running on an AWS  
> server -- I'm hoping you're still willing to take a shot at some of my  
> questions!

Sure ;)

> I'm doing RNA-seq with fox (Vulpes vulpes) which does not have a  
> published genome or transcriptome, but which aligns very well against  
> dog. Dog DOES have a published genome (fairly well annotated) and  
> transcriptome.
>
> I am happy to provide more details about my process so far, but in the  
> interest of not writing an incredibly long introductory letter, right  
> now I am trying to use your annotation protocol from eel-pond.

Cool!

> * I see you are BLASTing against mouse protein. Since I have the closely  
> related dog transcriptome, could I BLAST against that instead, and still  
> expect to use your scripts make-uni-best-hits.py and  
> make-reciprocal-best-hits.py?

Yes, but (confusingly) I would suggest naming everything 'mouse' --
that is, download the dog database into mouse.protein.faa.  Several of the
filenames are hardcoded enough that it's hard to change 'em...
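
A minimal sketch of that renaming step, assuming the dog proteins have
already been downloaded to a local FASTA file. The source filename
dog.protein.faa and the helper stage_as_mouse are hypothetical; only the
target name mouse.protein.faa comes from the protocol.

```python
import shutil
from pathlib import Path


def stage_as_mouse(reference_faa, workdir="."):
    """Copy a protein FASTA to the filename the protocol scripts expect.

    reference_faa is whatever file you downloaded (for example a
    hypothetical dog.protein.faa); the copy is always named
    mouse.protein.faa because that name is hardcoded downstream.
    """
    target = Path(workdir) / "mouse.protein.faa"
    shutil.copyfile(reference_faa, target)
    return target
```

Copying rather than renaming keeps the original download around under its
real name, which makes it easier to remember later that "mouse" actually
means dog.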

> * My system does not have formatdb for building the BLAST database, and  
> when I went to install formatdb, I discovered that it is deprecated in  
> favor of makeblastdb. However, apparently makeblastdb uses different  
> parameters so I can't use your protocol with it exactly. I'm game to go  
> try to figure out the change in parameter syntax between the two tools,  
> but I'm curious why your protocol uses a deprecated tool. Should I be  
> figuring out how to install formatdb instead of trying to make  
> makeblastdb work?

The older BLAST is functionally equivalent to the newer one, and I'm
pretty sure all of our code works with the older BLAST.  If and when we
update, we'll have to put some effort into validation.  So... the short
answer is that being conservative isn't always bad ;).
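
For anyone who does want to try makeblastdb, here is a rough sketch of how
the database-building options correspond between the two tools. It only
builds the argument lists and does not run anything. The helper name
formatdb_to_makeblastdb is hypothetical, and the assumption that the
protocol needs sequence IDs parsed (-o T) should be checked against the
protocol text itself.

```python
def formatdb_to_makeblastdb(fasta, protein=True, parse_seqids=True):
    """Return (legacy, modern) argument lists for building a BLAST database.

    legacy uses formatdb flags:  -i input, -p T/F protein, -o T parse IDs
    modern uses makeblastdb:     -in input, -dbtype prot/nucl, -parse_seqids
    """
    legacy = ["formatdb", "-i", fasta, "-p", "T" if protein else "F"]
    modern = ["makeblastdb", "-in", fasta,
              "-dbtype", "prot" if protein else "nucl"]
    if parse_seqids:
        legacy += ["-o", "T"]
        modern.append("-parse_seqids")
    return legacy, modern


if __name__ == "__main__":
    old, new = formatdb_to_makeblastdb("mouse.protein.faa")
    print(" ".join(old))
    print(" ".join(new))
```

This covers only the database step. The search commands differ too
(blastall in the older package versus blastp, blastx and so on in the
newer one), and the output of the newer tools has not been validated
against the protocol's scripts, so treat this as a starting point rather
than a drop-in replacement.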

> Finally, thanks so much for all the work you've done to make RNA-seq a  
> little easier. It has been quite the frustrating process so far.

Welcome -- and let us know how we can help!

> https://www.impactstory.org/jphekman

nice!

cheers,
--titus
-- 
C. Titus Brown, ctb at msu.edu


