[khmer] one pass vs three pass approach to digital normalization

Thu Mar 28 18:44:51 PDT 2013

Hi,

I am in the process of doing contig assembly on a venom duct transcriptome
from a marine venomous snail and have in excess of 500 million reads if I
combine data from two Illumina lanes.  I don't have a real way to estimate
the depth of coverage required, but based on an ad hoc survey of the
literature, I think this is significantly in excess of what is required for
a transcriptome assembly.

Having just read the paper (Reference free algorithm etc.), I have gleaned
that a single pass is recommended for transcriptome work in order to avoid
the loss of meaningful transcripts due to an overly stringent approach.
However, if I have really significant saturation with 500 million+ reads,
would a 3 pass procedure still be recommended?

Also, I am a little concerned about "polymorphism" of my transcripts - these
venom ducts are reported to produce up to 200 different toxins that exhibit
variant-like qualities, and there may also be a fair amount of mRNA
processing leading to isoforms.  Does this make my transcriptome an
unlikely candidate for success with this normalization?

thanks,

 Liz Wright