[khmer] one pass vs three pass approach to digital normalization

C. Titus Brown ctb at msu.edu
Thu Mar 28 18:56:32 PDT 2013


On Thu, Mar 28, 2013 at 09:44:51PM -0400, liz wright wrote:
> I am in the process of doing contig assembly on a venom duct transcriptome
> from a marine venomous snail and have in excess of 500 million reads if I
> combine data from two Illumina lanes.  I don't have a real way to estimate
> the depth of coverage required, but based on an ad hoc survey of the
> literature, I think this is significantly in excess of what is required for
> a transcriptome assembly.
> 
> Having just read the paper (Reference free algorithm etc) I have gleaned that
> single pass is recommended for transcriptome work in order to avoid the
> loss of meaningful transcripts due to an overly stringent approach.
> However, if I have a really significant saturation with 500 million+ reads
> would a 3 pass procedure still be recommended?
> 
> Also I am a little concerned about "polymorphism" of my transcripts - these
> venom ducts are reported to produce up to 200 different toxins that exhibit
> variant-like qualities and there may also be a fair amount of mRNA
> processing leading to isoforms.  Does this make my transcriptome an
> unlikely candidate for success with this normalization?

As you say, single-pass is what we recommend for transcriptomes.  I don't
know what to tell you about an overly stringent approach, except to say
that even with that high coverage I would expect isoforms to vanish,
because of the differential in abundance between true k-mers on high-abundance
exons and low-abundance exons.

We have other approaches in the works that will address this, but they're
not ready yet...
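
For reference, the single-pass normalization discussed above can be sketched
roughly as follows.  This is only a toy illustration, not khmer's code or API:
it assumes exact k-mer counts held in a plain Python dict (khmer uses a
memory-efficient probabilistic counting structure instead), and the function
names and the k and cutoff values are made up for the example.

```python
from statistics import median


def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize_single_pass(reads, k=20, cutoff=20):
    """Keep a read only while its median k-mer count is below cutoff.

    Toy version: exact counts in a dict (an assumption of this sketch),
    no reverse complements, no handling of ambiguous bases.
    """
    counts = {}
    kept = []
    for read in reads:
        words = kmers(read, k)
        if not words:
            continue  # read is shorter than k, nothing to count
        # Median abundance of this read's k-mers among reads kept so far.
        if median(counts.get(w, 0) for w in words) < cutoff:
            kept.append(read)
            # Only kept reads contribute to the counts.
            for w in words:
                counts[w] = counts.get(w, 0) + 1
    return kept


if __name__ == "__main__":
    # Ten copies of an "abundant" read and two copies of a "rare" one.
    reads = ["ACGTTGCAAG"] * 10 + ["TTTTGGGGCC"] * 2
    kept = normalize_single_pass(reads, k=4, cutoff=3)
    print(len(reads), "reads in,", len(kept), "reads kept")
```

In this toy input the abundant read is capped at the cutoff while both copies
of the rare read are kept.  Real data adds sequencing errors and k-mers shared
between transcripts, which this sketch does not model.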

My general advice is, "try it".  It's usually significantly faster and
cheaper (in terms of memory) than anything else.  If you find you get
far fewer transcripts out, well, then don't use it :)

We do have some approaches that can help you compare transcriptome assemblies.
Drop me a line if you get to the point of having multiple assemblies to
compare :)

cheers,
--titus
-- 
C. Titus Brown, ctb at msu.edu



