[khmer] Fwd: Fwd: Filter Read sequence based on the kmer abundance

C. Titus Brown ctb at msu.edu
Sun Sep 14 03:41:22 PDT 2014


On Sun, Sep 14, 2014 at 10:24:14AM +0800, Raimi Mohamed Redwan wrote:
> Hi,
> 
> I used a k-mer size of 17 for both. I followed the commands in the blog
> post you pointed me to, i.e.:
>
>   load-into-counting.py -x 1e8 -k 17 reads.kh reads.fa
>   abundance-dist.py -s reads.kh reads.fa reads.dist
>   ./plot-abundance-dist.py reads.dist reads-dist.png
> 
> Thank You

Hi Raimi,

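Was there any statement at the end of load-into-counting about a false
positive rate?  If the counting table is too small, k-mer counts get
inflated by collisions, which distorts the abundance distribution.  You
will probably need to increase the counting table size (-x) -- see: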

http://khmer.readthedocs.org/en/v1.1/choosing-table-sizes.html

for some guidance, and

http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0101271

for too much information.
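
As a rough illustration of why the table size matters, here is a minimal
Python sketch of the usual approximation for the false positive rate of a
Count-Min-Sketch-style counting table, (1 - exp(-N/x))^Z.  It is not khmer's
own code.  It assumes every table holds about -x entries, that 4 tables are
used, and that the number of distinct k-mers N is known; the function names
and the 1e9 figure in the usage note are hypothetical.

```python
import math


def estimated_fp_rate(n_unique_kmers, table_size, n_tables=4):
    """Approximate false positive rate of a Count-Min-Sketch-style table.

    Assumption: all n_tables tables have about table_size entries each.
    """
    # Expected fraction of occupied slots in one table.
    occupancy = 1.0 - math.exp(-n_unique_kmers / table_size)
    # A false positive needs a collision in every table.
    return occupancy ** n_tables


def min_table_size(n_unique_kmers, n_tables=4, target_fp=0.1):
    """Smallest per-table size that keeps the estimate at or below target_fp."""
    # Invert the formula above: per-table occupancy allowed by the target.
    per_table = target_fp ** (1.0 / n_tables)
    return math.ceil(-n_unique_kmers / math.log(1.0 - per_table))


if __name__ == "__main__":
    n_kmers = 1e9  # hypothetical number of distinct 17-mers in the reads
    print(estimated_fp_rate(n_kmers, 1e8))
    print(min_table_size(n_kmers))
```

Under these assumptions, -x 1e8 with on the order of 1e9 distinct k-mers
gives an estimate close to 1, meaning almost every count is inflated by
collisions.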

best,
--titus

> On Fri, Sep 12, 2014 at 7:05 PM, C. Titus Brown <ctb at msu.edu> wrote:
> 
> > On Tue, Sep 09, 2014 at 09:32:49AM +0800, Raimi Mohamed Redwan wrote:
> > > I was trying to use khmer to divide the reads by coverage, as
> > > described in your blog, but I got stuck on one thing.
> > >
> > > Previously I used Jellyfish for k-mer counting and plotted a k-mer
> > > abundance graph from its output.  From that graph I realized that my
> > > data have a heterozygosity issue, because there are two peaks after
> > > the error peak.  However, when I plot the abundance using
> > > abundance-dist.py and plot-abundance-dist.py, the heterozygous peak is
> > > gone and the repeat region goes much higher than what I saw with
> > > Jellyfish.
> > >
> > > Attached are the k-mer abundance plots that I got from Jellyfish and
> > > khmer.
> > >
> > > Furthermore, there are two lines in the k-mer abundance graph that I
> > > plotted.  What does the second one refer to?
> >
> > Hi Raimi,
> >
> > Do you know what k-mer size the Jellyfish analysis used?
> >
> > And could you tell me which commands you were using for khmer? Were they
> > from this blog post:
> >
> > http://ivory.idyll.org/blog/2014-slice-reads-by-coverage.html
> >
> > and did you use abundance-dist, or calc-median-distribution?
> >
> > thanks,
> > --titus
> >
> > >
> > >
> > > --
> > > Raimi Mohamed Redwan
> > > Biotechnology Research Institute
> > > Universiti Malaysia Sabah
> > > Jalan UMS
> > > 88400, Sabah
> > > Malaysia
> > > 0126707944
> >
> >
> > > _______________________________________________
> > > khmer mailing list
> > > khmer at lists.idyll.org
> > > http://lists.idyll.org/listinfo/khmer
> >
> >
> > --
> > C. Titus Brown, ctb at msu.edu
> >
> 
> 
> 
> -- 
> Raimi Mohamed Redwan
> Biotechnology Research Institute
> Universiti Malaysia Sabah
> Jalan UMS
> 88400, Sabah
> Malaysia
> 0126707944
> 
> 
> 


-- 
C. Titus Brown, ctb at msu.edu


