[khmer] Diginorm with normalize-by-median-pct.py script
C. Titus Brown
ctb at msu.edu
Wed Apr 24 12:52:52 PDT 2013
On Wed, Apr 24, 2013 at 03:47:36PM -0400, Howard W. Fescemyer wrote:
> Dear Titus:
>
> After running your normalize-by-median-pct.py script, I obtained two
> output files containing sequence reads: one with the ".keep" extension
> and the other with the ".keepmedpct" extension.
hi Howard,
the .keepmedpct file is produced by the normalize-by-median-pct.py script from
https://github.com/ctb/khmer/blob/trinity/sandbox/normalize-by-median-pct.py
while the .keep file comes from straight ol' diginorm.
> Please let me know which of these files contains my normalized read
> data. I ask because both files have fastq formatted reads, but contain
> very different numbers of reads.
>
> I started with about 215 M read pairs. The ".keepmedpct" file has about
> 17 M read pairs, while the ".keep" file has only about 0.4 M read pairs.
>
> Here is my run command: normalize-by-median-pct.py -p -C 30 -k 25 -N 4
> -x 4e9 WBtrmd_AllR1R2_modinter.fastq.
That looks about right!
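
If you want to double-check read-pair counts like the ones above, here is a
minimal sketch in Python. It is not part of khmer; the function names are
made up for illustration, and it assumes plain 4-line FASTQ records with
pairs interleaved (as the -p option expects).

```python
import io


def count_fastq_records(handle):
    """Count records in a FASTQ stream, assuming exactly 4 lines per record."""
    n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError("line count is not a multiple of 4; not plain 4-line FASTQ?")
    return n_lines // 4


def count_read_pairs(handle):
    """Count read pairs in an interleaved FASTQ stream (hypothetical helper)."""
    n_records = count_fastq_records(handle)
    if n_records % 2 != 0:
        raise ValueError("odd number of records; file is not cleanly interleaved")
    return n_records // 2


# Tiny in-memory example: one pair, i.e. two records.
example = "@r1/1\nACGT\n+\nIIII\n@r1/2\nTTTT\n+\nIIII\n"
print(count_read_pairs(io.StringIO(example)))
```

For real files, pass an open file handle (for example, open("reads.fastq"))
in place of the StringIO object; the file name here is only a placeholder.
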
> Here are some other normalization outcomes for comparison: 1) Diginorm
> using normalize-by-median (C = 5, k = 25) outputs about 13.5 M read
> pairs, 2) Trinitynorm (max_cov = 30, min_kmer_cov = 2, k = 25) outputs
> about 9.6 M read pairs, and 3) Trinitynorm (max_cov = 5, min_kmer_cov =
> 2, k = 25) outputs about 9 M read pairs.
>
> I am in the process of assembling the Trinity normalized data so I can
> compare it with the assembly using data from Diginorm using
> normalize-by-median. It would be great to include in my comparison data
> from Diginorm using normalize-by-median-pct.
I do not think any of the C=5 data sets will be worth using. For RNAseq
and Trinity, we generally recommend doing a single pass to C=20. Go lower
than that and you will start to accumulate errors, and you will also get
bad assemblies from Trinity.
See:
http://khmer.readthedocs.org/en/latest/guide.html
for our guidelines.
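
To make the role of the cutoff C concrete, here is a toy sketch of the basic
diginorm idea: keep a read only if the median count of its k-mers, among the
reads kept so far, is still below C. This is an illustration under simplifying
assumptions, not khmer's code: it uses an exact in-memory counter instead of
khmer's fixed-size counting tables (the -N and -x options), ignores reverse
complements and read pairing, and does not cover whatever the -pct variant
does differently. The function names are made up.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize_by_median_sketch(reads, k, cutoff):
    """Toy digital normalization: keep reads whose median k-mer count < cutoff."""
    counts = Counter()  # exact counts; a stand-in for a probabilistic table
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read shorter than k, nothing to estimate coverage from
        # Median abundance of this read's k-mers among reads kept so far.
        if median(counts[km] for km in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)  # only kept reads contribute counts
    return kept


# Ten copies of the same read: only the first few are kept.
reads = ["ACGTTGCAAG"] * 10
print(len(normalize_by_median_sketch(reads, k=4, cutoff=3)))
```

With a low cutoff, reads stop being kept after only a few copies of each
k-mer have been seen; raising C keeps more of the redundant coverage.
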
cheers,
--titus
--
C. Titus Brown, ctb at msu.edu