[ngs] seqtk

Istvan Albert iua1 at psu.edu
Tue Jun 18 07:35:44 PDT 2013


A quick writeup based on the impromptu presentation this morning.

seqtk: trim and process reads at insanely high speed - brought to you
by the Church of Heng Li

https://github.com/lh3/seqtk

For just about all of its tools the input can be a FASTA or FASTQ file,
gzipped or not; compressed input is unzipped on the fly.
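To make the "gzipped or not" point concrete, here is a minimal Python sketch of how a tool can accept either kind of input. This is not seqtk's code (seqtk is written in C); the function names open_maybe_gzipped and sniff_format are made up for illustration, and the sketch assumes the file starts directly with a record.

```python
import gzip

def open_maybe_gzipped(path):
    """Open a file as text, decompressing on the fly if it is gzipped."""
    # gzip files start with the two magic bytes 0x1f 0x8b
    with open(path, "rb") as fh:
        magic = fh.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "rt")

def sniff_format(path):
    """Guess 'fasta' or 'fastq' from the first character of the file."""
    with open_maybe_gzipped(path) as fh:
        first = fh.read(1)
    if first == ">":
        return "fasta"
    if first == "@":
        return "fastq"
    raise ValueError("does not look like FASTA or FASTQ: %r" % path)
```

The caller never has to say which format or compression it is handing over; both are detected from the first few bytes.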

# extract a random sample of reads
seqtk sample

# use the same seed to extract the same reads from two paired-end files
seqtk sample -s 10
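Why the same seed keeps the pairs together: a seeded random number generator produces the same sequence of draws every time, so two files with the same number of records in the same order get the same records picked. A toy Python sketch of the idea follows; it is not seqtk's implementation, and sample_records and the read names are invented for the example.

```python
import random

def sample_records(records, fraction, seed):
    """Keep each record with probability `fraction`, using a seeded generator."""
    rng = random.Random(seed)
    return [rec for rec in records if rng.random() < fraction]

# toy paired-end data: record i in one list is the mate of record i in the other
forward = ["read%d/1" % i for i in range(1000)]
reverse = ["read%d/2" % i for i in range(1000)]

# the same seed gives the same sequence of draws, so the same positions are kept
sub_fwd = sample_records(forward, 0.1, seed=10)
sub_rev = sample_records(reverse, 0.1, seed=10)
```

This only works if both files list the reads in the same order, which is the usual layout for paired-end data.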

# trim reads with the modified Mott trimming algorithm
seqtk trimfq ...

The algorithm is described on the page below; see the section Algorithm for details:

http://www.phrap.org/phredphrap/phred.html
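As I understand the description on that page, each base gets a score equal to an error cutoff minus the base's error probability, and the read is trimmed to the highest scoring contiguous segment. Below is a rough Python sketch of that idea, not seqtk's actual code. The function name mott_trim and the 0.05 default cutoff are my assumptions for the example, and real implementations also enforce a minimum segment length, which is left out here.

```python
def mott_trim(quals, cutoff=0.05):
    """Return (start, end) of the best segment to keep, end exclusive.

    quals  -- list of Phred quality values, one per base
    cutoff -- error probability cutoff (assumed default for this sketch)
    Returns (0, 0) when no base scores above zero.
    """
    best_score = 0.0
    best = (0, 0)
    run_score = 0.0
    run_start = 0
    for i, q in enumerate(quals):
        p_err = 10 ** (-q / 10.0)      # Phred quality -> error probability
        run_score += cutoff - p_err    # good bases add, bad bases subtract
        if run_score < 0:
            # the running segment is not worth keeping; restart after this base
            run_score = 0.0
            run_start = i + 1
        elif run_score > best_score:
            best_score = run_score
            best = (run_start, i + 1)
    return best
```

For a read with low quality at both ends, say quals = [2, 2, 2, 30, 30, 30, 30, 2, 2], the sketch keeps the stretch of Q30 bases in the middle and drops both tails.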

Beyond these there are other interesting features. For example, you can
extract subsequences from a file (say you want to pull out a certain part
of your reference genome), etc.
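In seqtk this is the subseq command. To show what pulling out a region amounts to, here is a small Python sketch that works on FASTA lines held in memory. The helper names read_fasta and extract_region are hypothetical, and the 0-based, end-exclusive (BED-style) coordinates are an assumption of the sketch.

```python
def read_fasta(lines):
    """Parse FASTA lines into a dict mapping sequence name -> sequence."""
    parts = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # the name is the first word after '>'
            name = line[1:].split()[0]
            parts[name] = []
        elif name is not None:
            parts[name].append(line)
    return {n: "".join(chunks) for n, chunks in parts.items()}

def extract_region(seqs, name, start, end):
    """Return bases start..end of the named sequence (0-based, end exclusive)."""
    return seqs[name][start:end]

# toy reference with two sequences
fasta = [">chr1 toy example", "ACGTAC", "GTTT", ">chr2", "GGGG"]
region = extract_region(read_fasta(fasta), "chr1", 2, 8)
```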


--
Istvan Albert
Associate Professor, Bioinformatics
Pennsylvania State University
http://www.personal.psu.edu/iua1/


