[khmer] Labeled de Bruijn graphs
Guillaume Holley
gholley at cebitec.uni-bielefeld.de
Mon Nov 16 07:28:12 PST 2015
Dear khmer developers and users,
I am trying to use khmer to build the labeled de Bruijn graph of a few
hundred bacterial strains represented as reads in FASTQ files. More
precisely, I am trying to achieve the following tasks:
- Build the labeled de Bruijn graph.
- Query individual k-mers for their labels.
- Query individual k-mers for the number of successors and predecessors
they have in the graph.
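
For illustration, the three tasks above can be sketched in plain Python,
independently of khmer. This is only a minimal sketch under stated
assumptions: the function names (build_labeled_graph, labels_of, degree)
are hypothetical and are not part of the khmer API, only the forward
strand is considered (no canonical k-mers), and the graph is held exactly
in a dictionary rather than in a probabilistic structure.

```python
from collections import defaultdict

ALPHABET = "ACGT"


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_labeled_graph(reads_by_label, k):
    """Map each k-mer to the set of labels (e.g. strain names) it occurs in.

    reads_by_label is a hypothetical input format: {label: [read, ...]}.
    """
    graph = defaultdict(set)
    for label, reads in reads_by_label.items():
        for read in reads:
            for kmer in kmers(read, k):
                graph[kmer].add(label)
    return graph


def labels_of(graph, kmer):
    """Return the labels attached to kmer (empty set if it is absent)."""
    return set(graph.get(kmer, ()))


def degree(graph, kmer):
    """Return (number of successors, number of predecessors) of kmer.

    A successor shares its first k-1 bases with the last k-1 bases of
    kmer; a predecessor is the mirror case.
    """
    successors = sum((kmer[1:] + base) in graph for base in ALPHABET)
    predecessors = sum((base + kmer[:-1]) in graph for base in ALPHABET)
    return successors, predecessors
```

With two toy strains, {"strainA": ["ACGTA"], "strainB": ["CGTAC", "CGTT"]}
and k = 3, the k-mer CGT carries both labels and has two successors (GTA,
GTT) and one predecessor (ACG).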
I spent some time reading the khmer documentation and the blog post
http://ivory.idyll.org/blog/2015-wok-labelhash.html. For the latter, I
downloaded and installed the code to replicate the results, and I also
read the code, which gave me a first idea of how to achieve some of the
tasks mentioned above.
However, I have a few questions, which may be very naive since I am not
very familiar with khmer.
- Is there any documentation for the "labelhash"/"graphlabels" API? I
browsed the online khmer documentation but could not find anything
about it (maybe I just missed it). Do you know of any examples other
than the experiment described in
http://ivory.idyll.org/blog/2015-wok-labelhash.html ?
- I saw in the khmer v2.0 announcement that "labelhash" was renamed to
"graphlabels". However, I could not find anything in the khmer code about
"graphlabels", only "labelhash", so should I continue using "labelhash"?
- The line "lh = khmer.LabelHash(args.ksize, args.tablesize,
args.n_tables)" is used to build a labelhash. Is there a way to specify
that the graph should be built only from k-mers that occur at least x
times in their files?
- In the example given on the blog to replicate the results, a labelhash
graph is built from concatenated FASTx files. Is there a way to build
the graph incrementally, one FASTx file at a time, without having to
concatenate them?
- I am currently working with values of k larger than 32. Is there a way
to use khmer with k > 32?
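
To make the questions about a minimum k-mer count and about building the
graph file by file more concrete, here is a plain-Python sketch of the
intended behaviour. It does not use khmer and says nothing about what
khmer supports: the names fastq_sequences, add_file and build_from_files
are hypothetical, the input is assumed to be uncompressed FASTQ with
exactly four lines per record, and counting is exact and in memory.
Because k-mers are ordinary strings in this sketch, k may exceed 32 here.

```python
from collections import Counter, defaultdict


def fastq_sequences(path):
    """Yield the sequence of each record, assuming 4 lines per record."""
    with open(path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # second line of each record
                yield line.strip()


def add_file(graph, path, label, k, min_count=1):
    """Add to graph the k-mers seen at least min_count times in one file."""
    counts = Counter()
    for seq in fastq_sequences(path):
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    for kmer, n in counts.items():
        if n >= min_count:
            graph[kmer].add(label)


def build_from_files(paths_by_label, k, min_count=1):
    """Build {k-mer: set of labels} one file at a time, no concatenation.

    paths_by_label is a hypothetical input format: {label: path}.
    """
    graph = defaultdict(set)
    for label, path in paths_by_label.items():
        add_file(graph, path, label, k, min_count)
    return graph
```

The threshold is applied per file, so a k-mer is labeled with a strain
only if it occurs at least min_count times in that strain's reads.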
Thank you very much in advance for your time and help.
Best regards,
Guillaume Holley, PhD Student
Faculty of Technology, Genome Informatics group, Bielefeld University
Bielefeld, Germany