<div dir="ltr"><div>[ + including <a href="mailto:khmer@lists.idyll.org">khmer@lists.idyll.org</a> to share the knowledge ]</div><div><br></div>Hey Daniel,<div><br></div><div>You're right, this is a known issue. You'll have to rename or softlink your <span style="font-size:13px;color:rgb(0,0,0);font-family:arial,sans-serif">7123.6.62157.TATAAT.adnq.fastq.keep file so that its name has a ".fastq" or ".fq" extension.</span></div>
<div><span style="font-size:13px;color:rgb(0,0,0);font-family:arial,sans-serif"><br></span></div><div><span style="font-size:13px;color:rgb(0,0,0);font-family:arial,sans-serif">Our apologies for the error.</span></div></div>
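<div>A minimal sketch of the softlink workaround, assuming a POSIX shell. It runs in a scratch directory with a made-up one-record file standing in for the real .keep file; the link name (the original name plus ".fastq") is only one possible choice, and the final abundance-dist.py call is left as a comment because it needs khmer and the real data.</div>
<pre>
```shell
# Scratch directory so the sketch is safe to run anywhere (hypothetical setup).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for the real .keep file: a single made-up FASTQ record.
keep="7123.6.62157.TATAAT.adnq.fastq.keep"
printf '@read1\nACGT\n+\nIIII\n' > "$keep"

# Softlink the file under a name that ends in ".fastq"; no data is copied.
link="${keep}.fastq"
ln -s "$keep" "$link"

# The link resolves to the same records as the original file.
head -n 1 "$link"

# With khmer installed, the command from the thread would then become:
#   abundance-dist.py PHH6-dn.kh "$link" PHH6-dn.hist
```
</pre>
<div>A softlink leaves the original .keep file untouched, so it can be removed again once the histogram has been generated.</div>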
<div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Feb 12, 2014 at 4:42 PM, Daniel Burkhardt <span dir="ltr"><<a href="mailto:burkhardt.d.b@gmail.com" target="_blank">burkhardt.d.b@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hello again Michael,<div><br></div><div>So I've gotten through the Khmer tutorial and I'm on to normalizing my data. After my first round of digital normalization, I'm getting an error when I generate a histogram of the k-mers. I think it's a header issue, and I know there's a note on the website about Khmer not liking fastq files.</div>
<div><br></div><div>Here's the error:</div><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div><br></div><div>(env)dan@computobacter:~/khmer/QCtest$ python /home/dan/khmer/env/bin/abundance-dist.py PHH6-dn.kh 7123.6.62157.TATAAT.adnq.fastq.keep PHH6-dn.hist</div>
<div><div>hashtable from PHH6-dn.kh</div></div><div><div>K: 20</div></div><div><div>HT sizes: [16000000039L, 16000000067L, 16000000091L, 16000000097L]</div></div><div><div>outputting to PHH6-dn.hist</div></div><div><div>
preparing hist...</div>
</div><div><div>terminate called after throwing an instance of 'khmer::read_parsers::InvalidFASTAFileFormat'</div></div><div><div> what(): InvalidFASTAFileFormat: invalid sequence name indicator: @HISEQ05:329:C22F1ACXX:6:1101:1307:2242</div>
</div><div><div>Aborted (core dumped)</div></div></blockquote><div><br></div><div>In the file 7123.6. ... .fastq.keep the header differs only marginally from the one in the original fastq file. The name indicator in the error refers to the first sequence in the dataset.</div>
<div><br></div><div>In the original .fastq:</div><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div>@HISEQ05:329:C22F1ACXX:6:1101:1307:2242 1:N:0:TATAAT</div><div>ACCTCGTCTGCCTGGCGCTGCGTGACGATTACCACTGCGTCGCGCTCGACCAGCGCGGCCACGGCGACAGCGACTGGTCGCACGACGCCGACTACACAATGGGCGCGCAGCTCGCCGACACGAAGGGATTTGGCGACCATCTCCGCCTCG</div>
<div>+</div><div>=@?BDBDDHFHHBH@D?FFEDG8CGHDA@FFGDGGHD@@7;FGHEEC>B@BBBBBB5@BBBB7005;>B>BBBB39A(:5<>BBBB59@B>@5>CCC3<(4@+2050<>><>CB@B##################################</div>
<div><br></div></blockquote><div><br></div><div>In the .keep file</div><blockquote style="margin:0px 0px 0px 40px;border:none;padding:0px"><div>@HISEQ05:329:C22F1ACXX:6:1101:1307:2242</div><div>ACCTCGTCTGCCTGGCGCTGCGTGACGATTACCACTGCGTCGCGCTCGACCAGCGCGGCCACGGCGACAGCGACTGGTCGCACGACGCCGACTACACAATGGGCGCGCAGCTCGCCGACACGAAGGGATTTGGCGACCATCTCCGCCTCG</div>
<div>+</div><div>=@?BDBDDHFHHBH@D?FFEDG8CGHDA@FFGDGGHD@@7;FGHEEC>B@BBBBBB5@BBBB7005;>B>BBBB39A(:5<>BBBB59@B>@5>CCC3<(4@+2050<>><>CB@B##################################</div><div><br>
</div></blockquote><div>The only difference seems to be the "1:N:0:TATAAT" section of the header. In the .keep file, none of the reads have this part of the header. As far as I can tell, this section tells assemblers which paired end of the sample the read belongs to.</div>
<div><br></div><div>Why is Khmer throwing an invalid fasta format error? What can I do to fix this?</div><div><br></div><div>Thanks,</div><div>Dan</div><div><br></div></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra">
<br><br><div class="gmail_quote">
On Thu, Jan 30, 2014 at 9:19 AM, Michael R. Crusoe <span dir="ltr"><<a href="mailto:mcrusoe@msu.edu" target="_blank">mcrusoe@msu.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<p dir="ltr">A git clone will indeed work. I will find a solution for other Ubuntu 12.04 users who are using the version that Canonical ships.</p>
<p dir="ltr">FYI, you can upgrade pip for a single user with the following command:</p>
<p dir="ltr">pip install --upgrade --user pip</p>
<p dir="ltr">Cheers,</p><div><div>
<div class="gmail_quote">On Jan 30, 2014 8:16 AM, "Daniel Burkhardt" <<a href="mailto:burkhardt.d.b@gmail.com" target="_blank">burkhardt.d.b@gmail.com</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr"><div>So unfortunately there is no --no-clean option for pip version 1.1. However, it seems to me that the only issue is that the example files are in a src directory that pip is "cleaning up" as part of the installation. I have all the scripts that I should need for the tutorial, just not the sample data set. I think I might just grab the data directory from your git page and try to run things from there. This sounds like it should work, yeah?<br>
<br></div>Thanks,<br>Dan<br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Jan 29, 2014 at 10:23 AM, Michael R. Crusoe <span dir="ltr"><<a href="mailto:mcrusoe@msu.edu" target="_blank">mcrusoe@msu.edu</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><p dir="ltr">Yep, the command would be</p>
<p dir="ltr">pip install --no-clean khmer</p>
<p dir="ltr">You might need to use a fresh virtualenv.</p>
<p dir="ltr">There is a chance that your version of pip doesn't have that option. If so I will find another way.</p>
<p dir="ltr">Thank you for your patience.</p><div><div>
<div class="gmail_quote">On Jan 29, 2014 9:15 AM, "Daniel Burkhardt" <<a href="mailto:burkhardt.d.b@gmail.com" target="_blank">burkhardt.d.b@gmail.com</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr"><div><div><div><div><div><div><div>Michael,<br><br>Thank you for your quick response. Inside my virtualenv, I have pip version 1.1. When you suggest adding the --no-clean flag, are you referring to when I install khmer? It doesn't look like this is an option for pip. <br>
<br></div>Inside of /env/ I have four directories: bin include lib local<br></div>bin contains the scripts for khmer and for the virtualenv<br></div>include contains a link to python 2.7<br></div>lib contains a directory of python 2.7<br>
</div>local contains links to the 3 directories above (bin, include, lib)<br><br></div>I cannot find a build/khmer directory anywhere on my system.<br><br></div><div>Thanks again,<br></div><div>Dan<br></div><div><br></div>
P.s. this is a full list of the contents of /env/bin:<br> abundance-dist.py easy_install load-into-counting.py<br>abundance-dist-single.py easy_install-2.7 make-initial-stoptags.py<br>activate extract-paired-reads.py merge-partitions.py<br>
activate.csh extract-partitions.py normalize-by-median.py<br>activate.fish filter-abund.py partition-graph.py<br>activate_this.py filter-abund-single.py pip<br>annotate-partitions.py filter-stoptags.py pip-2.7<br>
count-median.py find-knots.py python<br>count-overlap.py interleave-reads.py sample-reads-randomly.py<br>do-partition.py load-graph.py split-paired-reads.py<br><br>
<div>
<div><div><div><div><div><br></div></div></div></div></div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Jan 28, 2014 at 5:19 PM, Michael R. Crusoe <span dir="ltr"><<a href="mailto:mcrusoe@msu.edu" target="_blank">mcrusoe@msu.edu</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Thank you Dan for your email.<div><br></div><div>pip, the tool we use for installing khmer and its dependencies, has decided to break backwards compatibility with their version 1.5 release.</div>
<div><br>
</div><div>Looks like you're using an earlier version which dies if you pass it an argument it doesn't understand. I will update the docs to clarify the situation.</div><div><br></div><div>If your pip version is prior to 1.5 then install using `<span style="font-size:13px;font-family:arial,sans-serif">pip install khmer`.</span></div>
<div><span style="font-size:13px;font-family:arial,sans-serif"><br></span></div><div><span style="font-size:13px;font-family:arial,sans-serif">If you're using pip version 1.5 or later then use: `</span><span style="font-size:13px;font-family:arial,sans-serif">pip install --allow-external argparse khmer`</span></div>
<div><span style="font-size:13px;font-family:arial,sans-serif"><br></span></div><div><font color="#000000" face="arial, sans-serif">I will also be updating the instructions on how to run the tests. Right now the easiest way is to pass the '--no-clean' flag and to go into the env/build/khmer directory where you can run `make tests`.</font></div>
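<div>The version-dependent advice above can be sketched as a small shell function. khmer_install_cmd is a hypothetical helper name, not part of pip or khmer; it only prints a command and installs nothing. It encodes only the 1.5 cutoff stated in this thread, so other pip releases may need something different.</div>
<pre>
```shell
# Print the install command suggested in this thread for a given pip version.
# khmer_install_cmd is a hypothetical helper, not part of pip or khmer.
khmer_install_cmd() {
    version=$1
    major=${version%%.*}        # text before the first dot
    rest=${version#*.}          # text after the first dot
    minor=${rest%%.*}           # text before the next dot, if any
    minor=${minor%%[!0-9]*}     # drop suffixes such as "rc1"
    key=$((major * 1000 + ${minor:-0}))
    if [ "$key" -ge 1005 ]; then
        # pip 1.5 or later, per the thread
        echo "pip install --allow-external argparse khmer"
    else
        # pip earlier than 1.5 rejects the unknown flag
        echo "pip install khmer"
    fi
}

khmer_install_cmd 1.1    # prints: pip install khmer
khmer_install_cmd 1.5    # prints: pip install --allow-external argparse khmer

# To use the installed pip's own version (second field of "pip --version"):
#   khmer_install_cmd "$(pip --version | cut -d' ' -f2)"
```
</pre>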
<div><font color="#000000" face="arial, sans-serif"><br></font></div><div><font color="#000000" face="arial, sans-serif"><br></font></div><div class="gmail_extra"><br><br><div class="gmail_quote"><div>On Tue, Jan 28, 2014 at 3:26 PM, Daniel Burkhardt <span dir="ltr"><<a href="mailto:burkhardt.d.b@gmail.com" target="_blank">burkhardt.d.b@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hello,<br>
<br>
I wasn't sure if I should email this address or the help listserv, but I figured it would be better to avoid emailing a bunch of people with a potentially novice question. My name is Daniel Burkhardt and I'm a research assistant fellow in the lab of Dr. Kristen DeAngelis at the University of Massachusetts, Amherst.<br>
<br>
The short question:<br>
<br>
I followed the instructions on the <a href="http://khmer.readthedocs.org/en/latest/install.html" target="_blank">http://khmer.readthedocs.org/en/latest/install.html</a> page, but I have no env/src directory. So I can't do the last step that says to run the tests under src/tests. I also do not have any of the sample files to try. I've done a system-wide search, but have no files with the pattern "stamps-reads" anywhere. My question is: what did I do wrong?<br>
<br>
Some background:<br></blockquote><div> </div></div><div>&lt;snip for privacy&gt;</div><div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I am running Ubuntu 12.04. I followed the instructions on the Read the Docs page up through:<br>
<br>
pip install --allow-external argparse khmer<br>
<br>
<br>
but I can't find a src directory under my env directory. Also it seems that the --allow-external flag is outdated, and is no longer a valid option for pip. Attached is a .log file of pip's output. All of the scripts seem to have installed correctly, but I can't figure out why I have no tests directory or any sample files. Is this problem encountered often? I couldn't find any similar problem reports on the listserv archives. Did I miss something rudimentary? I would appreciate any insight you can offer.<br>
<br>
Thank you for your time,<br>
Dan<br>
</blockquote></div></div><span><font color="#888888"><br><br clear="all"><div><br></div>-- <br><div dir="ltr"><span style="font-family:'courier new',monospace;font-size:small">Michael R. Crusoe: Programmer & Bioinformatician </span><a href="mailto:mcrusoe@msu.edu" style="color:rgb(17,85,204);font-family:'courier new',monospace;font-size:small" target="_blank">mcrusoe@msu.edu</a><br style="font-family:'courier new',monospace;font-size:small">
<span style="font-family:'courier new',monospace;font-size:small"> @ the Genomics, Evolution, and Development lab; Michigan State U</span><br style="font-family:'courier new',monospace;font-size:small"><a href="http://ged.msu.edu/" style="color:rgb(17,85,204);font-family:'courier new',monospace;font-size:small" target="_blank">http://ged.msu.edu/</a><span style="font-family:'courier new',monospace;font-size:small"> </span><a href="http://orcid.org/0000-0002-2961-9670" style="color:rgb(17,85,204);font-family:'courier new',monospace;font-size:small" target="_blank">http://orcid.org/0000-0002-2961-9670</a><span style="font-family:'courier new',monospace;font-size:small"> </span><a href="http://twitter.com/biocrusoe" style="color:rgb(17,85,204);font-family:'courier new',monospace;font-size:small" target="_blank">@biocrusoe</a><br>
</div>
</font></span></div></div>
</blockquote></div><br><br clear="all"><br>-- <br>Daniel Burkhardt<br>University of Massachusetts Amherst - Microbiology<br>Cell: <a href="tel:%28978%29-846-2197" value="+19788462197" target="_blank">(978)-846-2197</a><br>
</div>
</blockquote></div>
</div></div></blockquote></div><br>
</div>
</blockquote></div>
</div></div></blockquote></div><br>
</div>
</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div dir="ltr"><span style="font-family:'courier new',monospace;font-size:small">Michael R. Crusoe: Programmer & Bioinformatician </span><a href="mailto:mcrusoe@msu.edu" style="color:rgb(17,85,204);font-family:'courier new',monospace;font-size:small" target="_blank">mcrusoe@msu.edu</a><br style="font-family:'courier new',monospace;font-size:small">
<span style="font-family:'courier new',monospace;font-size:small"> @ the Genomics, Evolution, and Development lab; Michigan State U</span><br style="font-family:'courier new',monospace;font-size:small"><a href="http://ged.msu.edu/" style="color:rgb(17,85,204);font-family:'courier new',monospace;font-size:small" target="_blank">http://ged.msu.edu/</a><span style="font-family:'courier new',monospace;font-size:small"> </span><a href="http://orcid.org/0000-0002-2961-9670" style="color:rgb(17,85,204);font-family:'courier new',monospace;font-size:small" target="_blank">http://orcid.org/0000-0002-2961-9670</a><span style="font-family:'courier new',monospace;font-size:small"> </span><a href="http://twitter.com/biocrusoe" style="color:rgb(17,85,204);font-family:'courier new',monospace;font-size:small" target="_blank">@biocrusoe</a><br>
</div>
</div>