[khmer] Khmer install

Michael R. Crusoe mcrusoe at msu.edu
Tue Feb 18 11:12:39 PST 2014


[ + including khmer at lists.idyll.org to share the knowledge ]

Hey Daniel,

You're right, this is a known issue. You'll have to rename or softlink your
7123.6.62157.TATAAT.adnq.fastq.keep file to have a ".fastq" or ".fq"
extension.
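
A minimal sketch of the softlink route in Python, in case that is handier than `ln -s` inside a pipeline. The helper name `link_with_fastq_suffix` is made up for this example and is not part of khmer; it only uses the standard library's `os.symlink` and leaves the original .keep file untouched.

```python
import os


def link_with_fastq_suffix(path, suffix=".fastq"):
    """Create a symlink next to `path` whose name ends in `suffix`.

    Hypothetical helper for illustration; not a khmer function.
    Returns the path to use as input (the new link, or `path` itself
    if it already ends in ".fastq" or ".fq").
    """
    if path.endswith((".fastq", ".fq")):
        return path  # already has an accepted extension
    link = path + suffix
    if not os.path.lexists(link):
        # Point at the absolute target so the link resolves from any directory.
        os.symlink(os.path.abspath(path), link)
    return link
```

Calling `link_with_fastq_suffix("7123.6.62157.TATAAT.adnq.fastq.keep")` would create `7123.6.62157.TATAAT.adnq.fastq.keep.fastq`, which can then be passed to abundance-dist.py in place of the .keep file.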

Our apologies for the error.
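
For anyone who wants to confirm that a .keep file really holds FASTQ records before renaming it, here is a rough Python check. It relies only on the usual convention that FASTA headers start with '>' and FASTQ headers start with '@'. The function name `guess_read_format` is an assumption for this sketch and does not mirror khmer's own parser.

```python
def guess_read_format(path):
    """Guess 'fasta' or 'fastq' from the first non-blank line of a file.

    Illustrative only; khmer's reader does its own detection.
    """
    with open(path) as handle:
        for line in handle:
            if not line.strip():
                continue  # skip leading blank lines
            if line.startswith(">"):
                return "fasta"
            if line.startswith("@"):
                return "fastq"
            break  # first real line is neither kind of header
    return "unknown"
```

The first record in the .keep file quoted below starts with `@HISEQ05:...`, so this check would report "fastq", which suggests the data is fine and only the extension is in the way.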


On Wed, Feb 12, 2014 at 4:42 PM, Daniel Burkhardt
<burkhardt.d.b at gmail.com> wrote:

> Hello again Michael,
>
> So I've gotten through the Khmer tutorial and I'm on to normalizing my
> data. After my first round of digital normalization, I'm getting an error
> when I generate a histogram of the kmers. I think it's a header issue, and
> I know there's a note about this on the website where Khmer doesn't like
> fastq files.
>
> Here's the error
>
>
> (env)dan@computobacter:~/khmer/QCtest$ python /home/dan/khmer/env/bin/abundance-dist.py PHH6-dn.kh 7123.6.62157.TATAAT.adnq.fastq.keep PHH6-dn.hist
> hashtable from PHH6-dn.kh
> K: 20
> HT sizes: [16000000039L, 16000000067L, 16000000091L, 16000000097L]
> outputting to PHH6-dn.hist
> preparing hist...
> terminate called after throwing an instance of 'khmer::read_parsers::InvalidFASTAFileFormat'
>   what():  InvalidFASTAFileFormat: invalid sequence name indicator: @HISEQ05:329:C22F1ACXX:6:1101:1307:2242
> Aborted (core dumped)
>
>
> In the file 7123.6. ... .fastq.keep the header is only marginally
> different than in the original fastq file. The name indicator refers to the
> first sequence in the dataset
>
> In the original .fastq
>
> @HISEQ05:329:C22F1ACXX:6:1101:1307:2242 1:N:0:TATAAT
> ACCTCGTCTGCCTGGCGCTGCGTGACGATTACCACTGCGTCGCGCTCGACCAGCGCGGCCACGGCGACAGCGACTGGTCGCACGACGCCGACTACACAATGGGCGCGCAGCTCGCCGACACGAAGGGATTTGGCGACCATCTCCGCCTCG
> +
> =@?BDBDDHFHHBH@D?FFEDG8CGHDA@FFGDGGHD@@7;FGHEEC>B@BBBBBB5@BBBB7005;>B>BBBB39A(:5<>BBBB59@B>@5>CCC3<(4@+2050<>><>CB@B##################################
>
>
> In the .keep file
>
> @HISEQ05:329:C22F1ACXX:6:1101:1307:2242
> ACCTCGTCTGCCTGGCGCTGCGTGACGATTACCACTGCGTCGCGCTCGACCAGCGCGGCCACGGCGACAGCGACTGGTCGCACGACGCCGACTACACAATGGGCGCGCAGCTCGCCGACACGAAGGGATTTGGCGACCATCTCCGCCTCG
> +
> =@?BDBDDHFHHBH@D?FFEDG8CGHDA@FFGDGGHD@@7;FGHEEC>B@BBBBBB5@BBBB7005;>B>BBBB39A(:5<>BBBB59@B>@5>CCC3<(4@+2050<>><>CB@B##################################
>
> The only difference seems to be the "1:N:0:TATAAT" section of the header. In
> the .keep file, none of the reads have this part of the header. As far as I
> can tell, this section tells assemblers which paired end of the sample the
> read belongs to.
>
> Why is Khmer throwing an invalid fasta format error? What can I do to fix
> this?
>
> Thanks,
> Dan
>
>
>
> On Thu, Jan 30, 2014 at 9:19 AM, Michael R. Crusoe <mcrusoe at msu.edu> wrote:
>
>> A git clone will indeed work. I will find a solution for other Ubuntu
>> 12.04 users who are using the version that canonical ships.
>>
>> FYI, you can upgrade pip for a single user with the following command:
>>
>> pip install --upgrade --user pip
>>
>> Cheers,
>> On Jan 30, 2014 8:16 AM, "Daniel Burkhardt" <burkhardt.d.b at gmail.com>
>> wrote:
>>
>>> So unfortunately there is no --no-clean option for pip version 1.1.
>>> However, it seems to me that the only issue is that the example files are
>>> in a src directory that pip is "cleaning up" as part of the installation. I
>>> have all the scripts that I should need for the tutorial, just not the
>>> sample data set. I think I might just grab the data directory from your git
>>> page and try to run things from there. This sounds like it should work,
>>> yeah?
>>>
>>> Thanks,
>>> Dan
>>>
>>>
>>> On Wed, Jan 29, 2014 at 10:23 AM, Michael R. Crusoe <mcrusoe at msu.edu> wrote:
>>>
>>>> Yep, the command would be
>>>>
>>>> pip install --no-clean khmer
>>>>
>>>> You might need to use a fresh virtualenv.
>>>>
>>>> There is a chance that your version of pip doesn't have that option. If
>>>> so I will find another way.
>>>>
>>>> Thank you for your patience.
>>>> On Jan 29, 2014 9:15 AM, "Daniel Burkhardt" <burkhardt.d.b at gmail.com>
>>>> wrote:
>>>>
>>>>> Michael,
>>>>>
>>>>> Thank you for your quick response. Inside my virtualenv, I have pip
>>>>> version 1.1. When you suggest adding the --no-clean flag, are you referring
>>>>> to when I install khmer? It doesn't look like this is an option for pip.
>>>>>
>>>>> Inside of /env/ I have four directories: bin  include  lib  local
>>>>> bin contains the scripts for khmer and for the virtualenv
>>>>> include contains a link to python 2.7
>>>>> lib contains a directory of python 2.7
>>>>> local contains links to the 3 directories above (bin, include, lib)
>>>>>
>>>>> I cannot find a build/khmer directory anywhere on my system.
>>>>>
>>>>> Thanks again,
>>>>> Dan
>>>>>
>>>>> P.s. this is a full list of the contents of /env/bin:
>>>>> abundance-dist.py         easy_install             load-into-counting.py
>>>>> abundance-dist-single.py  easy_install-2.7         make-initial-stoptags.py
>>>>> activate                  extract-paired-reads.py  merge-partitions.py
>>>>> activate.csh              extract-partitions.py    normalize-by-median.py
>>>>> activate.fish             filter-abund.py          partition-graph.py
>>>>> activate_this.py          filter-abund-single.py   pip
>>>>> annotate-partitions.py    filter-stoptags.py       pip-2.7
>>>>> count-median.py           find-knots.py            python
>>>>> count-overlap.py          interleave-reads.py      sample-reads-randomly.py
>>>>> do-partition.py           load-graph.py            split-paired-reads.py
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Jan 28, 2014 at 5:19 PM, Michael R. Crusoe <mcrusoe at msu.edu> wrote:
>>>>>
>>>>>> Thank you Dan for your email.
>>>>>>
>>>>>> pip, the tool we use for installing khmer and its dependencies, has
>>>>>> decided to break backwards compatibility with their version 1.5
>>>>>> release.
>>>>>>
>>>>>> Looks like you're using an earlier version which dies if you pass it
>>>>>> an argument it doesn't understand. I will update the docs to clarify the
>>>>>> situation.
>>>>>>
>>>>>> If your pip version is prior to 1.5 then install using `pip install
>>>>>> khmer`.
>>>>>>
>>>>>> If you're using pip version 1.5 or later then use: `pip install
>>>>>> --allow-external argparse khmer`
>>>>>>
>>>>>> I will also be updating the instructions on how to run the tests.
>>>>>> Right now the easiest way is to pass the '--no-clean' flag and to go into
>>>>>> the env/build/khmer directory where you can run `make tests`.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Jan 28, 2014 at 3:26 PM, Daniel Burkhardt <
>>>>>> burkhardt.d.b at gmail.com> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I wasn't sure if I should email this address or the help listserv,
>>>>>>> but I figured it would be better to avoid emailing a bunch of people with a
>>>>>>> potentially novice question. My name is Daniel Burkhardt and I'm a research
>>>>>>> assistant fellow in the lab of Dr. Kristen DeAngelis at the University of
>>>>>>> Massachusetts, Amherst.
>>>>>>>
>>>>>>> The short question:
>>>>>>>
>>>>>>> I followed the instructions on the
>>>>>>> http://khmer.readthedocs.org/en/latest/install.html page, but I have no
>>>>>>> env/src directory. So I can't do the last step that says to run the
>>>>>>> tests under src/tests. I also do not have any of the sample files to
>>>>>>> try. I've done a system-wide search, but have no files with the pattern
>>>>>>> "stamps-reads" anywhere. My question is: what did I do wrong?
>>>>>>>
>>>>>>> Some background:
>>>>>>>
>>>>>>
>>>>>> <snip for privacy>
>>>>>>
>>>>>>
>>>>>>> I am running Ubuntu 12.04. I followed the instructions on the Read
>>>>>>> the Docs page up through:
>>>>>>>
>>>>>>> pip install --allow-external argparse khmer
>>>>>>>
>>>>>>>
>>>>>>> but I can't find a src directory under my env directory. Also it
>>>>>>> seems that the --allow-external flag is outdated, and is no longer a valid
>>>>>>> option for pip. Attached is a .log file of pip's output. All of the scripts
>>>>>>> seem to have installed correctly, but I can't figure out why I have no
>>>>>>> tests directory or any sample files. Is this problem encountered often? I
>>>>>>> couldn't find any similar problem reports on the listserv archives. Did I
>>>>>>> miss something rudimentary? I would appreciate any insight you can offer.
>>>>>>>
>>>>>>> Thank you for your time,
>>>>>>> Dan
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Michael R. Crusoe:  Programmer & Bioinformatician   mcrusoe at msu.edu
>>>>>>  @ the Genomics, Evolution, and Development lab; Michigan State U
>>>>>> http://ged.msu.edu/ http://orcid.org/0000-0002-2961-9670 @biocrusoe<http://twitter.com/biocrusoe>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Daniel Burkhardt
>>>>> University of Massachusetts Amherst - Microbiology
>>>>> Cell: (978)-846-2197



-- 
Michael R. Crusoe:  Programmer & Bioinformatician   mcrusoe at msu.edu
 @ the Genomics, Evolution, and Development lab; Michigan State U
http://ged.msu.edu/ http://orcid.org/0000-0002-2961-9670
@biocrusoe<http://twitter.com/biocrusoe>