[khmer] parallelizing reading

Peio Ziarsolo pziarsolo at upv.es
Mon Jul 29 05:40:13 PDT 2013


Thanks Titus
p.

On Mon, 29 Jul 2013 at 14:19, C. Titus Brown wrote:
> On Mon, Jul 29, 2013 at 05:10:45AM -0700, C. Titus Brown wrote:
>> On Mon, Jul 29, 2013 at 10:57:06AM +0200, Peio Ziarsolo wrote:
>>> I have seen that ReadParser can be parallelized and I would like to use
>>> it to parallelize the reading of sequence files.
>>>
>>> I am trying to use it but I don't know how. I have made a small script
>>> to test the function:
>>>
>>> from khmer import ReadParser
>>> for i in ReadParser('/home/peio/work_in/bug_parallel/big.fastq', 2):
>>>      print i.name
>>>
>>> But I am not able to make it finish. If I use just one thread it
>>> finishes as it should.
>>>
>>> What am I doing wrong? I am using the bleeding-edge branch.
>>>
>>> Thanks in advance
>>> Peio Ziarsolo
>> Hi Peio,
>>
>> there is some example code referenced in here --
>>
>> http://ivory.idyll.org/blog/multithreaded-read-parsing-in-khmer.html
>>
>> that should work.  Just grab the code from here,
>>
>> https://gist.github.com/ctb/5328016
>>
>> and go to town.  Briefly, you need to manage your own threading in Python,
>> but when you do, ReadParser will support it.
> p.s. You can also look at many of the scripts distributed with khmer,
> e.g. load-into-counting, but they're more complicated than that test
> script.
>
> https://github.com/ged-lab/khmer/blob/bleeding-edge/scripts/load-into-counting.py
>
> cheers,
> --titus
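
For reference, the "manage your own threading in Python" advice above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the gist: it assumes the parser object can be iterated safely from several threads at once, and it uses a hypothetical StandInParser class in place of khmer's ReadParser so that it runs without khmer installed. The names StandInParser, collect_names and read_in_parallel are made up for this sketch.

```python
import threading


class StandInParser(object):
    """Hypothetical stand-in for a thread-safe parser (NOT khmer's ReadParser)."""

    def __init__(self, names):
        self._it = iter(names)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Hand out one item at a time; the lock keeps threads from colliding.
        with self._lock:
            return next(self._it)

    next = __next__  # Python 2 spelling of the iterator protocol


def collect_names(parser, out, out_lock):
    # Each worker thread pulls items from the one shared parser.
    for name in parser:
        with out_lock:
            out.append(name)


def read_in_parallel(parser, n_threads):
    out = []
    out_lock = threading.Lock()
    threads = [
        threading.Thread(target=collect_names, args=(parser, out, out_lock))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every worker has drained the parser
    return out


names = read_in_parallel(StandInParser(["read%d" % i for i in range(1000)]), 2)
print(len(names))
```

The point of the pattern is that one parser object is shared by several Python threads that the caller starts and joins. With khmer, the stand-in would presumably be replaced by the ReadParser(path, 2) object from the original script, with the same number of worker threads consuming it. That pairing is an assumption here; the gist linked above is the authoritative example.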




