[khmer] How to speed up the filter-below-abund script ?

Alexis Groppi alexis.groppi at u-bordeaux2.fr
Tue Mar 12 08:32:44 PDT 2013


On 12/03/2013 16:16, C. Titus Brown wrote:
> On Tue, Mar 12, 2013 at 04:15:05PM +0100, Alexis Groppi wrote:
>> Hi Titus,
>>
>> Thanks for your answer.
>> Actually, this is my second attempt with filter-below-abund.
>> The first time, I thought the problem came from the location of my
>> table.kh file: it was on a storage element with poor I/O performance.
>> I killed the job after 24 h, moved the file to a better location, and re-ran it,
>> but with the same result: no completion after 24 h.
>>
>> Any ideas?
>>
>> Thanks
>>
>> Cheers From Bordeaux :)
>>
>> Alexis
>>
>> PS: The command line was the following:
>>
>> ./filter-below-abund.py 174r1_table.kh 174r1_prinseq_good_bFr8.fasta.keep
>>
>> Is this correct?
> Yes, looks right... Can you try with the bleeding-edge branch, which now
> incorporates a potential fix for this issue?
From here: https://github.com/ged-lab/khmer/tree/bleeding-edge
or
here: https://github.com/ctb/khmer/tree/bleeding-edge ?

Do I have to do a fresh install, and if so, how?
Or can I just replace all the files and folders?

Thanks :)

Alexis

>
> thanks,
> --titus
>
>> On 12/03/2013 14:41, C. Titus Brown wrote:
>>> On Tue, Mar 12, 2013 at 10:48:03AM +0100, Alexis Groppi wrote:
>>>> Metagenome assembly:
>>>> My data:
>>>> - original (quality-filtered) data: 4463243 reads (75 nt) (Illumina)
>>>> 1/ Single-pass digital normalization with normalize-by-median (C=20)
>>>> ==> .keep file of 2560557 reads
>>>> 2/ Generated a hash table with load-into-counting on the .keep file
>>>> ==> .kh file of ~16 GB (huge file?!)
>>>> 3/ filter-below-abund with C=100 on the two previous files (table.kh
>>>> and reads.keep)
>>>> Still running after 24 hours :(
>>>>
>>>> Any advice on speeding up this step, and the others (partitioning, ...)?
>>>>
>>>> I can get access to an HPC cluster: ~3000 cores.
>>> Hi Alexis,
>>>
>>> filter-below-abund and filter-abund have occasional bugs that prevent them
>>> from completing.  I would kill and restart.  For that few reads it should
>>> take no more than a few hours to do everything.
>>>
>>> Note that most of what khmer does cannot easily be distributed across
>>> multiple chassis.
>>>
>>> best,
>>> --titus
>> -- 
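
Since the advice in the thread is to kill and restart a run that does not complete, one way to avoid waiting a full 24 h is to launch the script with a time limit. What follows is only a minimal Python sketch, not part of khmer: the helper names `build_command` and `run_with_timeout` and the 6-hour budget are assumptions made for illustration; only the script name and the two file names come from the command line quoted above.

```python
import subprocess
import sys


def build_command(table, reads, script="./filter-below-abund.py"):
    """Return the argument list for the command line quoted in the thread."""
    return [script, table, reads]


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its exit code, or None if it ran longer than timeout_s seconds."""
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child process before raising TimeoutExpired.
        return None


# Hypothetical usage (6-hour budget is an arbitrary choice, not a khmer recommendation):
#   cmd = build_command("174r1_table.kh", "174r1_prinseq_good_bFr8.fasta.keep")
#   if run_with_timeout(cmd, timeout_s=6 * 3600) is None:
#       print("timed out; job was killed", file=sys.stderr)
```

A limit of a few hours matches the estimate given in the reply above for this number of reads; a run that exceeds it is more likely stuck than slow.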
