<div dir="ltr">Hi Alexis and <span style="font-family:arial,sans-serif;font-size:13px">Louise-Amélie,</span><div><span style="font-family:arial,sans-serif;font-size:13px"><br></span></div><div style><span style="font-family:arial,sans-serif;font-size:13px">Thank you both for the information. I am trying to reproduce your problem with a large data set right now.</span></div>
<div style><span style="font-family:arial,sans-serif;font-size:13px">I agree that the problem may be a function of the amount of data. However, if you were running out of memory, then I would expect to see a segmentation fault rather than an FPE. I am still guessing this problem may be threading-related (even if the number of workers is reduced to 1, there is still the master thread which supplies the groups of sequences and the writer thread which outputs the kept sequences). But my guesses have not proved to be that useful with your problem thus far, so take my latest guess with a grain of salt. :-)</span></div>
<div style><span style="font-family:arial,sans-serif;font-size:13px"><br></span></div><div style><span style="font-family:arial,sans-serif;font-size:13px">Depending on whether I am able to reproduce the problem, I have some more ideas which I intend to try tomorrow. If you find anything else interesting, I would like to know. But, I feel bad about how much time you have wasted on this. Hopefully I will be able to reproduce the problem....</span></div>
<div style><span style="font-family:arial,sans-serif;font-size:13px"><br></span></div><div style><span style="font-family:arial,sans-serif;font-size:13px">Thanks,</span></div><div style><span style="font-family:arial,sans-serif;font-size:13px"> Eric</span></div>
<div style><span style="font-family:arial,sans-serif;font-size:13px"><br></span></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Mar 14, 2013 at 1:32 PM, Alexis Groppi <span dir="ltr"><<a href="mailto:alexis.groppi@u-bordeaux2.fr" target="_blank">alexis.groppi@u-bordeaux2.fr</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
Hi Eric,<br>
<br>
    Interesting result:<br>
    To test on a computer with Python 2.7, we made small subsamples
    (attached here).<br>
    We first generated sample.kh (~2 GB, too big to be sent by email)
    with load-into-counting.py<br>
    and then ran filter-below-abund.py on sample.kh.<br>
    It works!<br>
    <br>
    Then I tried again on the machine with Python 2.6 (the same one as before)
    and it works too! :)<br>
<br>
==> problem related to the amount of data ?<br>
<br>
Other ideas ?<br>
<br>
Thanks again<br>
<br>
Alexis<br>
<br>
<br>
<br>
<div>Le 14/03/2013 17:29, Eric McDonald a
écrit :<br>
</div><div><div class="h5">
<blockquote type="cite">
<div dir="ltr">Thank you, Alexis.
<div><br>
</div>
<div>This is certainly a very interesting problem. :-/
</div>
<div><br>
</div>
<div>If you have the opportunity, could you try the
latest Python 2.7? I see that you are using Python 2.6.6. I
briefly looked in the Python bugs database and didn't see any
evidence that a newer version would help, but I would like to
try excluding the possibility of a Python bug. </div>
<div><br>
</div>
<div>If a newer Python does not help, then I am going
to assume a subtle bug in 'khmer' and will try harder to
reproduce it on my end. Again, I really appreciate your great
cooperation; thank you for running so many experiments.</div>
<div><br>
</div>
<div>Eric</div>
<div><br>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Thu, Mar 14, 2013 at 11:00 AM,
Alexis Groppi <span dir="ltr"><<a href="mailto:alexis.groppi@u-bordeaux2.fr" target="_blank">alexis.groppi@u-bordeaux2.fr</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> Hi Eric,<br>
<br>
                Here are the bash script and the output/error file after the latest
                modifications we made on your advice<br>
<br>
Still crashing :/<br>
<br>
Alexis<br>
<br>
<div>Le 14/03/2013 13:56, Eric McDonald a écrit :<br>
</div>
<div>
<div>
<blockquote type="cite">
<div dir="ltr">Alexis,
<div><br>
</div>
<div>Sorry, I didn't mean to imply anything bad
about David. As someone who has previously
worked as a HPC systems administrator, I know
that I have felt annoyed when users fill the
wrong file system. So, if he was annoyed, then I
understand.</div>
<div><br>
</div>
<div>I just realized that we didn't check
something very basic.... What is your PYTHONPATH
environment variable set to? What is the result,
if you add:</div>
<div> echo $PYTHONPATH</div>
<div>before the commands in the script?</div>
<div><br>
</div>
<div>Also, did you install 'khmer' into your
virtualenv (i.e., did you do "python setup.py
install" at some point after you had done ".
/mnt/var/home/ag/env/bin/activate")? If so, then
we have likely been using the wrong 'khmer'
modules this whole time... To verify this, what
does the following command tell you:</div>
<div> python -c "import khmer.thread_utils as tu;
print tu.__file__"</div>
<div>We want to use the Python modules under your
'khmer-BETA/python' directory and not the ones
under your
'/mnt/var/home/ag/env/lib/python2.7/site-packages'
directory. I should have asked you to check this
much earlier in the debugging process,
especially since I was helping someone else with
a similar issue.</div>
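                        <div>A minimal sketch of the same check run from a script, assuming Python 3. The helper name module_origin is hypothetical, and the standard-library 'json' module is only a stand-in; for this thread you would pass 'khmer.thread_utils' instead.</div>

```python
import importlib
import sys


def module_origin(name):
    """Return the file a module was loaded from, or None for built-ins."""
    module = importlib.import_module(name)
    return getattr(module, "__file__", None)


if __name__ == "__main__":
    # 'json' is only a stand-in for the module under suspicion.
    print("json loaded from:", module_origin("json"))
    # Normally the first sys.path entry that provides a module wins,
    # so the order of these entries matters.
    for entry in sys.path:
        print("search path entry:", entry)
```

                        <div>If the reported file sits under a site-packages directory rather than the source checkout, the installed copy is the one being imported.</div>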
<div> </div>
<div class="gmail_extra">Thank you,</div>
<div class="gmail_extra"> Eric<br>
<br>
<div class="gmail_quote">On Thu, Mar 14, 2013 at
7:28 AM, Alexis Groppi <span dir="ltr"><<a href="mailto:alexis.groppi@u-bordeaux2.fr" target="_blank">alexis.groppi@u-bordeaux2.fr</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> Hi
Eric,<br>
<br>
<br>
<div>Le 14/03/2013 11:47, Eric McDonald a
écrit :<br>
</div>
<blockquote type="cite">
<div dir="ltr">Hi Alexis,
<div><br>
</div>
<div>
<div>The 'coredump' file comes from
a standard Unix feature - it is
simply the image of the Python
process as it was in memory at the
time of crash. This is nothing
that any 'khmer' script produces
explicitly. This is made by your
operating system.</div>
<div><br>
</div>
<div>You should be able to disable
the core dumps if your systems
engineer is getting upset about
the space they are using. Please
add the following before other
commands in your job script:</div>
<div> ulimit -c 0</div>
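                          <div>For reference, the same limit can also be lowered from inside a Python process. This is a sketch for Unix systems only, assuming Python 3; the function name disable_core_dumps is hypothetical. It changes the limit for the current process and any children it starts, and whether a system-wide crash handler honors that limit is outside the scope of this sketch.</div>

```python
import resource


def disable_core_dumps():
    """Set the soft core-file size limit to zero for this process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # Lowering the soft limit needs no extra privileges; the hard
    # limit is left untouched so it can still be raised again later.
    resource.setrlimit(resource.RLIMIT_CORE, (0, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    print("core limit (soft, hard):", disable_core_dumps())
```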
</div>
</div>
</blockquote>
                        Done, but for some mysterious reason, it is
                        not taken into account...<br>
But David is a cool guy ;)
<div><br>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div>(Note: we could actually use
the core dumps for debugging. I
refrained from suggesting this to
you yesterday, since describing
the process can be somewhat
complicated.)</div>
<div><br>
</div>
<div>Anyway, thanks for rerunning
with the diagnostics I suggested.
The exit code is 136, which is
what you get if a process
experiences a floating-point
exception. If the exception had
occurred within
'filter-below-abund.py' proper,
the exit code would've been 1
rather than 136. This is what I
wanted to double-check.</div>
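                              <div>To make the arithmetic explicit: shells conventionally report 128 plus the signal number when a child is killed by a signal, and SIGFPE is signal 8 on Linux, which gives 136. The sketch below decodes such a status, assuming Python 3.5 or later; the function name describe_exit_code is hypothetical. A program can also exit deliberately with a status above 128, so the result is a hint rather than proof.</div>

```python
import signal


def describe_exit_code(code):
    """Decode a shell-style exit status into a readable description."""
    # Shells report "128 + signal number" for a child killed by a signal.
    signum = code - 128
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # Not a known signal number, so treat it as a plain exit status.
        return "exit status %d (not a signal)" % code
    return "killed by %s (signal %d)" % (name, signum)


if __name__ == "__main__":
    for status in (1, 136):
        print(status, "means", describe_exit_code(status))
```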
<div><br>
</div>
<div>I suspect that something bad is
happening within the Python
interpreter. This may be due to
some subtle bug involving Python's
global interpreter lock and
'khmer' not doing something proper
with regards to that. I will
attempt to analyze the problem in
more detail later today.</div>
<div><br>
</div>
<div>I know you must be getting
tired of working on this problem</div>
</div>
</blockquote>
<br>
</div>
Not at all. Thanks for your work !
<div><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>, but if you want to try one
more thing (for now), then it
would be appreciated. Could you
edit your copy of
'filter-below-abund.py' and
change:</div>
<div> WORKER_THREADS = 8</div>
<div>to:</div>
<div> WORKER_THREADS = 1</div>
<div>and see if that helps?</div>
</div>
</blockquote>
</div>
Done also... but unfortunately same result
:(<br>
<br>
<br>
Alexis
<div>
<div><br>
<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div>Thanks,</div>
<div> Eric</div>
<div><br>
</div>
<div>P.S. If you want to reduce
memory usage, you can decrease
the Bloom filter size by
adjusting the "-x" parameter
that you use in the scripts to
create your .kh files. Making
this number smaller will reduce
memory usage but will also
increase the false positive
rate, so be careful about tuning
this too much.</div>
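                                <div>The trade-off can be estimated with a common approximation, assuming each of the N tables behaves like an independent hash table with a single hash function: the expected occupancy of one table of size x holding n distinct k-mers is 1 - exp(-n/x), and a false positive needs a collision in every table, so the rate is roughly that occupancy raised to the power N. The function name and the numbers below are illustrative assumptions, not values from this thread.</div>

```python
import math


def estimated_false_positive_rate(distinct_kmers, table_size, n_tables):
    """Approximate false positive rate for n_tables tables of table_size slots."""
    # Expected fraction of occupied slots in one table.
    occupancy = 1.0 - math.exp(-float(distinct_kmers) / table_size)
    # A false positive needs a collision in every table.
    return occupancy ** n_tables


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from this thread.
    for table_size in (4e9, 2e9, 1e9):
        rate = estimated_false_positive_rate(1e9, table_size, 4)
        print("table size %.0e: estimated rate %.4f" % (table_size, rate))
```

                                <div>Halving the table size halves the memory used by the tables but raises the estimated rate, which is why the parameter should be lowered cautiously.</div>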
<div><br>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Thu,
Mar 14, 2013 at 5:42 AM,
Alexis Groppi <span dir="ltr"><<a href="mailto:alexis.groppi@u-bordeaux2.fr" target="_blank">alexis.groppi@u-bordeaux2.fr</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> Hi
Eric,<br>
<br>
I've tried all the
suggestions you made<br>
But same result (see
attached e/o file)<br>
<br>
But with the help of David
(the system engineer of
the lab) I think we have
found the bug : <br>
==>
filter-below-abund.py
                                      fills a directory (<i><span>/</span>var/spool/abrt/ccpp-2013-03-14-10\:24\:13-26642.new<span>/</span></i>
                                      coredump/) until it
                                      uses all the available
                                      space (see below).<br>
==> Then it crashes<br>
<br>
Is there a way to modify
this ?<br>
<br>
Thanks again<br>
<br>
Alexis<br>
**************************************************<br>
<tt>[root@rainman ~]# ll
                                        -h </tt><tt><i><span>/</span>var/spool/abrt/ccpp-2013-03-14-10\:24\:13-26642.new<span>/</span></i></tt><tt>
</tt><tt><br>
</tt><tt>total 12G </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 4 14 mars
10:24 analyzer </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 6 14 mars
10:24 architecture </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 150 14 mars
10:24 cmdline </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 12G 14 mars
10:24 coredump </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 1,5K 14 mars
10:24 environ </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 31 14 mars
10:24 executable </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 27 14 mars
10:24 hostname </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 26 14 mars
10:24 kernel </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 13K 14 mars
10:24 maps </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 26 14 mars
10:24 os_release </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 71 14 mars
10:24 reason </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 10 14 mars
10:24 time </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 3 14 mars
10:24 uid </tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt> </tt><tt><br>
</tt><tt>[root@rainman ~]#
                                        ll -h </tt><tt><i><span>/</span>var/spool/abrt/ccpp-2013-03-14-10\:24\:13-26642.new<span>/</span></i></tt><tt>
</tt><tt><br>
</tt><tt>total 18G </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 4 14 mars
10:24 analyzer </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 6 14 mars
10:24 architecture </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 150 14 mars
10:24 cmdline </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 18G 14 mars
10:25 coredump </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 1,5K 14 mars
10:24 environ </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 31 14 mars
10:24 executable </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 27 14 mars
10:24 hostname </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 26 14 mars
10:24 kernel </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 13K 14 mars
10:24 maps </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 26 14 mars
10:24 os_release </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 71 14 mars
10:24 reason </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 10 14 mars
10:24 time </tt><tt><br>
</tt><tt>-rw-r----- 1 abrt
users 3 14 mars
10:24 uid </tt><tt><br>
</tt><br>
<br>
<div>Le 13/03/2013 22:58,
Eric McDonald a écrit :<br>
</div>
<div>
<div>
<blockquote type="cite">
<div dir="ltr">Forwarding
my earlier reply
to the list, since
I didn't
reply-to-all
earlier.
<div><br>
</div>
<div>Also, Alexis,
you may wish to
change the
following in
your job script:</div>
<div> #PBS -l
nodes=1:ppn=1</div>
<div>to</div>
<div> #PBS -l
nodes=1:ppn=8</div>
<div>assuming that
you have 8-core
nodes available.
'filter-below-abund.py'
uses 8 threads
by default; if a
'khmer' job runs
on the same node
as another job,
it may try using
more CPU cores
than it was
allocated and
that could
create problems
with your
systems
administrators.
And, if a job's
threads are
restricted to
the requested
number of cores,
then you will
also not be
getting optimal
performance by
using more
threads (8) than
available cores
(1).<br>
<br>
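                                              <div>One way a script could avoid oversubscribing its allocation is to cap the thread count at the number of cores the process may actually use. This is a sketch under assumptions, written for Python 3: the function names are hypothetical, and whether the batch system restricts CPU affinity for a job depends on the site configuration.</div>

```python
import os


def usable_cores():
    """Count the CPU cores this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Linux only: reflects any CPU affinity restriction on the process.
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def pick_worker_count(requested):
    """Cap the requested thread count at the usable cores, minimum one."""
    return max(1, min(requested, usable_cores()))


if __name__ == "__main__":
    print("usable cores:", usable_cores())
    print("workers for a request of 8:", pick_worker_count(8))
```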
<div class="gmail_quote">----------
Forwarded
message
----------<br>
From: <b class="gmail_sendername">Eric
McDonald</b> <span dir="ltr"><<a href="mailto:emcd.msu@gmail.com" target="_blank">emcd.msu@gmail.com</a>></span><br>
Date: Wed, Mar
13, 2013 at
3:12 PM<br>
Subject: Re:
[khmer] How to
speed up the
filter-below-abund
script ?<br>
To: <a href="mailto:alexis.groppi@u-bordeaux2.fr" target="_blank">alexis.groppi@u-bordeaux2.fr</a><br>
<br>
<br>
<div dir="ltr">Alexis,
<div><br>
</div>
<div>I just
realized that
the
floating-point
exception is
from inside
the Python
interpreter
itself. If the
floating-point
exception had
appeared from
within the
'filter-below-abund.py'
script, then
                                                      we should have
seen a
traceback from
the exception,
ending with:</div>
<div> ZeroDivisionError:
float division
by zero</div>
<div>Instead,
we are seeing:</div>
<div>
<div> <span style="font-family:monospace">line
49: 54757
Floating point
exception(core
dumped)</span></div>
</div>
<div> from
your job
shell. (I
should've
noticed that
earlier.)</div>
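                                                    <div>The difference can be reproduced in isolation. The sketch below, assuming Python 3, runs a one-line division by zero in a child interpreter: the error surfaces as an ordinary Python exception with a traceback on stderr and an exit code of 1, not as a signal. The helper name run_snippet is hypothetical.</div>

```python
import subprocess
import sys


def run_snippet(code):
    """Run a Python snippet in a child interpreter; return (exit code, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stderr


if __name__ == "__main__":
    returncode, stderr = run_snippet("print(1.0 / 0.0)")
    print("exit code:", returncode)
    # The last line of the traceback names the exception type.
    print("last traceback line:", stderr.strip().splitlines()[-1])
```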
<div><br>
<div class="gmail_extra">Would
you please add
the following
lines to your
job script
somewhere
before you
invoke
'filter-below-abund.py':</div>
<div class="gmail_extra">
python
--version</div>
<div class="gmail_extra">
which python</div>
<div class="gmail_extra"><br>
</div>
<div class="gmail_extra">And
would you
please add the
following line
_immediately
after_ you
invoke
'filter-below-abund.py':</div>
<div class="gmail_extra">
echo "Exit
Code: $?"</div>
<div class="gmail_extra"><br>
</div>
<div class="gmail_extra">Also,
would you
remove the
'time' command
from in front
of your
invocation of
'filter-below-abund.py'?</div>
<div class="gmail_extra"><br>
</div>
<div class="gmail_extra">
And, one more
action before
trying
again...
please run:</div>
<div class="gmail_extra">
git pull</div>
<div class="gmail_extra">in
your
'khmer-BETA'
directory. (I
added another
possible fix
to the
'bleeding-edge'
branch. This
command will
pull that fix
into your
clone.)</div>
<div class="gmail_extra"><br>
</div>
<div class="gmail_extra">Thank
you,</div>
<div class="gmail_extra">
Eric
<div>
<div><br>
<br>
<div class="gmail_quote">On
Wed, Mar 13,
2013 at 10:13
AM, Alexis
Groppi <span dir="ltr"><<a href="mailto:alexis.groppi@u-bordeaux2.fr" target="_blank">alexis.groppi@u-bordeaux2.fr</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> Hi,<br>
<br>
<div>Le
13/03/2013
14:12, Eric
McDonald a
écrit :<br>
</div>
<blockquote type="cite">
<div dir="ltr">Hi
Alexis,
<div><br>
</div>
<div>
<div>First,
let me say
thank you for
being patient
and working
with us in
spite of all
the problems
you are
encountering.</div>
</div>
</div>
</blockquote>
<br>
That's
bioinformatician
life ;)
<div><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div>With
regards to the
floating point
exception, I
see several
opportunities
for a
division-by-zero
condition in
the threading
utilities used
by the script.
These
opportunities
exist if an
input file is
empty. (The
problem may be
coming from
another place,
but this would
be my first
guess.) What
does the
following
command say:</div>
<div><br>
</div>
                                                          <div>  ls -lh <span style="font-family:monospace;font-size:10px">/scratch/ag/khmer/174r1_table.kh</span> <span style="font-family:monospace;font-size:10px">/mnt/var/home/ag/174r1_prinseq_good_bFr8.fasta.keep</span></div>
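                                                          <div>The same emptiness check could be done in a job script before the filtering step starts. This is a minimal sketch assuming Python 3; the function name empty_inputs is hypothetical, and a missing file raises OSError rather than being reported as empty.</div>

```python
import os
import sys


def empty_inputs(paths):
    """Return the paths whose files contain zero bytes."""
    return [path for path in paths if os.path.getsize(path) == 0]


if __name__ == "__main__":
    # Pass the table file and the sequence file as arguments.
    empties = empty_inputs(sys.argv[1:])
    for path in empties:
        print("empty input file:", path)
    sys.exit(1 if empties else 0)
```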
</div>
</blockquote>
<br>
</div>
The result :
(the files are
not empty)<br>
<tt>-rw-r--r--
1 ag users
299M 12 mars
20:54
/mnt/var/home/ag/174r1_prinseq_good_bFr8.fasta.keep</tt><tt><br>
</tt><tt>-rw-r--r--
1 ag users
141G 12 mars
21:05
                                                          /scratch/ag/khmer/174r1_table.kh</tt>
<div><br>
<br>
<blockquote type="cite">
<div dir="ltr"><br>
<div>Also,
since you
appear to be
using TORQUE
as your
resource
manager/batch
system, could
you please
attach the
complete
output and
error files
for the job?
(These files
should be of
the form
<job_name>.o2693
and
<job_name>.e2693,
where
<job_name>
is the name of
your job.
There may only
be one or the
other of these
files,
depending on
site defaults
and whether
you specified
"-j oe" or "-j
eo" in your
job
submission.)<br>
</div>
</div>
</blockquote>
<br>
</div>
                                                          I reran the
                                                          job since I
                                                          had deleted the
                                                          previous
                                                          (2693) err/out
                                                          files.<br>
Here is the
new file
(merged with
the option -j
oe in the bash
script) :<br>
<br>
<tt>#############################</tt><tt><br>
</tt><tt>User:
ag</tt><tt><br>
</tt><tt>Date:
Wed Mar 13
14:59:21 CET
2013</tt><tt><br>
</tt><tt>Host:
<a href="http://rainman.cbib.u-bordeaux2.fr" target="_blank">rainman.cbib.u-bordeaux2.fr</a></tt><tt><br>
</tt><tt>Directory:
/mnt/var/home/ag</tt><tt><br>
</tt><tt>PBS_JOBID:
2695.rainman</tt><tt><br>
</tt><tt>PBS_O_WORKDIR:
/mnt/var/home/ag</tt><tt><br>
</tt><tt>PBS_NODEFILE:
rainman</tt><tt><br>
</tt><tt>#############################</tt><tt><br>
</tt><tt>#############################</tt><tt><br>
</tt><tt>Debut
filter-below-abund:
Wed Mar 13
14:59:21 CET
2013</tt>
<div><tt><br>
</tt><tt>starting
threads</tt><tt><br>
</tt><tt>starting
writer</tt><tt><br>
</tt><tt>loading...</tt><tt><br>
</tt><tt>...
filtering 0</tt><tt><br>
</tt></div>
                                                          <tt>/var/lib/torque/mom_priv/jobs/2695.rainman.SC:
line 49: 54757
Floating point
exception(core
dumped)
./khmer-BETA/sandbox/fi</tt><tt><br>
</tt><tt>lter-below-abund.py
                                                          /scratch/ag/khmer/174r1_table.kh
/mnt/var/home/ag/174r1_prinseq_good_bFr8.fasta.keep</tt><tt><br>
</tt><tt><br>
</tt><tt>real
3m54.873s</tt><tt><br>
</tt><tt>user
0m0.085s</tt><tt><br>
</tt><tt>sys
2m2.180s</tt><tt><br>
</tt><tt>Date
fin: Wed Mar
13 15:03:15
CET 2013</tt><tt><br>
</tt><tt>Job
finished</tt><br>
<br>
Thanks again
for your help
:)<br>
<br>
Alexis
<div>
<div><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div> </div>
<div><br>
</div>
<div>Thanks,</div>
<div> Eric</div>
<div><span style="font-family:monospace;font-size:10px"><br>
</span></div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On
Wed, Mar 13,
2013 at 5:38
AM, Alexis
Groppi <span dir="ltr"><<a href="mailto:alexis.groppi@u-bordeaux2.fr" target="_blank">alexis.groppi@u-bordeaux2.fr</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> Hi Eric,<br>
<br>
Thanks for
your answer.<br>
But
unfortunately,
after many
attempts I'm
getting this
error :<tt><br>
<br>
</tt><tt>starting
threads</tt><tt><br>
</tt><tt>starting
writer</tt><tt><br>
</tt><tt>loading...</tt><tt><br>
</tt><tt>...
filtering 0</tt><tt><br>
                                                          </tt><tt>/var/lib/torque/mom_priv/jobs/2693.rainman.SC:
line 46: 63657
Floating point
exception(core
dumped)
./khmer-BETA/sandbox/filter-below-abund.py
                                                          /scratch/ag/khmer/174r1_table.kh
/mnt/var/home/ag/174r1_prinseq_good_bFr8.fasta.keep</tt><tt><br>
</tt><tt><br>
</tt><tt>real
3m30.163s</tt><tt><br>
</tt><tt>user
0m0.088s</tt><br>
<br>
Your opinion ?<br>
<br>
Thanks<br>
<br>
Alexis<br>
<br>
<br>
<div>Le
13/03/2013
00:55, Eric
McDonald a
écrit :<br>
</div>
<div>
<div>
<blockquote type="cite">
<div dir="ltr">Hi
Alexis,
<div><br>
</div>
<div>One way
to get the
'bleeding-edge'
branch is to
clone it into
a fresh
directory; for
example:</div>
<div> git
clone <a href="http://github.com/ged-lab/khmer.git" target="_blank">http://github.com/ged-lab/khmer.git</a>
-b
bleeding-edge
khmer-BETA</div>
<div><br>
</div>
<div>Assuming
you already
have a clone
of the
'ged-lab/khmer'
repo, then you
should also be
able to do:</div>
<div> git
fetch origin</div>
<div> git
checkout
bleeding-edge</div>
<div>Depending
on how old
your Git
client is and
what its
defaults are,
you may have
to do the
following
instead:</div>
<div> git
checkout
--track -b
bleeding-edge
origin/bleeding-edge</div>
<div><br>
</div>
<div>Hope this
helps,</div>
<div> Eric</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On
Tue, Mar 12,
2013 at 11:32
AM, Alexis
Groppi <span dir="ltr"><<a href="mailto:alexis.groppi@u-bordeaux2.fr" target="_blank">alexis.groppi@u-bordeaux2.fr</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"> <br>
<div>Le
12/03/2013
16:16, C.
Titus Brown a
écrit :<br>
</div>
<div>
<blockquote type="cite">
<pre>On Tue, Mar 12, 2013 at 04:15:05PM +0100, Alexis Groppi wrote:
</pre>
<blockquote type="cite">
                                                          <pre>Hi Titus,
Thanks for your answer.
Actually, this is my second attempt with filter-below-abund.
The first time, I thought the problem came from the location of my
table.kh file: it was on a storage element with poor I/O performance.
I killed the job after 24h, moved the file to a better location, and reran it,
but with the same result: no completion after 24h.
Any idea?
Thanks
Cheers from Bordeaux :)
Alexis
PS: The command line was the following:
./filter-below-abund.py 174r1_table.kh 174r1_prinseq_good_bFr8.fasta.keep
Is this correct?
</pre>
</blockquote>
<pre>Yes, looks right... Can you try with the bleeding-edge branch, which now
incorporates a potential fix for this issue?</pre>
</blockquote>
</div>
From here : <a href="https://github.com/ged-lab/khmer/tree/bleeding-edge" target="_blank">https://github.com/ged-lab/khmer/tree/bleeding-edge</a>
?<br>
or <br>
here : <a href="https://github.com/ctb/khmer/tree/bleeding-edge" target="_blank">https://github.com/ctb/khmer/tree/bleeding-edge</a>
?<br>
<br>
                                                          Do I have to
                                                          do a fresh
                                                          install? If so,
                                                          how?<br>
Or just
replace all
the files and
folders ?<br>
<br>
Thanks :)<br>
<br>
Alexis
<div>
<div><br>
<br>
<blockquote type="cite">
<pre>thanks,
--titus
</pre>
<blockquote type="cite">
                                                          <pre>Le 12/03/2013 14:41, C. Titus Brown a écrit :
</pre>
<blockquote type="cite">
<pre>On Tue, Mar 12, 2013 at 10:48:03AM +0100, Alexis Groppi wrote:
</pre>
<blockquote type="cite">
                                                          <pre>Metagenome assembly:
My data:
- original (quality-filtered) data: 4463243 reads (75 nt) (Illumina)
1/ Single-pass digital normalization with normalize-by-median (C=20)
==> .keep file of 2560557 reads
2/ Generated a hash table with load-into-counting on the .keep file
==> .kh file of ~16 GB (huge file?!)
3/ filter-below-abund with C=100 on the two previous files (table.kh
and reads.keep)
Still running after 24 hours :(
Any advice on speeding up this step, and the others (partitioning, ...)?
I have access to an HPC cluster with ~3000 cores.
</pre>
</blockquote>
<pre>Hi Alexis,
filter-below-abund and filter-abund have occasional bugs that prevent them
from completing. I would kill and restart. For that few reads it should
take no more than a few hours to do everything.
Most of what khmer does cannot easily be distributed across multiple chassis,
note.
best,
--titus
</pre>
</blockquote>
<pre>--
</pre>
</blockquote>
</blockquote>
<br>
</div>
</div>
<span><font color="#888888">
</font></span></div>
<br>
_______________________________________________<br>
khmer mailing
list<br>
<a href="mailto:khmer@lists.idyll.org" target="_blank">khmer@lists.idyll.org</a><br>
<a href="http://lists.idyll.org/listinfo/khmer" target="_blank">http://lists.idyll.org/listinfo/khmer</a><br>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr">
<div>Eric
McDonald</div>
<div>HPC/Cloud
Software
Engineer</div>
<div> for the
Institute for
Cyber-Enabled
Research
(iCER)</div>
<div> and the
Laboratory for
Genomics,
Evolution, and
Development
(GED)</div>
<div>Michigan
State
University</div>
<div>P: <a href="tel:517-355-8733" value="+15173558733" target="_blank">517-355-8733</a></div>
</div>
</div>
</blockquote>
<br>
</div>
</div>
<span><font color="#888888">
</font></span></div>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
</div>
</blockquote>
<br>
</div>
</div>
<span><font color="#888888">
</font></span></div>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<br>
<br clear="all">
<div><br>
</div>
</div>
</div>
<br>
</blockquote>
<br>
</div>
</div>
<span><font color="#888888">
</font></span></div>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
</div>
</div>
</blockquote>
<br>
</div>
</div>
<span><font color="#888888">
</font></span></div>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
</div>
</div>
</blockquote>
<br>
</div>
</div>
<span><font color="#888888">
</font></span></div>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
</div>
</blockquote>
<br>
    </div></div><span class="HOEnZb"><font color="#888888">
</font></span></div>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div dir="ltr"><div>Eric McDonald</div><div>HPC/Cloud Software Engineer</div><div> for the Institute for Cyber-Enabled Research (iCER)</div><div> and the Laboratory for Genomics, Evolution, and Development (GED)</div>
<div>Michigan State University</div><div>P: 517-355-8733</div></div>
</div>