[bip] replacing those shell scripts ... the PipeChain

Giovanni Marco Dall'Olio dalloliogm at gmail.com
Tue May 12 01:39:02 PDT 2009


On Thu, May 7, 2009 at 3:46 PM, James Casbon <casbon at gmail.com> wrote:
> So I often want to use the unix toolchain to get a quick idea of some
> data.  You know, grep this, cut that, sort, etc.  I often find myself
> thinking that I won't use python because writing all those Popen(blah,
> stdout=PIPE) is just too verbose.  However, this means that the
> migration from shell script to python script has an annoying little
> bump.

Do you mind a bit of self-promotion?
Have a look at makefiles. Here is a slideshow I wrote some time ago:
- http://bioinfoblog.it/2009/03/seminar-on-makefiles-in-bioinformatics/

If you need to coordinate various shell scripts, it is better to use make.
Example of a makefile (recipe lines must be indented with a tab):
"""
grep_results:
     grep ^x ......
     python ......

results.txt:
     python calculate_result.py ....

all: results.txt grep_results
"""
Typing "make all" in a shell will run the recipes for results.txt
and then grep_results.

Other advantages of make: conditional execution (it avoids regenerating
results that have already been calculated), a standard syntax, and it
is installed on most systems.
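
For those staying in python, the conditional execution idea can be
sketched in a few lines. This is only an illustration of a timestamp
check in the spirit of make, not how make itself is implemented; the
function name needs_rebuild is made up for this example.

```python
import os


def needs_rebuild(target, dependencies):
    """Return True if target is missing or older than any dependency.

    Hypothetical helper: it mimics the basic timestamp comparison that
    make-style tools rely on to skip work that is already done.
    """
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    # Rebuild if at least one dependency was modified after the target.
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)
```

A script could call needs_rebuild('results.txt', ['input1.txt']) (file
names assumed here) before launching an expensive calculation.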


If you want to stay with python, there are tools already written in
this language.
For example: scons, paver, waf.
I can't help you with these because I don't use them.

Or, if you have time, it would be good to re-write biomake:
- http://skam.sourceforge.net/skam-intro.html
I can help you with that if you want.




>
> So, without further ado, I present a little helper called PipeChain.
> This allows this kind of code:
>
> chain = PipeChain('grep ^x', 'cut -d" " -f2-3', 'sort', 'uniq -c')
>
> # use on a file (the output file must be opened for writing)
> proc = chain(open('input1.txt'), open('output1.txt', 'w'))
> proc.wait()
>
> # or get a handle on the results
> proc = chain(open('input1.txt'))
> proc.stdout.read()
>
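
For readers without the attachment, here is a rough sketch of how a
helper with this interface could be built on subprocess.Popen. It is
NOT the attached file: the class body below is an assumption
reconstructed only from the usage shown above.

```python
import shlex
import subprocess


class PipeChain(object):
    """Sketch only: connect commands so each stdout feeds the next stdin."""

    def __init__(self, *commands):
        # Split each command string into an argument list (no shell needed).
        self.commands = [shlex.split(c) for c in commands]

    def __call__(self, stdin=None, stdout=subprocess.PIPE):
        prev = stdin
        proc = None
        last_index = len(self.commands) - 1
        for i, argv in enumerate(self.commands):
            # Only the final command writes to the caller's stdout.
            out = stdout if i == last_index else subprocess.PIPE
            proc = subprocess.Popen(argv, stdin=prev, stdout=out)
            if i > 0:
                # Close our copy of the upstream pipe so the upstream
                # process can receive SIGPIPE if this one exits early.
                prev.close()
            prev = proc.stdout
        # The caller gets the last process, as in the usage above.
        return proc
```

Only the last process is returned, so the intermediate ones are not
waited on explicitly; a more careful version would keep them in a list
and wait on each.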
>
> I have attached the file. I need one bit of help though - it would be
> nice to be able to pass in an iterable to use as the input_handle.
> subprocess either wants a proper handle, or PIPE.  How can you stream
> an iterable into a subprocess.PIPE without blocking?
>
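
One common way to do this is to write to the pipe from a background
thread, so the main thread stays free to read the output. The sketch
below assumes the process was started with stdin=subprocess.PIPE; the
name feed_iterable is invented for this example, and on Python 3 the
items must be bytes.

```python
import subprocess
import threading


def feed_iterable(proc, lines):
    """Write each item of `lines` to proc.stdin from a background thread.

    Hypothetical helper. Returns the thread so the caller can join it.
    """
    def writer():
        try:
            for line in lines:
                proc.stdin.write(line)
        except (IOError, OSError):
            # The child exited early and closed its end of the pipe.
            pass
        finally:
            try:
                # Closing stdin signals end-of-input to the child.
                proc.stdin.close()
            except (IOError, OSError):
                pass

    thread = threading.Thread(target=writer)
    thread.daemon = True
    thread.start()
    return thread
```

With the writer running in its own thread, the main thread can call
proc.stdout.read() without the two ends deadlocking on a full pipe
buffer.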
> cheers,
> James



-- 
Giovanni Dall'Olio, phd student
Department of Biologia Evolutiva at CEXS-UPF (Barcelona, Spain)

My blog on bioinformatics: http://bioinfoblog.it


