[khmer] Guiding principles for khmer?

Matthew MacManes macmanes at gmail.com
Sat Nov 15 13:46:50 PST 2014


Titus, 

So much of your ethos is centered on “Open Science”. It would be nice to see this featured in the “khmer is” mission statement, e.g. “khmer is: an open source platform for ...”. In my mind, this is more than a goal; it is a core value that underpins the entire project.

My 2 cents, 
Matt
______________________________________________
Matthew MacManes, Ph.D.
University of New Hampshire  I  Assistant Professor of Genome Enabled Biology
Department of Molecular, Cellular, & Biomedical Sciences
Durham, NH  03824
Phone: 603-862-4052  I  Twitter: @PeroMHC | Web: genomebio.org
Office: 189 Rudman Hall | Laboratory: 145 Rudman Hall

On November 15, 2014 at 9:55:16 AM, C. Titus Brown (ctb at msu.edu) wrote:

Hi all,  

as we think about the next few years of khmer development, I think it is helpful to explore what khmer is, roughly speaking, and what our goals should be.  

Here’s a rough cut; I’d like to turn this into a blog post, but only after some feedback from the list (if any).  

----  

khmer is:  

* a stable research platform for novel CS/bio research on data structures and algorithms, mostly k-mer based;  
* a test bed for software engineering practice in science;  
* a Python library for working with k-mers and graph structures;  
* an exercise in community building in scientific software engineering;  
* an exercise in ecosystem participation in scientific software engineering.  

----  

khmer long term goals, in some rough order of priority:  

* Keep khmer versatile and agile enough to easily enable the CS and bio we want to do. Practical implications: limit complexity of internals as much as possible.  

* Continue community building. Practical implications: run khmer as a real open source project, with everything done in the open; work nicely with other projects.  

* Build, sustain, and maintain a set of protocols and recipes around khmer. Practical implications: take workflow design into account.  

* Improve the efficiency (time/memory) of khmer implementations. Practical implications: optimize, but not at the expense of clean code. Some specifics: streaming; variable-sized counters.  

* Lower the barriers to growing the user base. Practical implications: find actual pain points, and address them if it’s easy or makes good sense. Some specifics: support hash functions for k > 32, add a stranded hash function, integrate efficient k-mer cardinality counting, implement dynamically sized data structures.  

* Keep khmer technologically up to date. Practical implications: transition to Python 3.  

----  

Thoughts? What am I missing? What should be added or changed?  

cheers,  
—titus  


_______________________________________________  
khmer mailing list  
khmer at lists.idyll.org  
http://lists.idyll.org/listinfo/khmer  