[bip] Reproducible research
Ivan Rossi
ivan at biodec.com
Thu Mar 5 01:25:18 PST 2009
On Wed, 4 Mar 2009, C. Titus Brown wrote:
> On Wed, Mar 04, 2009 at 06:56:12PM +0100, Giovanni Marco Dall'Olio wrote:
> <rant>
> That's because nobody cares if your code is right, just that you can
> publish a good paper on the results from running it. Once.
> </rant>
>
> Unfortunately doing both good code + good publication is an awful lot of
> hard work, so if it's not rewarded *shrug*.
Amen, Brother Titus.
That's why some famous and not-so-famous journals are so full of sh^H^H
results-on-well-tempered-data-sets that they are close to useless much of
the time.
We had to re-implement things and test them over and over in order to get
honest performance figures. And in the end, most of the time it is better to
get several raw data sets and work on them honestly yourself (properly
remove redundancy, use training, testing, and validation sets...) without
even worrying about inventing new methods.
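For the training/testing/validation part, a minimal Python sketch of the
idea follows. It assumes the redundancy step has already grouped the
sequences into clusters of similar entries; the function name
split_by_cluster, the cluster_of mapping, the 60/20/20 fractions and the toy
data are all made up for illustration and not taken from any real pipeline.
The only point is that whole clusters go into one set, so near-identical
sequences cannot sit on both sides of a split.

```python
import random
from collections import defaultdict


def split_by_cluster(cluster_of, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split sequence ids into training, testing and validation sets.

    cluster_of maps a sequence id to a cluster label (assumed to come from
    an earlier redundancy-reduction step). Whole clusters are assigned to a
    single set, so similar sequences never end up in two different sets.
    """
    clusters = defaultdict(list)
    for seq_id, label in cluster_of.items():
        clusters[label].append(seq_id)

    # Sort before shuffling so the result depends only on the seed.
    labels = sorted(clusters)
    random.Random(seed).shuffle(labels)

    n_train = int(len(labels) * fractions[0])
    n_test = int(len(labels) * fractions[1])
    parts = (
        labels[:n_train],
        labels[n_train:n_train + n_test],
        labels[n_train + n_test:],  # whatever is left goes to validation
    )
    return tuple(
        sorted(seq_id for label in part for seq_id in clusters[label])
        for part in parts
    )


# Toy input: ten sequences in five clusters (hypothetical labels).
toy = {"seq%d" % i: "c%d" % (i // 2) for i in range(10)}
train, test, valid = split_by_cluster(toy)
```

Fixing the seed keeps the split itself reproducible, so a reported number
can be re-run on exactly the same partition.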
E.g., for many methods published in sequence analysis, it should be
mandatory for the referees to ask the question: how much better do you do
than BLAST itself on this task? It is always surprising to see how many
things are NOT better.
Although I should not complain too much, this is exactly the reason why
companies hire us: to shovel out bullshit. ;)
--
Ivan Rossi, PhD - ivan AT biodec dot com - ivan dot rossi3 AT unibo dot it
BioDec Srl, Via Calzavecchio 20/2, 40033 Casalecchio di Reno (BO), Italy
Phone: (+39)-051-0548263 - Fax: (+39)-051-7459582 - http://www.biodec.com