[TIP] Announce: Pygora 0.0.1

Ronny Pfannschmidt Ronny.Pfannschmidt at gmx.de
Sun Oct 17 08:36:30 PDT 2010


On Sat, 2010-10-16 at 13:33 -0400, Alfredo Deza wrote:
> 
> 
> On Sat, Oct 16, 2010 at 1:12 PM, Ronny Pfannschmidt
> <Ronny.Pfannschmidt at gmx.de> wrote:
>         
>         On Sat, 2010-10-16 at 12:11 -0400, Alfredo Deza wrote:
>         >
>         >
>         > On Fri, Oct 15, 2010 at 8:23 PM, Ronny Pfannschmidt
>         > <Ronny.Pfannschmidt at gmx.de> wrote:
>         >
>         >         On Sun, 2010-02-28 at 16:38 -0500, Alfredo Deza
>         wrote:
>         >         > After Ned Batchelder's talk in Python I spent a
>         few minutes
>         >         talking
>         >         > about how nice it is to have as much (or even
>         more!) test
>         >         code lines
>         >         > than source code lines.
>         >         >
>         >         > I have put together a small script that will count
>         the lines
>         >         of code
>         >         > in a given directory and will compare them to your
>         test code
>         >         lines
>         >         > with a nice output (e.g. percentage, total lines,
>         lines per
>         >         file
>         >         > etc...).
>         >         >
>         >         > Ideally, this could become a plugin for Nose (or
>         py.test),
>         >         but for
>         >         > now, it will create an executable symlink so you
>         can run it
>         >         anywhere.
>         >         >
>         >         > The project page: http://code.google.com/p/pygora/
>         >         >
>         >         > I am using distribute to package Pygora, you can
>         install it
>         >         by:
>         >         >
>         >         >         sudo easy_install pygora
>         >         >
>         >         > And, if you were not wondering already, the name
>         comes from
>         >         a *goat*:
>         >         > Pygora Goat is a cross between the Pygmy Goat and
>         the Angora
>         >         Goat that
>         >         > produces three distinct kinds of fleece and has
>         the smaller
>         >         size of
>         >         > the Pygmy.
>         >         > Source: Wikipedia
>         >         >
>         >         >
>         >         > Pygora could do better in telling what lines to
>         count with a
>         >         nice
>         >         > regex, but this first release will get you an
>         idea,
>         >         enhancements will
>         >         > follow.
>         >
>         >
>         >         you might want to take a look at py.countloc which
>         is shipped
>         >         by py (and
>         >         will be in pycmd in future)
>         >
>         >         py.countloc provides way better numbers (imho)
>         >
>         >
>         >
>         >
>         > Since I had py.test installed I tried out py.countloc, and I
>         have a
>         > few comments...
>         >
>         >
>         > The idea behind pygora is to have a comparison ratio, to
>         know where
>         > your test line numbers
>         > are related to those of your source code, this is not
>         provided by
>         > py.countloc
>         >
>         >
>         > In the latest pygora version (you have actually replied to
>         the first
>         > announcement dated in February)
>         
>         my mistake, i was in the middle of working through the backlog
>         of the
>         ml, and mindlessly replied to the first mail instead of
>         finding the
>         latest one
> 
> 
> no big deal :)
>  
>         > speed was improved a lot, and in fact after benchmarking
>         both tools
>         > pygora takes almost
>         > half the time to go through a path and get the metrics.
>         >
>         >
>         > For example, in a directory with around 1.5 million lines of
>         code
>         > these are the times I got (best out of 3 runs):
>         >
>         >
>         > py.countloc  4.04s user 0.50s system 99% cpu 4.572 total
>         >
>         >
>         > pygora  2.52s user 0.66s system 97% cpu 3.253 total
>         
>         interesting timings, i think py.countloc was only optimized
>         for accurate
>         numbers, not raw speed, i'll take a look at that once my todo
>         is sorted
>         out
> 
> 
> Well, I think the speed improvements are a by-product of refactoring
> the previous version.
> 
> 
>  
>         >
>         >
>         > There is also no way to tell py.countloc to avoid printing
>         *every*
>         > single file it is counting. This is especially
>         > annoying for 1.5m lines of code, also, no way of filtering
>         or
>         > excluding certain files.
>         >
>         >
>         > All of those options are available with pygora (e.g. verbose
>         option,
>         > filtering etc...) although implementing
>         > them in py.countloc shouldn't be a hard thing to do.
>         >
>         >
>         > Having said that, I find that the source line detection in
>         py.countloc
>         > seems a bit more accurate. I initially had
>         > trouble in deciding for example if docstrings should count
>         as part of
>         > the source code or not (I ended up including them).
>         
>         that's why i wanted to point you at it after trying an
>         easy_installed
>         pygora
>         >
>         >
>         > In my specific case, I like something fast that reports what
>         I need.
>         > If I want a ratio to be able to compare, this output
>         > seems too much for me:
>         >
>         >
>         >            number of testfiles 794
>         >  number of non-empty testlines 158805
>         >                number of files 6906
>         >      number of non-empty lines 1342471
>         >
>         >
>         > Compared to:
>         >
>         >
>         > Test lines    =     214669
>         > Source lines =    1241449
>         > Total lines =    1456118
>         >
>         >
>         > Your test code is 17.29% of your source code
>         >
>         
>         i guess i'm more interested in the other stats,
>         usually the ratio of test lines vs tested code lines
>         isn't that interesting (i tend to have more test lines than
>         code lines)
>         whats more interesting to me is the ratio of code/branch
>         coverage
> 
> 
> you mention "ratio" but I just don't see any ratio with py.countloc,
> unless
> I am missing something...

I declared it uninteresting,
to be honest *I* consider it a false/misleading metric
(just about as wrong as lines of code for programmer productivity)

although it's occasionally interesting to know just for deciding where to
go in terms of code/test size vs features
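
For reference, the test-to-source ratio under discussion is plain arithmetic
over two line counts. What follows is only a minimal sketch and not how pygora
or py.countloc is actually implemented: the function names are hypothetical,
only non-empty lines are counted, and a file is assumed to be a test file when
it is named test_*.py or *_test.py.

```python
import os


def count_nonempty(text):
    """Count lines that contain something other than whitespace."""
    return sum(1 for line in text.splitlines() if line.strip())


def is_test_file(filename):
    # Assumption: test modules are named test_*.py or *_test.py.
    return filename.startswith("test_") or filename.endswith("_test.py")


def count_tree(root):
    """Return (test_lines, source_lines) for all .py files under root."""
    test_lines = 0
    source_lines = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                lines = count_nonempty(handle.read())
            if is_test_file(name):
                test_lines += lines
            else:
                source_lines += lines
    return test_lines, source_lines


def ratio_percent(test_lines, source_lines):
    """Test lines as a percentage of source lines (0.0 if there is no source)."""
    if source_lines == 0:
        return 0.0
    return 100.0 * test_lines / source_lines


# The figures quoted in this thread: 214669 test lines, 1241449 source lines.
print("%.2f%%" % ratio_percent(214669, 1241449))  # prints 17.29%
```

Calling count_tree(".") and passing the two results to ratio_percent would give
the same kind of figure for a local checkout, under the naming assumption above.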


> 
> 
> I'm also confused here about "code/branch coverage". What do you mean?
> this is *not* 
> a coverage tool, it just counts lines and compares them. 

sorry, I didn't mean to drag in the tooling types,
just express what *I* consider the 
more important metrics around code and tests
(obviously they are solved by a different kind of tool)

currently *I* consider the code line/branch/state coverage
of a test-suite the most important metric,
(assuming correctness)
code size is secondary, although as usual the smaller the better
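
To illustrate why line coverage and branch coverage are different metrics,
here is a small sketch. The function is a made-up example, not code from
either tool in this thread.

```python
def clamp_discount(price, discount):
    """Never let the discount exceed the price."""
    if discount > price:
        discount = price
    return price - discount


# This one call executes every line of clamp_discount, so line coverage
# would already be complete after it...
assert clamp_discount(10, 15) == 0

# ...but the path where the `if` condition is false has not run yet.
# Branch coverage is only complete once a call like this one is added.
assert clamp_discount(10, 3) == 7
```

A test suite containing only the first call would report full line coverage
while leaving one of the two branches unexercised.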

regards Ronny

