[TIP] Announce: Pygora 0.0.1

Sat Oct 16 09:11:50 PDT 2010

On Fri, Oct 15, 2010 at 8:23 PM, Ronny Pfannschmidt <
Ronny.Pfannschmidt at gmx.de> wrote:

> On Sun, 2010-02-28 at 16:38 -0500, Alfredo Deza wrote:
> > After Ned Batchelder's talk in Python I spent a few minutes talking
> > about how nice it is to have as much (or even more!) test code lines
> > than source code lines.
> >
> > I have put together a small script that will count the lines of code
> > in a given directory and will compare them to your test code lines
> > with a nice output (e.g. percentage, total lines, lines per file
> > etc...).
> >
> > Ideally, this could become a plugin for Nose (or py.test), but for
> > now, it will create an executable symlink so you can run it anywhere.
> >
> > The project page: http://code.google.com/p/pygora/
> >
> > I am using distribute to package Pygora, you can install it by:
> >
> >         sudo easy_install pygora
> >
> > And, if you were not wondering already, the name comes from a *goat*:
> > Pygora Goat is a cross between the Pygmy Goat and the Angora Goat that
> > produces three distinct kinds of fleece and has the smaller size of
> > the Pygmy.
> > Source: Wikipedia
> >
> >
> > Pygora could do a better job of deciding which lines to count, with a
> > nice regex, but this first release will give you an idea; enhancements
> > will follow.
>
> you might want to take a look at py.countloc which is shipped by py (and
> will be in pycmd in future)

> py.countloc provides way better numbers (imho)
>
>
Since I had py.test installed I tried out py.countloc, and I have a few
comments...

The idea behind pygora is to give you a comparison ratio, so you know how
your test line count relates to your source line count. py.countloc does
not provide this.
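
To make the idea concrete, here is a minimal sketch of such a comparison.
It is not pygora's actual code: the function names are made up for
illustration, and it assumes test files are named test_*.py or *_test.py
and that only non-blank lines count.

```python
import os


def count_nonblank_lines(path):
    """Count lines in a file that contain something other than whitespace."""
    with open(path, "rb") as handle:
        return sum(1 for line in handle if line.strip())


def is_test_file(path):
    """Assumed convention: test files are named test_*.py or *_test.py."""
    name = os.path.basename(path)
    return name.startswith("test_") or name.endswith("_test.py")


def compare(root):
    """Walk root and return (test_lines, source_lines, percent)."""
    test_lines = source_lines = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            count = count_nonblank_lines(path)
            if is_test_file(path):
                test_lines += count
            else:
                source_lines += count
    # Test lines expressed as a percentage of source lines.
    percent = 100.0 * test_lines / source_lines if source_lines else 0.0
    return test_lines, source_lines, percent
```

Calling compare(".") in a project directory returns the two totals and the
percentage, which is all a one-line summary needs.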

In the latest pygora version (you have actually replied to the first
announcement, dated February) speed was improved a lot. After
benchmarking both tools, pygora takes noticeably less time to go through
a path and get the metrics: about 30% less wall-clock time in the run
below.

For example, in a directory with around 1.5 million lines of code, these
are the times I got (best of 3 runs):

py.countloc  4.04s user 0.50s system 99% cpu 4.572 total

pygora  2.52s user 0.66s system 97% cpu 3.253 total

There is also no way to tell py.countloc to avoid printing *every* single
file it counts, which is especially annoying with 1.5 million lines of
code. There is also no way of filtering or excluding certain files.

All of those options are available in pygora (e.g. a verbose option,
filtering, etc.), although implementing them in py.countloc shouldn't be
hard.
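
As a rough illustration of how little such options need, here is a sketch
of a file walker with exclusion patterns and optional per-file output. The
function name and its parameters are hypothetical; this is neither
pygora's nor py.countloc's interface.

```python
import fnmatch
import os


def iter_files(root, exclude=(), verbose=False):
    """Yield file paths under root, skipping names matching an exclude pattern."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames
                       if not any(fnmatch.fnmatch(d, pat) for pat in exclude)]
        for name in filenames:
            if any(fnmatch.fnmatch(name, pat) for pat in exclude):
                continue
            path = os.path.join(dirpath, name)
            if verbose:
                print(path)  # per-file output only when asked for
            yield path
```

For example, iter_files(".", exclude=("*.pyc", "build")) skips compiled
files and anything under a directory named build, and prints nothing
unless verbose is set.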

Having said that, I find that the source line detection in py.countloc
seems a bit more accurate. I initially had trouble deciding, for example,
whether docstrings should count as part of the source code or not (I
ended up including them).
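
The docstring question can be turned into a switch. The sketch below is
one possible approach, not what either tool does: it uses the standard
ast module (Python 3.8 or later, for end_lineno) to find docstring lines,
and treats any line starting with "#" as a comment, which is a
simplification. The function names are made up for illustration.

```python
import ast


def docstring_lines(source):
    """Return the set of line numbers occupied by docstrings."""
    lines = set()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.Module, ast.ClassDef,
                                 ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        body = node.body
        # A docstring is a bare string expression in first position.
        if (body and isinstance(body[0], ast.Expr)
                and isinstance(body[0].value, ast.Constant)
                and isinstance(body[0].value.value, str)):
            first = body[0]
            lines.update(range(first.lineno, first.end_lineno + 1))
    return lines


def count_source_lines(source, include_docstrings=True):
    """Count non-blank, non-comment lines, optionally leaving out docstrings."""
    skip = set() if include_docstrings else docstring_lines(source)
    count = 0
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or number in skip:
            continue
        count += 1
    return count
```

With include_docstrings=True the count matches the choice described above;
passing False shows how much of the total is documentation.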

In my specific case, I like something fast that reports just what I need.
If all I want is a ratio to compare, this output from py.countloc is too
much for me:

           number of testfiles 794
 number of non-empty testlines 158805
               number of files 6906
     number of non-empty lines 1342471

Compared to pygora's output:

Test lines    =     214669
Source lines =    1241449
Total lines =    1456118

Your test code is 17.29% of your source code

> >
> >
> > Thanks for an awesome and inspiring talk Ned!
> >
> >
> >
> >
> > --
> > Alfredo Deza
> >
> > _______________________________________________
> > testing-in-python mailing list
> > testing-in-python at lists.idyll.org
> > http://lists.idyll.org/listinfo/testing-in-python
>
>