[TIP] Test coverage and parsing.

Olemis Lang olemis at gmail.com
Tue Oct 6 05:42:24 PDT 2009


Hello!

2009/10/5 Michał Kwiatkowski <constant.beta at gmail.com>:
> On Mon, Oct 5, 2009 at 4:39 PM, Olemis Lang <olemis at gmail.com> wrote:
>> Well ... something like that. What I'd like to do (at least in my dreams) is :
>>
>>  - Provide a concrete result set
>>  - Perhaps classification (clustering) of data may be useful as
>>    an input to the subsequent steps ;o)
>>  - The «tool I'm looking for» will generate the following :
>>    * relevant SQL queries
>>    * stream of tokens (in order to test Pygments parser)
>>    * parsing results (not AST, for example if SQL query is
>>      `select col1, col2` then I just need to know that
>>      `['col1', 'col2']` are the columns that have been specified in
>>      SELECT statement )
>>    * the result of applying that query to the base result
>>  - Coverage and level-of-detail would be nice too ;o).
>
> Haskell world has a good approach to generating test data from a
> specification - the library is called QuickCheck. There is also a
> Python derivative, called PeckCheck
> (http://www.accesscom.com/~darius/software/clickcheck.html). Basically
> the trick is to have composable test data generators - once you have
> that you can easily express rules that apply to your code.
>

Well ... I'd just like to add a few comments after my (very short) research ;o)

> To use your SQL example, you could say, using peckcheck syntax:
>
> class TestSql(TestCase):
>     def testColumns(self, columns=a_list(a_string)):
>         query = "SELECT %s FROM dual" % ', '.join(columns)
>         assert parse(query).columns == columns
>
> That basically checks that for all valid column lists your parser
> recognizes them correctly. It still forces you to reimplement your
> parse rules as tests, but at least you have a generation framework
> done for you.
>

Considering what I've read about `QuickCheck` (I've discovered that I'm
not much of a Haskell expert :-/ ), it is able to generate test cases
from a specification of the system. So I suppose I'd need to implement
such a model for SQL (wouldn't I?) so that SQL statements are generated
instead of unstructured data.
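
Just to make the idea concrete, here is a minimal sketch of the kind of
model I have in mind, using plain `random` instead of PeckCheck's own
generators. Everything below is made up for illustration: `a_name`,
`a_column_list`, `a_select` and the `parse` callable, which stands for
whatever SQL parser is under test.

import random
import string

def a_name(max_length=8):
    """Generate a random identifier-like column name."""
    length = random.randint(1, max_length)
    return ''.join(random.choice(string.ascii_lowercase) for _ in range(length))

def a_column_list(max_columns=5):
    """Compose a_name into a non-empty list of column names."""
    return [a_name() for _ in range(random.randint(1, max_columns))]

def a_select(table='dual'):
    """Compose a_column_list into a complete SELECT statement."""
    columns = a_column_list()
    return columns, 'SELECT %s FROM %s' % (', '.join(columns), table)

def check_select_columns(parse, trials=100):
    """Property: the parser reports exactly the generated columns."""
    # `parse` is the (hypothetical) SQL parser under test.
    for _ in range(trials):
        columns, query = a_select()
        assert parse(query).columns == columns, query

The point is that `a_select` is built by composing the smaller
generators, so the same pieces could be reused for WHERE clauses, joins,
or for feeding the Pygments lexer and checking its token stream.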

Q:
  - Since I've found nothing about this, and you mentioned composable
    test data generators and other features: are they part of QuickCheck
    itself or an extension? If the latter, which one?
  - Are they also available in the Python version?

PS:

1- It seems that CI environments become almost compulsory once you adopt
this approach. I mean, how could you know whether the generated test data
is good enough to cover all the relevant scenarios? What better tool than
a CI system to measure such metrics ;o)? The first sketch after this list
illustrates what I mean.

2- This approach and these libs are both very useful. Shouldn't we
build at least a few of them and enhance support for this in CI tools?

3- I've read about using this style together with CSP and other
techniques in order to test concurrent or real-time systems. Wow! It
seems I can now remove this from my long TODO list ;o). For instance,
this could be very useful for writing performance tests for Trac
plugins: simulate concurrent access, then check trace assertions using
the log as input (just like some Java libs do today ;o). The second
sketch after this list outlines what I have in mind.
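
Regarding point 1, the first sketch: this is roughly how a CI job could
measure whether the randomly generated data is actually exercising the
parser. It assumes the coverage.py package and reuses the hypothetical
`check_select_columns` / `parse` names from the earlier sketch.

import coverage

def measure_parser_coverage(parse):
    """Run the randomized checks under coverage.py and return the total percentage."""
    cov = coverage.Coverage()          # assumes the coverage.py package is installed
    cov.start()
    check_select_columns(parse)        # the randomized property check from the sketch above
    cov.stop()
    cov.save()
    # report() prints a summary and returns the total coverage percentage;
    # a CI job could fail the build when it drops below a threshold
    # (the command-line equivalent is `coverage report --fail-under=90`).
    return cov.report()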
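
And regarding point 3, the second sketch: a bare-bones version of the
concurrent-access simulation plus trace assertion I have in mind. The
`handler` callable and the 0.5 s latency bound are just placeholders for
whatever Trac plugin code and performance requirement would actually
apply.

import threading
import time

def simulate_clients(handler, clients=5, requests=10):
    """Hit `handler` from several threads, recording a (client, request, start, end) trace."""
    log = []
    log_lock = threading.Lock()

    def client(client_id):
        for i in range(requests):
            start = time.time()
            handler(client_id, i)      # `handler` stands for the plugin code under test
            end = time.time()
            with log_lock:
                log.append((client_id, i, start, end))

    threads = [threading.Thread(target=client, args=(c,)) for c in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log

def check_trace(log, max_latency=0.5):
    """Trace assertion: every recorded request finished within `max_latency` seconds."""
    for client_id, i, start, end in log:
        assert end - start <= max_latency, (client_id, i, end - start)

In the Trac case the trace would of course come from the real log
instead of an in-memory list, but the assertion step would be the same.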

I like it! Damn it! I couldn't sleep last night :P

Thnx very, very, very much!

:o)

-- 
Regards,

Olemis.

Blog ES: http://simelo-es.blogspot.com/
Blog EN: http://simelo-en.blogspot.com/

Featured article:
Nabble - Trac Users - Coupling trac and symfony framework  -
http://feedproxy.google.com/~r/TracGViz-full/~3/hlNmupEonF0/Coupling-trac-and-symfony-framework-td25431579.html


