[TIP] Test coverage and parsing.

Michał Kwiatkowski constant.beta at gmail.com
Mon Oct 5 09:59:05 PDT 2009


On Mon, Oct 5, 2009 at 4:39 PM, Olemis Lang <olemis at gmail.com> wrote:
> Well ... something like that. What I'd like to do (at least in my dreams) is:
>
>  - Provide a concrete result set
>  - Perhaps classification (clustering) of data may be useful as
>    an input to the subsequent steps ;o)
>  - The «tool I'm looking for» will generate the following:
>    * relevant SQL queries
>    * stream of tokens (in order to test the Pygments parser)
>    * parsing results (not an AST; for example, if the SQL query is
>      `select col1, col2` then I just need to know that
>      `['col1', 'col2']` are the columns specified in the
>      SELECT statement)
>    * the result of applying that query to the base result
>  - Coverage and level-of-detail would be nice too ;o).
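
For the token-stream bullet specifically, you can smoke-test the
Pygments lexer directly once you have a query generator (more on
generation below). A minimal sketch, assuming a hypothetical
random_query() that produces SQL text:

from pygments.lexers import SqlLexer
from pygments.token import Error

def assert_lexes_cleanly(query):
    # The lexer should never emit Error tokens for a query
    # the generator considers valid.
    for ttype, value in SqlLexer().get_tokens(query):
        assert ttype is not Error, "lexer choked on %r" % value

for _ in range(100):
    assert_lexes_cleanly(random_query())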

The Haskell world has a good approach to generating test data from a
specification - the library is called QuickCheck. There is also a
Python derivative called PeckCheck
(http://www.accesscom.com/~darius/software/clickcheck.html). The
basic trick is composable test data generators: once you have those,
you can concisely express properties your code should satisfy.
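
To show what "composable" means, here is a hand-rolled illustration
(this sketches the idea only, not peckcheck's actual internals):

import random, string

def a_string(size):
    # A short lowercase identifier.
    return ''.join(random.choice(string.ascii_lowercase)
                   for _ in range(random.randint(1, max(1, size))))

def a_list(gen):
    # Lift a generator for one value into a generator for lists.
    def generate(size):
        return [gen(size) for _ in range(random.randint(0, size))]
    return generate

Since a_list(a_string) is itself just another generator, nesting and
combining come for free.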

To take your SQL example, in peckcheck syntax you could write:

from peckcheck import TestCase, a_list, a_string  # assumed exports

class TestSql(TestCase):
    def testColumns(self, columns=a_list(a_string)):
        # parse() is your SQL parser under test.
        query = "SELECT %s FROM dual" % ', '.join(columns)
        assert parse(query).columns == columns

That checks that, for every generated column list, your parser
recognizes the columns correctly. It still forces you to restate
your parsing rules as test properties, but at least the generation
framework is done for you.
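
To make that concrete end-to-end, here is a deliberately naive
parse() to run it against (hypothetical; a real SQL parser would do
much more):

import re

class Query(object):
    def __init__(self, columns):
        self.columns = columns

def parse(query):
    # Take whatever sits between SELECT and FROM, split on commas.
    select_list = re.match(r'SELECT (.*) FROM', query).group(1)
    return Query([col.strip() for col in select_list.split(',')])

Incidentally, a run like this tends to expose edge cases right away:
if the generator produces an empty column list, the query becomes
"SELECT  FROM dual" and the naive parser returns [''] instead of [],
exactly the kind of corner a hand-picked example suite would miss.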

Cheers,
mk


