[TIP] Pandokia - more info

Mark Sienkiewicz sienkiew at stsci.edu
Thu Apr 9 15:02:39 PDT 2009

Doug Philips wrote:
> From what I've read, the set of possible fields is fixed by the DB back-end that is being used, but for the purposes of structure and parsing, the fields are named by convention?

There are field names that are fixed: test_name, test_run, status, etc 
have specific meanings.

Other field names are chosen by the test author.  For example, any name 
starting "tda_" is a test definition attribute, so the number of 
possible names is huge.  The number of attributes that you report for a 
specific test is also arbitrary.

In my nightly test runs, there are attributes that are reported by many 
tests and other attributes that are reported by few tests.  Some tests 
report no attributes, other report as many as 20-something.

> The only simplification I've thought of so far is to reduce the variability. Not sure I like where that is headed, except perhaps the that the destination would be the very minimalistic: <test #> : <status>

All you have to do is decide which information you would like to do 
without. :)

Actually, if you look at the table result_scalar at 
http://stsdas.stsci.edu/pandokia/initdb.sql , you see what I had left 
after I trimmed the record about as much as I can.  ( The other record 
fields not in that table are the TDA and TRA attributes, and "log" which 
is basically stdout/stderr from running the test. )

The 4 test identity fields (test_run, host, project, test_name) and the 
status are required.  Everything else is optional.  I need all the 
information that the file format can convey, but if you don't, your test 
runner does not have to report it and/or your test reporting system does 
not have to receive it.  The only penalty for not collecting the 
information is that you don't have it. :)

(Let me mention here that "host" is actually a poor name for "execution 
environment", and we plan to fix that.  It just happens that I have a 
distinct machine for each of my 8 execution environments.)

> One of the (personal) amusements for me at PyCon 2009 was the density of invocation/reference to json. XML without the sharp corners as I overheard more than once. This protocol seems like a minor simplification of json, and I wonder if using json would be simpler (re-use) or not (thinking out loud here...)?

I really like JSON, especially if the alternative is XML.  JSON is a 
format for storing record-structured data, not a markup language, so 
everything about it is designed for "field = value".  It is way cleaner 
than XML.  Just what I need, right?  Well, except that it is a bit more 
complicated to read than bunch of lines that say "name=value\n".  I 
didn't see that buzzword compliance was worth the added complexity (of 
both the reader and writer).

Of course, this format is easy to parse because the pandokia record is 
flat.  If it contained nested objects, lists, and so on, JSON would be 
the way to go.

>> I'm not that quick to look for a database either.  I've lost count of 
>> how many times I've passed on even evaluating a free software package 
>> just because it needed a working mysql to install it.
> Yup, done the same. It isn't fear of databases. It is "why do I need to bother with that?" 

Precisely.  And it isn't just databases.  If you need a new windowing 
toolkit that comes as a 45 meg source distribution, I'm gone too.

> However, I'm most interested in the protocol at the moment, finding something that will be generally agreed on so that "we all" can go off and build toolchains on top of.

In that case, we should talk off-list about what fields I included and 
why, and you can tell me what you do/don't need.

>> I also expect that pandokia will be adaptable to other database 
>> engines.  This may be an advantage to somebody who has large user base.
> Or maybe even not use a DB at all. At work we already have a web/GUI "thang" that other teams use to report their test results. I haven't looked at the details of it, but my goal is to consider that as -one- potential downstream target.

I mostly intended to receive reports from any possible test runner, but 
that is because I have multiple test runners to deal with.  If you have 
a working reporting system that already meets your needs, it makes sense 
to use it.

Mark S.

More information about the testing-in-python mailing list