[TIP] Pandokia - more info
sienkiew at stsci.edu
Thu Apr 9 15:02:39 PDT 2009
Doug Philips wrote:
> From what I've read, the set of possible fields is fixed by the DB back-end that is being used, but for the purposes of structure and parsing, the fields are named by convention?
There are field names that are fixed: test_name, test_run, status, etc
have specific meanings.
Other field names are chosen by the test author. For example, any name
starting "tda_" is a test definition attribute, so the number of
possible names is huge. The number of attributes that you report for a
specific test is also arbitrary.
In my nightly test runs, there are attributes that are reported by many
tests and other attributes that are reported by few tests. Some tests
report no attributes; others report as many as 20-something.
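As a rough illustration of the "tda_" naming convention described above, here is a minimal sketch in Python. The function name split_attributes, the sample attribute names, and all sample values are made up for the example; only the "tda_" prefix and the fixed field names come from the description above.

```python
# Sketch only: separate test definition attributes (names starting
# with "tda_") from the other fields of one flat record.
TDA_PREFIX = "tda_"

def split_attributes(record):
    """Return (tda, other): tda holds the attributes with the prefix
    stripped, other holds every remaining field unchanged."""
    tda = {}
    other = {}
    for name, value in record.items():
        if name.startswith(TDA_PREFIX):
            tda[name[len(TDA_PREFIX):]] = value
        else:
            other[name] = value
    return tda, other

# Hypothetical record; a test may report any number of attributes.
record = {
    "test_name": "example/test_add",
    "status": "pass",
    "tda_tolerance": "1e-6",
    "tda_input": "a.fits",
}
tda, other = split_attributes(record)
```

Because the attribute names are open-ended, the sketch keeps them in a plain dict rather than fixed columns.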
> The only simplification I've thought of so far is to reduce the variability. Not sure I like where that is headed, except perhaps that the destination would be the very minimalistic: <test #> : <status>
All you have to do is decide which information you would like to do
without.
Actually, if you look at the table result_scalar at
http://stsdas.stsci.edu/pandokia/initdb.sql , you see what I had left
after I trimmed the record about as much as I could. (The other record
fields not in that table are the TDA and TRA attributes, and "log", which
is basically stdout/stderr from running the test.)
The 4 test identity fields (test_run, host, project, test_name) and the
status are required. Everything else is optional. I need all the
information that the file format can convey, but if you don't, your test
runner does not have to report it and/or your test reporting system does
not have to receive it. The only penalty for not collecting the
information is that you don't have it. :)
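A receiving system could check the required fields with something like the following Python sketch. The five field names are the ones listed above; the function name missing_fields, the sample values, and the choice to treat an empty value as missing are assumptions made for the example, not part of pandokia.

```python
# Sketch only: report which required fields a record lacks.
REQUIRED_FIELDS = ("test_run", "host", "project", "test_name", "status")

def missing_fields(record):
    """Return the required field names that are absent or empty,
    in the order they appear in REQUIRED_FIELDS."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]

# Hypothetical records with made-up values.
complete = {
    "test_run": "nightly_2009-04-09",
    "host": "machine1",
    "project": "example",
    "test_name": "example/test_add",
    "status": "pass",
}
partial = {"test_name": "example/test_add", "status": "pass"}

problems = missing_fields(partial)
```

Any optional field is simply passed through; nothing in the check depends on it.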
(Let me mention here that "host" is actually a poor name for "execution
environment", and we plan to fix that. It just happens that I have a
distinct machine for each of my 8 execution environments.)
> One of the (personal) amusements for me at PyCon 2009 was the density of invocation/reference to json. XML without the sharp corners as I overheard more than once. This protocol seems like a minor simplification of json, and I wonder if using json would be simpler (re-use) or not (thinking out loud here...)?
I really like JSON, especially if the alternative is XML. JSON is a
format for storing record-structured data, not a markup language, so
everything about it is designed for "field = value". It is way cleaner
than XML. Just what I need, right? Well, except that it is a bit more
complicated to read than a bunch of lines that say "name=value\n". I
didn't see that buzzword compliance was worth the added complexity (of
both the reader and writer).
Of course, this format is easy to parse because the pandokia record is
flat. If it contained nested objects, lists, and so on, JSON would be
the way to go.
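To make the comparison concrete, here is a minimal Python sketch of reading one flat record made of "name=value" lines. It is not the pandokia reader: the function name parse_record and the sample values are invented, and details such as how records are separated or how multi-line values are handled are deliberately left out.

```python
import json

def parse_record(text):
    """Parse one flat record of name=value lines into a dict.
    The split is on the first "=", so values may contain "="."""
    record = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError("not a name=value line: %r" % line)
        record[name.strip()] = value
    return record

# Hypothetical sample record.
sample = "test_name=example/test_add\nstatus=pass\ntda_note=x=1\n"
record = parse_record(sample)

# The same flat record also round-trips through JSON without loss.
as_json = json.dumps(record)
```

For a flat record the reader is a dozen lines with no library beyond string handling; JSON would start to pay for itself once values need to be nested objects or lists.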
>> I'm not that quick to look for a database either. I've lost count of
>> how many times I've passed on even evaluating a free software package
>> just because it needed a working mysql to install it.
> Yup, done the same. It isn't fear of databases. It is "why do I need to bother with that?"
Precisely. And it isn't just databases. If you need a new windowing
toolkit that comes as a 45 meg source distribution, I'm gone too.
> However, I'm most interested in the protocol at the moment, finding something that will be generally agreed on so that "we all" can go off and build toolchains on top of.
In that case, we should talk off-list about what fields I included and
why, and you can tell me what you do/don't need.
>> I also expect that pandokia will be adaptable to other database
>> engines. This may be an advantage to somebody who has a large user base.
> Or maybe even not use a DB at all. At work we already have a web/GUI "thang" that other teams use to report their test results. I haven't looked at the details of it, but my goal is to consider that as -one- potential downstream target.
I mostly intended to receive reports from any possible test runner, but
that is because I have multiple test runners to deal with. If you have
a working reporting system that already meets your needs, it makes sense
to use it.