[TIP] Result protocol / parsers, wire format

Mark Sienkiewicz sienkiew at stsci.edu
Mon Apr 13 09:33:10 PDT 2009


Jesse Noller wrote:
> On Sat, Apr 11, 2009 at 11:36 PM, Douglas Philips <dgou at mac.com> wrote:
>
>   
>> The Pandokia format is very close to .ini style. Hypothetical query:
>> if it were .ini, would that make any difference (it wouldn't be custom
>> any more, n'est-ce pas?)
>>
>>     
>
> And INI files have some annoying as hell restrictions. YAML supports
> conversion from items within the file to native data types, such as
> lists, ints, and so on. I don't want to have to parse it and do the
> conversion myself. Many people have written custom parsers to do this
> conversion for you, such as Michael (configobj), but again, why not use
> something which supports conversion to native data types so that the
> job is easier on the parser and emitter?
>   

The pandokia format actually has no relationship at all to .ini files.  
It is just coincidence that both formats are 1) line oriented, and 2) 
contain lines that look like "name=value".

I chose it for these reasons:

1. Extremely simple to parse.  Compare with XML, where your first step 
is to spend 2 days trying to figure out how to use the XML library.

2. Files can be concatenated with standard tools like cat.

3. If you have a partial file written by a crashed test runner, you can 
append another file, and you can still read everything except the 
incomplete record at the end of the corrupted file.

4. It can contain field names that are chosen at run time by the test.

Currently, every field is plain text except for start_time and end_time.
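
To make reasons 1 through 4 concrete, here is a minimal Python sketch of 
a reader for this kind of line-oriented format.  The START and END marker 
lines and the field names (test_name, status, my_runtime_field) are 
assumptions made up for this illustration; they are not necessarily the 
actual pandokia syntax.  The sketch also assumes a crashed writer stopped 
at a line boundary.

```python
def parse_records(text):
    """Yield one dict per complete record; drop incomplete ones.

    Hypothetical layout: each record is a START line, any number of
    name=value lines, and an END line.
    """
    record = None  # None means we are not inside a record
    for line in text.splitlines():
        line = line.strip()
        if line == "START":
            # A new record begins.  If an earlier record was still open,
            # its writer never finished it, so it is discarded here.
            record = {}
        elif line == "END":
            if record is not None:
                yield record
            record = None
        elif record is not None and "=" in line:
            # Field names are not fixed in advance; any name is accepted.
            name, value = line.split("=", 1)
            record[name] = value  # every value stays a plain string


good = "START\ntest_name=a\nstatus=pass\nEND\n"
# This runner crashed while writing its second record (no END line).
crashed = "START\ntest_name=b\nstatus=pass\nEND\nSTART\ntest_name=c\n"
later = "START\ntest_name=d\nmy_runtime_field=42\nEND\n"

# Plain string concatenation stands in for "cat good crashed later".
records = list(parse_records(good + crashed + later))
names = [r["test_name"] for r in records]
```

In this sketch, records a, b, and d are recovered and the unfinished 
record c is the only one lost.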

For my purposes, JSON and YAML are a lot like pandokia's format, except:

- braces before/after each record, quotes around strings

- poor/no error recovery

- nested records (which I do not need)

- data types (though the only data type in JSON that I used is "string")
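
As a rough illustration of the error recovery point, the Python sketch 
below feeds a truncated JSON document and two concatenated JSON documents 
to the standard library's json.loads.  The record contents and the 
try_load helper are made up for this example.  It only shows the behavior 
of a whole-document load; an incremental decoder could be written to do 
better.

```python
import json

# Hypothetical test results, stored as one JSON list of records.
records = [
    {"test_name": "a", "status": "pass"},
    {"test_name": "b", "status": "fail"},
]
complete = json.dumps(records)


def try_load(text):
    """Return the parsed document, or None if json.loads rejects it."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


# Simulate a writer that crashed before finishing the last record.
truncated = complete[:-10]
# Simulate "cat file1 file2" on two complete JSON documents.
concatenated = complete + complete

loaded_complete = try_load(complete)
loaded_truncated = try_load(truncated)
loaded_concatenated = try_load(concatenated)
```

Only the complete document loads.  The truncated one is rejected as a 
whole, including its first, fully written record, and the concatenated 
one is rejected because of the extra data after the first document.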

I looked at yaml.org and followed a few links there to find out about 
YAML.  My initial impression is that YAML has these drawbacks:

- "Warning: It is not safe to call yaml.load with any data received from 
an untrusted source!" (http://pyyaml.org/wiki/PyYAMLDocumentation)  This 
differs from JSON, which is always safe to load.

- YAML seems to have more of a learning curve associated with it.  At 
least, that is my impression from looking over the yaml and json web 
sites.  (Maybe the yaml documentation just isn't as clear.)

If YAML has reasonable error recovery, it might be worth the learning 
curve.  Otherwise, JSON might be more appropriate, just for the 
simplicity.  Of course, YAML parsers can read JSON, so if we emit JSON, 
then a YAML application can read it.

Mark S.



