[TIP] branch coverage

Mon Jan 28 16:57:25 PST 2008

I have a preliminary PLY parser for Python.  I'm working on the code  
instrumentation part.  I'm not sure about what to trace, and am  
looking for feedback.

Take this as an example

1 + (a or b) + stop() + 1/0; print "Done."

def stop(): raise SystemExit()

The code ' + 1/0; print "Done."' never gets executed because stop()
raises an exception.

There seem to be a few things I can trace.  These are:

  - statement is reached (but not necessarily fully executed)

reached("expr-statement line 1 (0:27)")
1 + (a or b) + stop() + 1/0
reached("print-statement line 1 (29:40)")
print "Done."

This is comparable to what's done now, with the advantage that I can  
instrument the .pyc file and do full coverage of Python standard  
library components like 'os.py'.
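
A minimal sketch of this first option, not the PLY-based tool described
here: it assumes a current Python 3 interpreter (hence print() as a
function) and rewrites source with the standard-library ast module
rather than instrumenting a .pyc file. The names StatementTracer and
run_instrumented are made up for the example, and the labels use ast
node class names instead of the 'expr-statement' style above.

```python
import ast


class StatementTracer(ast.NodeTransformer):
    """Insert a reached("...") call in front of every statement."""

    def generic_visit(self, node):
        super().generic_visit(node)  # instrument nested blocks first
        for field in ("body", "orelse", "finalbody"):
            stmts = getattr(node, field, None)
            # Skip expression-valued fields such as IfExp.body.
            if not (isinstance(stmts, list) and stmts
                    and isinstance(stmts[0], ast.stmt)):
                continue
            traced = []
            for stmt in stmts:
                label = "%s line %d (%d:%d)" % (
                    type(stmt).__name__, stmt.lineno,
                    stmt.col_offset, stmt.end_col_offset)
                call = ast.Expr(ast.Call(
                    ast.Name("reached", ast.Load()),
                    [ast.Constant(label)], []))
                traced.append(ast.copy_location(call, stmt))
                traced.append(stmt)
            setattr(node, field, traced)
        return node


def run_instrumented(source):
    """Run `source` with tracing; return the labels of reached statements."""
    log = []
    tree = StatementTracer().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    try:
        # `reached` is a hypothetical hook; here it just appends to a list.
        exec(compile(tree, "<instrumented>", "exec"), {"reached": log.append})
    except SystemExit:
        pass  # stop() ends the program early, as in the example
    return log


# The example from this post, in Python 3 syntax, with a and b given values.
SOURCE = (
    "def stop(): raise SystemExit()\n"
    "a, b = 0, 2\n"
    '1 + (a or b) + stop() + 1/0; print("Done.")\n'
)
log = run_instrumented(SOURCE)
```

The expression statement on line 3 shows up in the log as reached even
though it never finishes, and the print statement never appears. A real
tool would also have to cope with docstrings and __future__ imports,
which this sketch ignores.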

   - track the start of all normal code execution branches (if, while,
short-circuit, etc.)

1 + ($(a:or-test 5:6) or $(b:or-test 10:11)) + stop() + 1/0
print "Done."

where the $() is a syntax I just made up, but I hope it's  
understandable.  If you want it expressed in Python code, it's  
something like:

def _trace_expr1():
     stack = []
     stack.append(1)
     stack.append(a)
     if not stack[-1]:
       was_false("or-test 5:6")
       stack[-1] = b
       if not stack[-1]:
         was_false("or-test 10:11")
         # 'a or b' evaluates to b itself here, so the stack is left alone
       else:
         was_true("or-test 10:11")
     else:
       was_true("or-test 5:6")

     stack[-2:] = [stack[-2]+stack[-1]]
     stack[-1] = stack[-1] + stop() + 1/0
     return stack[-1]
_trace_expr1()
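
The hand-expanded stack code above can also be written as a small
runtime helper. This is only a sketch: traced_or and branch_log are
hypothetical names, was_true and was_false are given trivial
definitions, operands are wrapped in lambdas to keep the short-circuit
behaviour, and it is written for Python 3.

```python
branch_log = []


def was_true(label):
    branch_log.append((label, True))


def was_false(label):
    branch_log.append((label, False))


def traced_or(*operands):
    """Evaluate `x or y or ...`, recording the truth value of each operand.

    Each operand is a (label, thunk) pair. The thunk is only called when
    the operands before it were all false, as with a plain `or`.
    """
    value = None
    for label, thunk in operands:
        value = thunk()
        if value:
            was_true(label)
            return value
        was_false(label)
    return value  # every operand was false: `or` yields the last one


def demo():
    a, b = 0, 2
    return 1 + traced_or(("or-test 5:6", lambda: a),
                         ("or-test 10:11", lambda: b))


result = demo()
```

With a = 0 and b = 2 this records that the first or-test was false and
the second was true. Wrapping every operand in a lambda costs more calls
than the inline version, so it is mainly useful for checking what a
report should contain.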

   - Every operation is completed

    reached("expr-statement line 1 (0:27)")
    stack = []
    stack.append(1)
    stack.append(a)
    success("name line 1 (5:6)")
    if not stack[-1]:
      stack[-1] = b
      success("name line 1 (10:11)")
      # 'a or b' evaluates to b even when b is false; nothing more to do

    stack[-2:] = [stack[-2] + stack[-1]]
    success("add line 1 (0:12)")

    stack.append(stop())
    success("call line 1 (15:21)")
    stack[-2:] = [stack[-2] + stack[-1]]
    success("add line 1 (0:21)")
    stack.append(1)
    stack.append(0)
    stack[-2:] = [stack[-2] / stack[-1]]
    success("div line 1 (24:26)")
    stack[-2:] = [stack[-2] + stack[-1]]
    success("add line 1 (0:26)")
    stack.pop()
    finished("expr-statement line 1 (0:27)")

This is the only approach of the three that can report that stop() was
called but that the '+ 1/0' after it never executed.

This much instrumentation will cause a huge performance hit because  
each of the logging calls corresponds to a function call.

Another problem with the last approach is that sometimes you *want*
functions to raise an exception, as with:

opt_parser.error("cannot mix --oil and --water command-line parameters")

or in this somewhat unusual case (IMO poor style):

    a = x or y or die("either 'x' or 'y' must be true")

Exceptions will likely cause a lot of false positives, and I don't  
like that.  It means people won't use this sort of tool.

The short version of this question might be: given the Python program

   a = 0
   try:
     a or b()
   except NameError:
     pass

how important is it to notice that the () in b() is never called?
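
To make the question concrete, here is a small sketch in Python 3. The
pass-through helper success() is a simplified stand-in for the logging
call used earlier, and the hand-written EXPECTED list stands in for what
an instrumenter would generate; both are assumptions for illustration.
It lists which operations of 'a or b()' never completed.

```python
completed = []

# Operations an instrumenter would expect to see for `a or b()`.
EXPECTED = ["name a", "name b", "call b()"]


def success(label, value):
    """Record that an operation finished, then pass its value through."""
    completed.append(label)
    return value


def run():
    a = 0
    try:
        # `b` is deliberately undefined, so looking it up raises NameError
        # before the call itself is ever attempted.
        success("name a", a) or success("call b()", success("name b", b)())
    except NameError:
        pass
    return [label for label in EXPECTED if label not in completed]


unfinished = run()
```

Both 'name b' and 'call b()' come back as unfinished, even though the
try/except suggests the NameError was anticipated. Whether the second
one is worth reporting is exactly the question.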

Hmm... I could have something which reports that the *last*  
instruction in a statement is executed, and just not care if it  
throws an exception.  Nahh, I need to do that on every branch, so  
that the end of all branches of

   1 + (a or b()) + (c or d())

is checked.
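
A sketch of that per-branch idea, in Python 3 with hypothetical helpers:
last() records the final operation of a branch before running it, so
the branch counts as covered even when that operation raises. Here b
and d are passed in as callables, and only the right-hand side of each
'or' is tracked, to keep the example short.

```python
ALL_BRANCH_ENDS = {"b()", "d()"}
branch_ends = set()


def last(label, func):
    """Record the final operation of a branch, then run it."""
    branch_ends.add(label)  # logged first, so an exception cannot hide it
    return func()


def die():
    raise ValueError("expected failure")


def evaluate(a, b, c, d):
    # 1 + (a or b()) + (c or d()), with the end of each 'or' branch marked
    return 1 + (a or last("b()", b)) + (c or last("d()", d))


try:
    evaluate(0, die, 0, lambda: 2)
except ValueError:
    pass

never_reached = ALL_BRANCH_ENDS - branch_ends
```

In this run b() raising does not count against coverage, but d() is
still reported as never reached because evaluation stopped before the
second 'or'.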

Does anyone have experience and comments about this?

				Andrew
				dalke at dalkescientific.com