A Java coder admits to liking Python: Simon sez. But...
if the problem is complex, shouldn't the tools you use to solve the problem
also be complex?
Here's a (recently rare) foray into politics for ya, just to keep you on
In response to a post to IP by Hiawatha Bray, I offered
my reasons for disliking both the Bush Administration's approach
to the War on Terror, and the naive "Golden Rule"-ism of some
liberals. (Incidentally, I count myself as a liberal in these
Charlie Stross (a fantastic sci-fi writer) weighed
in on the absurd "asymmetric warfare" statement made about the
four Gitmo suicides.
Authoring for O'Reilly
spoke, so I'll confirm: Grig, Jason and I have contracted to write
a series of short
PDFs for O'Reilly.
The umbrella topic is 'Testing Stuff', and a brief list of individual
topics we hope to cover includes: Intro to Web Testing (covering twill
& intro Selenium); Advanced Web Testing (some twill, lots more
Selenium); Unit Testing in Python; Continuous Integration with
buildbot; Python Testing with FitNesse; and perhaps more. Each book
will cover a discrete chunk of material, and we hope the series will
pull the individual books together into a whole, as well.
The first two, together with a (free) introductory book on testing, should
be available within 3 months.
(And remember, Gentlemen prefer PDFs!)
When last we met, I professed a rewrite
of coverage.py. Since then, I realized that my tokenize-based
method of extracting interesting lines of code ... didn't work. There
were several situations where lines of code simply wouldn't be
counted, because there wasn't enough context to determine whether or
not a given line was an actual expression. Specifically, this kind of code
broke the parser:
(lambda x: x + 1)(1),
y: y * 2)
After flailing a bit, I realized that you really needed the AST to properly
determine what lines of code are worth counting. This realization was
helped by the fact that the sys.settrace 'line' tracing function is only called
on the 'lambda' and 'def' lines, above, and not on the 'a=' and 'b=' code.
(Kudos to Ned Batchelder for including so many nasty evil tests with
coverage.py -- I just stole his code. ;)
I delved into coverage.py and confirmed my suspicion that some nasty
AST visitation was occurring, using code based on the 'compiler' package.
Moreover, coverage.py used code way beyond me to determine what was actually
an executable line of code... and then did even more clever things to
count that code even when the sys.settrace function didn't hit it.
Now, my rule is, if it's too complicated for me to understand, it
shouldn't be in software I write. So I set myself a new goal: make
a really really simple coverage-measuring utility that (a) only
counts lines that Python actually "executes" (as measured by sys.settrace);
and (b) I can understand.
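For anyone who hasn't played with sys.settrace, goal (a) can be sketched roughly like this. This is a minimal illustration, not figleaf's actual code: the names collect_executed_lines and sample are made up for the example, and it simply records every 'line' event that Python reports while a function runs.

```python
import sys

def collect_executed_lines(func, *args, **kwargs):
    """Run func and return the set of (filename, lineno) pairs that fired 'line' events."""
    executed = set()

    def local_trace(frame, event, arg):
        # Called for events inside a traced frame; we only care about 'line'.
        if event == "line":
            executed.add((frame.f_code.co_filename, frame.f_lineno))
        return local_trace

    def global_trace(frame, event, arg):
        # Called once per new frame; returning local_trace turns on
        # line tracing for that frame.
        return local_trace

    previous = sys.gettrace()
    sys.settrace(global_trace)
    try:
        func(*args, **kwargs)
    finally:
        sys.settrace(previous)
    return executed

def sample(n):
    total = 0
    for i in range(n):
        total += i
    return total

hits = collect_executed_lines(sample, 3)
```

Anything that never shows up in the returned set was, as far as sys.settrace is concerned, never executed.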
In the process of working on implementing this with the parser module, I
discovered a few amusing details about Python. First: can you guess
which lines of code are "executed" in the following?
Well, in a bit of a surprise to me, it turns out that only the numbers
>- def f():
(where '>-' represents executed lines). Yep, only the numbers, not the
strings! There are two reasons for this,
I think: one is that each number is actually a numerical expression, to
be evaluated and replaced by its value, while strings are just literals;
and the other is that this doesn't count docstrings.
I'm also a bit surprised by some aspects of the AST that
is generated. For example, here's what my AST pretty-printer
outputs for the number "5", all alone in a file:
NUMBER ('5', 1)
NEWLINE ('', 1)
NEWLINE ('', 1)
ENDMARKER ('', 1)
Is this really a necessary part of the AST?
I clearly need to read up on this more ;).
The last thing I did was build in an optimization: coverage.py uses
a global trace function that is continually reassigned to the local
trace function by calls into new code blocks. This means that all
Python code is traced. Figuring that this would be kind of a speed
drain, I separated out the logic into a global trace function that
only set the local trace function on a call into interesting code,
where "interesting" could be specified by the user. In other words,
rather than tracing coverage on everything, only code executing
in user-specified modules would be traced.
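The split between a global and a local trace function might look something like the sketch below. It is only an illustration of the idea described above, not the figleaf implementation: the class name SelectiveTracer and the is_interesting predicate are hypothetical, and "interesting" is decided purely by filename here.

```python
import sys

class SelectiveTracer:
    """Record executed lines only for code whose filename passes is_interesting."""

    def __init__(self, is_interesting):
        # is_interesting: hypothetical user-supplied predicate taking a filename
        self.is_interesting = is_interesting
        self.executed = set()

    def _global_trace(self, frame, event, arg):
        # Invoked once for each new frame. Returning None means Python
        # generates no line events for that frame at all.
        if self.is_interesting(frame.f_code.co_filename):
            return self._local_trace
        return None

    def _local_trace(self, frame, event, arg):
        if event == "line":
            self.executed.add((frame.f_code.co_filename, frame.f_lineno))
        return self._local_trace

    def start(self):
        sys.settrace(self._global_trace)

    def stop(self):
        sys.settrace(None)
```

The saving comes from the early `return None`: frames in uninteresting code pay for one call into the global trace function and nothing more.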
The early results are pretty positive. With the usual caveats about
naive benchmarking -- which this certainly is -- I found the following
times for running the twill
tests in nose:
- coverage.py -- 30 seconds total
- figleaf.py, tracing *all* code -- 23 seconds total
- figleaf.py, tracing *only* user code -- 15 seconds total
- no coverage analysis at all -- 7 seconds total.
Taken at face value, switching to my simple AST implementation saves
about 23% of coverage.py's running time (30 down to 23 seconds), and
tracing only user code saves roughly another 27% (23 down to 15
seconds). Neat, huh? (Sadly, you still
lose a factor of 2 because of the code coverage!)
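The arithmetic behind those percentages, worked through with the four timings listed above. One assumption to flag: both savings are taken relative to coverage.py's 30 seconds, which is a guess at the intended baseline.

```python
# Timings in seconds, copied from the list above.
timings = {
    "coverage.py": 30,
    "figleaf, all code": 23,
    "figleaf, user code only": 15,
    "no coverage": 7,
}

base = timings["coverage.py"]

# Fraction of coverage.py's time saved by the simpler AST approach.
ast_saving = (base - timings["figleaf, all code"]) / base

# Additional fraction saved by tracing only user code (same baseline, by assumption).
filter_saving = (timings["figleaf, all code"]
                 - timings["figleaf, user code only"]) / base

# Remaining slowdown compared with running no coverage analysis at all.
overhead = timings["figleaf, user code only"] / timings["no coverage"]

print(f"{ast_saving:.0%} {filter_saving:.0%} {overhead:.1f}x")  # prints "23% 27% 2.1x"
```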
Here's my (hideously ugly and un-re-factored, yes) implementation of
a class to turn code into "interesting line numbers":
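import parser, sets, symbol, token, types

class _LineGrabber:
    def __init__(self, fp):
        self.lines = sets.Set()
        ast = parser.suite(fp.read())
        tree = parser.ast2tuple(ast, True)
        self.find_terminal_nodes(tree)    # assumed: the call that starts the walk

    def find_terminal_nodes(self, tup):
        """
        Recursively eat an AST in tuple form, finding the first line
        number for "interesting" code.
        """
        (sym, rest) = tup[0], tup[1:]
        line_nos = []
        if type(rest[0]) == types.TupleType:    ### node
            for x in rest:
                min_line_no = self.find_terminal_nodes(x)
                if min_line_no is not None:
                    line_nos.append(min_line_no)    # assumed
            if symbol.sym_name[sym] in ('stmt', 'suite', 'lambdef',
                                        'except_clause') and \
               line_nos:                            # assumed
                # store the line number that this statement started at
                self.lines.add(min(line_nos))       # assumed
            if line_nos:                            # assumed
                return min(line_nos)
        else:                                   ### leaf
            # the rest of this token list is cut off in the listing
            if sym not in (token.NEWLINE, token.STRING, token.INDENT):
                return tup[2]                   # assumed: the token's line number

## use like so:
lines = _LineGrabber(open(filename)).lines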
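On Python versions where the parser module is no longer available (it was removed in Python 3.10), a loosely comparable sketch can be written with the standard ast module. This is a swapped-in technique, not a port of the class above: the function name interesting_lines is made up, it takes source text rather than a file object, and it only records where statements start while skipping bare string expressions such as docstrings. It does not reproduce the lambdef or except_clause handling.

```python
import ast

def interesting_lines(source):
    """Return the set of line numbers on which a statement starts."""
    lines = set()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.stmt):
            continue
        # Skip bare string expressions (docstrings and the like).
        if (isinstance(node, ast.Expr)
                and isinstance(node.value, ast.Constant)
                and isinstance(node.value.value, str)):
            continue
        lines.add(node.lineno)
    return lines

example = 'def f():\n    "doc"\n    x = 1\n    return x\n'
found = interesting_lines(example)
```

Whether these line numbers match what sys.settrace actually reports is a separate question, and one that depends on the interpreter version.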
I will be eternally grateful to anyone who points out why this is a stupid
way to do things, and/or can improve the logic. (I already know it's
I'll post the full figleaf module sometime soon; right now it's too dangerous
to let loose on the Internet. If you're willing to handle such dangerous
material, just drop me a line.