Advogato: Blog for titus

Robert Jordan is ill

For some reason I hadn't heard this yet -- Robert Jordan has amyloidosis!

pinocchio oops

Max discovered pinocchio, my package of nose extensions, and then pointed out that I hadn't put any version info or contact info on my page for

pinocchio.

Oops. Fixed. Sorry 'bout that!

Code coverage

I've been using Ned Batchelder's code coverage module for a while now, and it's been great. We used a slightly hacked version for the agile testing tutorial, and now I need to do even more hacking on it.

I decided that rather than serially refactoring the code I'd swipe a few of the clever bits and do a complete rewrite. This effectively makes it a complete fork. I decided upon this tack because in my previous hacking I spent a lot of time struggling with the basic design of the module, and while the clever bits are pretty isolated and portable, the rest -- path munging, option handling, etc. -- is what I want to change in the first place.

Of course, immediately after deciding to steal some of the code, I ended up rewriting most of it. Sigh.

One of the main clever bits in coverage.py was the AST traversal code that decided which statements were potentially executable; this section used the compiler module. I'd heard somewhere that this module was deprecated, or unreliable, so I looked for some alternatives.

I put in some work on it last night, and arrived at the following function to extract interesting lines of code using tokenize:

class _TokeneaterObj:
    def __init__(self):
        self.lines = sets.Set()
        self.start_line = None
        self.ignore = (tokenize.COMMENT, token.NEWLINE,  token.INDENT,
                       token.DEDENT, token.ENDMARKER, tokenize.NL,
		       token.STRING)

     def tokeneater(self, *a):
        token_type, s, (srow, scol), (erow, ecol), logical_line = a

         if token_type == token.NEWLINE:
            if self.start_line is not None:
                self.lines.add(self.start_line)
                self.start_line = None

         elif token_type not in self.ignore:
            if self.start_line is None:
                self.start_line = srow


 def get_lines(fp):
    t = _TokeneaterObj()
    tokenize.tokenize(fp.readline, t.tokeneater)
    return t.lines

I don't know if this will be a good choice, long term. I have to write some tests... Any better ideas? (Let me know.)

My goals in this rewrite are a better interface for large projects & simplified filename handling. Switching to using sets and tokenize may be simple side-benefits, or perhaps costly diversions ;).

And then, the eternal dilemma -- what should I call it? Grig, my inane bozo of a friend, suggested 'figleaf'. I like it. (Runners up were 'blanket' (Diane) and 'wet blanket' (me).)

--titus

6 Jun 2006 titus » (Journeyer)