6 Jun 2006 titus   » (Journeyer)

Robert Jordan is ill

For some reason I hadn't heard this yet -- Robert Jordan has amyloidosis!

pinocchio oops

Max discovered pinocchio, my package of nose extensions, and then pointed out that I hadn't put any version info or contact info on my page for pinocchio.

Oops. Fixed. Sorry 'bout that!

Code coverage

I've been using Ned Batchelder's code coverage module for a while now, and it's been great. We used a slightly hacked version for the agile testing tutorial, and now I need to do even more hacking on it.

I decided that rather than serially refactoring the code I'd swipe a few of the clever bits and do a complete rewrite. This effectively makes it a complete fork. I decided upon this tack because in my previous hacking I spent a lot of time struggling with the basic design of the module, and while the clever bits are pretty isolated and portable, the rest -- path munging, option handling, etc. -- is what I want to change in the first place.

Of course, immediately after deciding to steal some of the code, I ended up rewriting most of it. Sigh.

One of the main clever bits in coverage.py was the AST traversal code that decided which statements were potentially executable; this section used the compiler module. I'd heard somewhere that this module was deprecated, or unreliable, so I looked for some alternatives.
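For comparison, here is a rough sketch of the AST-walking idea. It is not coverage.py's actual code: it assumes a later Python where the standard library `ast` module stands in for `compiler`, and the helper name `statement_lines` is made up for illustration.

```python
import ast


def statement_lines(source):
    """Return the set of line numbers on which a statement node starts."""
    tree = ast.parse(source)
    # ast.walk visits every node; keep only statement nodes (ast.stmt).
    return {node.lineno for node in ast.walk(tree)
            if isinstance(node, ast.stmt)}
```

Note that a bare docstring is itself an expression statement, so a walk like this reports its line unless you filter it out separately.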

I put in some work on an alternative last night, and arrived at the following class and helper function to extract interesting lines of code using tokenize:

import sets
import token
import tokenize

class _TokeneaterObj:
    def __init__(self):
        self.lines = sets.Set()
        self.start_line = None
        # token types that never start a line of interest
        self.ignore = (tokenize.COMMENT, token.NEWLINE, token.INDENT,
                       token.DEDENT, token.ENDMARKER, tokenize.NL,
                       token.STRING)

    def tokeneater(self, *a):
        token_type, s, (srow, scol), (erow, ecol), logical_line = a

        if token_type == token.NEWLINE:
            # end of a logical line: record the row where it began
            if self.start_line is not None:
                self.lines.add(self.start_line)
            self.start_line = None

        elif token_type not in self.ignore:
            if self.start_line is None:
                self.start_line = srow

def get_lines(fp):
    t = _TokeneaterObj()
    tokenize.tokenize(fp.readline, t.tokeneater)
    return t.lines

I don't know if this will be a good choice, long term. I have to write some tests... Any better ideas? (Let me know.)
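One possible answer for readers on Python 3, where the `sets` module and the callback form of `tokenize.tokenize` no longer exist: the same logic can be sketched with `tokenize.generate_tokens` and a built-in `set`. This is a port under those assumptions, not the code that ended up in the rewrite, and the `IGNORED` name is made up for illustration.

```python
import io
import token
import tokenize

# Token types that never start a line of interest on their own.
IGNORED = frozenset({
    tokenize.COMMENT, tokenize.NL, token.NEWLINE, token.INDENT,
    token.DEDENT, token.ENDMARKER, token.STRING,
})


def get_lines(readline):
    """Return the set of rows on which a logical line of code starts."""
    lines = set()
    start_line = None
    for tok in tokenize.generate_tokens(readline):
        if tok.type == token.NEWLINE:
            # End of a logical line: record the row where it began.
            if start_line is not None:
                lines.add(start_line)
            start_line = None
        elif tok.type not in IGNORED and start_line is None:
            start_line = tok.start[0]
    return lines


example = '''# comment
"""docstring"""
x = 1

def f(a,
      b):
    return a + b
'''
print(sorted(get_lines(io.StringIO(example).readline)))  # [3, 5, 7]
```

Because `token.STRING` is ignored, a line holding only a docstring never starts a logical line, and a statement spread over several physical lines is recorded once, at its first row.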

My goals in this rewrite are a better interface for large projects & simplified filename handling. Switching to using sets and tokenize may be simple side-benefits, or perhaps costly diversions ;).

And then, the eternal dilemma -- what should I call it? Grig, my inane bozo of a friend, suggested 'figleaf'. I like it. (Runners up were 'blanket' (Diane) and 'wet blanket' (me).)

--titus
