28 Nov 2005 titus   » (Journeyer)

Tidy

I spent an hour or two on Sunday adding a tidy preprocessor into twill.

There are a lot of tidy Python implementations out there: ElementTree tidylib, ElementTree TidyTools, pyTidy, mxTidy, and utidylib. Some of them (the ElementTree ones) are part of other packages, or require stuff that I don't want to bundle or depend on (utidylib requires ctypes); most of them require the tidylib binary and then interface with it. Because I want the twill distro to be cross-platform, I decided to go with the approach taken in ElementTree TidyTools, which relies only on the command-line binary. Inspection of the code revealed that it simply called os.system, without much in the way of error trapping, so I ended up rolling my own (search for 'run_tidy'). Whee.
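For the curious, here is roughly what "call the command-line binary, but trap the errors" can look like. This is a minimal sketch in modern Python, not the actual twill code: the function name matches the one mentioned above, but the signature, the 30-second timeout, and the choice to return None on failure are all my assumptions for illustration. The only things it relies on from tidy itself are the '-q' flag and tidy's exit codes (0 for clean, 1 for warnings, 2 for errors).

```python
import subprocess


def run_tidy(html, tidy_cmd="tidy"):
    """Pipe html through the tidy binary.

    Returns (clean_html, diagnostics) on success, (None, diagnostics) if
    tidy reported hard errors, and (None, None) if tidy could not be run.
    """
    try:
        proc = subprocess.run(
            [tidy_cmd, "-q"],      # -q: suppress the non-essential chatter
            input=html,            # tidy reads the page from stdin
            capture_output=True,   # cleaned page on stdout, warnings on stderr
            text=True,
            timeout=30,            # assumed limit, so a hung binary can't stall us
        )
    except (OSError, subprocess.TimeoutExpired):
        # Binary missing, not executable, or hung: the caller falls back
        # to the unprocessed page.
        return None, None

    # tidy exits with 0 (clean), 1 (warnings only) or 2 (errors).
    if proc.returncode not in (0, 1):
        return None, proc.stderr
    return proc.stdout, proc.stderr
```

The point of returning None instead of raising is that a missing tidy binary should never break a test run; the caller just uses the original HTML.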

So, in the next release of twill, it will automatically preprocess stuff with tidy unless you turn it off; you can also assert that pages have no 'tidy' warnings.

Eggz Rock

The (imminent) next release of twill, twill 0.8, will include support for Python Eggs.

When I started, I was worried about a few technical issues: for example, I include pyparsing and mechanize/ClientForm/ClientCookie/pullparser within the twill distribution, and then munge sys.path to load them first. How would this work with eggs? No problem; the same path-munging code works whether I'm loading from a directory or a zip file. (I just use os.path.join.)
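To make the path-munging point concrete, here is a small sketch showing that a sys.path entry built with os.path.join works even when the "directory" is really a zip archive, because Python's zipimport accepts paths that point inside an archive. The helper name, the 'other_packages' subdirectory, and the demo module are hypothetical names picked for the example, not necessarily what twill uses.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def prepend_bundled_path(package_location, subdir="other_packages"):
    """Put <package_location>/<subdir> at the front of sys.path.

    package_location may be a plain directory or a zip/egg file; the
    join is the same either way.
    """
    bundled = os.path.join(package_location, subdir)
    if bundled not in sys.path:
        sys.path.insert(0, bundled)
    return bundled


# Demo: build a throwaway zip containing one bundled module...
tmp_dir = tempfile.mkdtemp()
archive = os.path.join(tmp_dir, "demo.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("other_packages/bundled_demo_mod.py", "VALUE = 42\n")

# ...then load it using exactly the code path a directory would use.
prepend_bundled_path(archive)
importlib.invalidate_caches()
bundled_demo_mod = importlib.import_module("bundled_demo_mod")
```

Because the entry goes in at position 0, the bundled copy wins over any system-wide install of the same package, which is the whole reason for munging the path in the first place.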

Version numbering: would upgrading etc. work nicely? Yep. The pkg_resources version handling is so smart, it's not even inspired. (...by which I mean that it's brilliantly simple.)
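The behavior that matters for upgrades is that pre-release tags sort before the final release with the same number. The sketch below is a deliberately simplified illustration of that ordering, not the pkg_resources algorithm: version_key is a hypothetical helper, it only understands dotted numbers with an optional a/b/rc suffix, and it treats '0.8' and '0.8.0' as different versions.

```python
import re

# Pre-release tags, in the order they should sort; finals come after all of them.
_PRE_RANK = {"a": 0, "b": 1, "rc": 2}
_FINAL_RANK = 3

_VERSION_RE = re.compile(r"(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?")


def version_key(version):
    """Turn '0.8.1a2' into a tuple that sorts in upgrade order."""
    match = _VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError("unsupported version string: %r" % version)
    release = tuple(int(part) for part in match.group(1).split("."))
    if match.group(2):
        stage = (_PRE_RANK[match.group(2)], int(match.group(3)))
    else:
        stage = (_FINAL_RANK, 0)
    return (release, stage)


versions = ["0.8.1", "0.8.1a2", "0.8", "0.8.1a1"]
ordered = sorted(versions, key=version_key)
# ordered == ['0.8', '0.8.1a1', '0.8.1a2', '0.8.1']
```

So somebody with 0.8 installed will see 0.8.1a1 as an upgrade, and somebody running an alpha gets moved to 0.8.1 proper when it ships.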

As a bonus, it will be even easier to distribute "development" versions of twill. I can just build an egg with an alpha version number, e.g. '0.8.1a1' or '0.8.1a2', link to 'em on a page, and then point people to that page. easy_install will do the rest. In fact, I don't even need to build the page manually: I can just tell Apache to make my development dist/ directory available to the public via "Options +Indexes".

For example, typing

easy_install -f http://issola.caltech.edu/~t/twill-dist/ twill

will automatically scan for the latest version and install it. Nifty.
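Conceptually, the "scan" step amounts to reading the links off the index page and keeping the ones that look like distributions of the requested project. Here is a rough sketch of that idea only; it is not how easy_install is implemented. The class and function names and the sample listing (including the egg filenames in it) are made up for the example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def candidate_eggs(index_html, project="twill"):
    """Return the links that look like eggs for the given project."""
    parser = LinkCollector()
    parser.feed(index_html)
    prefix = project + "-"
    return [h for h in parser.hrefs
            if h.startswith(prefix) and h.endswith(".egg")]


# A hypothetical directory listing, loosely in the style of "Options +Indexes".
SAMPLE_INDEX = """
<html><body><h1>Index of /~t/twill-dist</h1>
<a href="?C=N;O=D">Name</a>
<a href="/~t/">Parent Directory</a>
<a href="twill-0.8.1a1-py2.4.egg">twill-0.8.1a1-py2.4.egg</a>
<a href="twill-0.8.1a2-py2.4.egg">twill-0.8.1a2-py2.4.egg</a>
</body></html>
"""

found = candidate_eggs(SAMPLE_INDEX)
```

Picking the newest candidate is then just a matter of comparing the version numbers embedded in the filenames.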

So far, my main gripe? 'ez_setup' is an ugly name, and it's an ugly file to have sitting around in my main development directory. (You may recall that I dislike cluttering up the main directory. So call me picky ;).

--titus
