I spent an hour or two on Sunday adding a tidy preprocessor to twill.
There are a lot of Python interfaces to tidy out there: ElementTree tidylib, ElementTree TidyTools, pyTidy, mxTidy, and utidylib. Some of them (the ElementTree ones) are part of other packages, or require stuff that I don't want to bundle or depend on (utidylib requires ctypes); most of them need the tidylib binary and then interface with it. Because I want the twill distro to be cross-platform, I decided to go with the approach taken in ElementTree TidyTools, which relies only on the command-line binary. Inspection of the code revealed that it simply called os.system, without much in the way of error trapping, so I ended up rolling my own (search for 'run_tidy'). Whee.
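To give a flavor of what "rolling my own" with error trapping means, here is a minimal sketch in modern Python. It is not the actual twill code: only the name run_tidy comes from twill, while the signature, the return convention, and the particular tidy flags chosen are assumptions made for illustration. It assumes a 'tidy' binary may or may not be on the PATH and degrades gracefully when it is missing.

```python
import shutil
import subprocess


def run_tidy(html, tidy_exe="tidy", timeout=10):
    """Pipe HTML through the command-line tidy binary.

    Returns (clean_html, errors). clean_html is None when tidy is
    missing, cannot be run, or reports hard errors.
    (Hypothetical signature; not twill's real one.)
    """
    exe = shutil.which(tidy_exe)
    if exe is None:
        # No binary installed: report it instead of blowing up.
        return None, "tidy executable %r not found on PATH" % tidy_exe

    try:
        proc = subprocess.run(
            [exe, "-q", "-ashtml"],      # quiet mode, emit HTML
            input=html.encode("utf-8"),  # page goes in on stdin
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,      # warnings/errors come out here
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return None, "could not run tidy: %s" % exc

    errors = proc.stderr.decode("utf-8", "replace")
    # tidy exits with 0 (clean), 1 (warnings only) or 2 (errors).
    if proc.returncode not in (0, 1):
        return None, errors
    return proc.stdout.decode("utf-8", "replace"), errors
```

A caller can fall back to the unprocessed page whenever the first element is None, and a "no tidy warnings" assertion only needs to check that the second element is empty.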
So the next release of twill will automatically preprocess pages with tidy unless you turn it off; you can also assert that pages have no 'tidy' warnings.
Eggz Rock
The (imminent) next release of twill, twill 0.8, will include support for Python Eggs.
When I started, I was worried about a few technical issues: for example, I include pyparsing and mechanize/ClientForm/ClientCookie/pullparser within the twill distribution, and then munge sys.path to load them first. How would this work with eggs? No problem; the same path-munging code works whether I'm loading from a directory or a zip file. (I just use os.path.join.)
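Roughly, the path munging looks like the sketch below. The function name and the 'other_packages' subdirectory are placeholders for illustration, not necessarily what twill uses. The point is that the code only does string manipulation on the package's own location, so it doesn't care whether that location is a real directory or a path inside a zipped egg; Python's zipimport machinery accepts sys.path entries that point into an archive.

```python
import os
import sys


def prepend_bundled_packages(package_file, subdir="other_packages"):
    """Put the bundled-packages directory at the front of sys.path.

    package_file is normally the package's __file__. Inside an egg it
    looks like /path/to/foo.egg/pkg/__init__.py, and the joined path
    still works as a sys.path entry.
    (Hypothetical helper; names are illustrative only.)
    """
    base = os.path.dirname(os.path.abspath(package_file))
    bundled = os.path.join(base, subdir)
    if bundled not in sys.path:
        # Insert first so the bundled copies win over installed ones.
        sys.path.insert(0, bundled)
    return bundled
```

Calling this with __file__ at the top of the package's __init__.py, before importing any of the bundled modules, is enough to make the bundled copies load first.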
Version numbering: would upgrading etc. work nicely? Yep. The pkg_resources version handling is so smart, it's not even inspired. (...by which I mean that it's brilliantly simple.)
As a bonus, it will be even easier to distribute "development" versions of twill. I can just build an egg with an alpha version number, e.g. '0.8.1a1' or '0.8.1a2', link to 'em on a page, and then point people to that page. easy_install will do the rest. In fact, I don't even need to build the page manually: I can just tell Apache to make my development dist/ directory available to the public via "Options +Indexes".
For example, typing
easy_install -f http://issola.caltech.edu/~t/twill-dist/ twill
will automatically scan for the latest version and install it. Nifty.
So far, my main gripe? 'ez_setup' is an ugly name, and it's an ugly file to have sitting around in my main development directory. (You may recall that I dislike cluttering up the main directory. So call me picky ;).
--titus
