Older blog entries for titus (starting at number 140)

(Sorry for the long post, I'll soon start putting "articles" on an articles page instead of posting them.)

New net_flow package

My adventures in trustiness continue: Raph Levien, the author of the net_flow C code & the admin of the advogato site, gave me the actual configured capacities of his site. Woot. The code seems to reproduce the actual certification levels of advogato with fairly good accuracy; I suspect my scrape was a bit out of date, which could account for the discrepancies. (I nail the robots.net certifications exactly, and that site is a bit less active, I think.)

Grab it here.

New development server & tools setup: part 1 of inf.

I finally took the plunge and ordered another server from JohnCompanies. I went with the lowest-cost Linux VPS, so I share a machine via virtualization, with 256 mb of RAM guaranteed (up to 1 gb burst), 40 gb of bandwidth a month, and 5 gb of disk space. I ordered it with Debian 3.1. This machine adds to my small collection of servers: I already have a FreeBSD server with them, and I also run a local development server for my lab (RH 7!), an e-mail & Web page server for my family and friends (Debian), and a home MythTV server (Debian). I'm hoping to swap up the servers so that the e-mail server is moved onto a JohnCompanies server; that way I won't have to worry about hardware or backups.

I've had the FreeBSD server with JohnCompanies for several years now, and I share the cost with several people. (The total is something like $40/mo., after a discount because of my open-source work.) JohnCompanies is really great; I get the sense that they know UNIX as well as I do, if not better. Unlike me, however, they have the focus to actually do a good job of network adminning ;). They're incredibly responsive and their servers Just Work. It's exactly what I need. (Although if anyone has any suggestions for similarly priced off-shore servers, I'd appreciate a tip. I'd rather our government be able to legally wiretap me without a warrant, you see ;).

This new server is going to be a svn, darcs, Trac, Web site, domain, and whatnot hosting system. I'm hoping to make it into a bit of a co-op, where friendly like-minded people can host projects; part of the project will be to produce infrastructure to run all this stuff nicely. Grig is already planning to host two domains there. If you're a Python OSS developer who doesn't bite and wants a root shell, Trac+ setup, and a Python-friendly installation, e-mail me to join up. (Note, this is not-for-profit; we're just trying to share out the costs of all our little projects. It may even end up being free, if I can scrape together sufficient google ads revenue over the next few months.)

After JC set up the machine for me, I had a fun-filled evening of setting up software. Briefly, I:

  • upgraded to Debian 'testing' from stable;
  • installed SCGI for Apache 2.x;
  • installed tailor;
  • set up Trac with the WSGI patches.

I had such a good time doing this that I thought I'd give a little "tutorial-style" introduction to two aspects of the setup for which I found relatively little documentation on the 'net.

Using easy_install to manage your Python packages

This is so easy that it's almost silly to write a HOWTO, but ... I didn't realize how easy it was 'til I'd done it ;).

Motivation:

I'd like to give users on this machine the ability to use multiple versions of a Python package.

Solution:

Use easy_install to install everything. easy_install will build eggs for each version of each package and allow the import of specific, distinct versions through the 'require' function.

To install easy_install, I downloaded ez_setup.py and ran it:

python ez_setup.py

This downloaded and installed setuptools.

To set up multiple versions for multiple Python versions, you can do something like this:

# install for 2.3
python2.3 ez_setup.py
mv /usr/bin/easy_install /usr/bin/easy_install2.3

# install for 2.4
python2.4 ez_setup.py
mv /usr/bin/easy_install /usr/bin/easy_install2.4

# default to 2.3
ln -fs /usr/bin/easy_install2.3 /usr/bin/easy_install

You can now use easy_install to install all of the packages that you don't want managed by apt-get (or your package manager of choice).

To install the latest version of a package via PyPi:

easy_install package_name
e.g.

easy_install twill

will install twill 0.8.1.

To install a .tar.gz that may or may not know about setuptools, try:

wget http://darcs.idyll.org/~t/twill-latest.tar.gz
easy_install twill-latest.tar.gz

This will install the latest development version of twill.

To install a specific egg:

wget http://issola.caltech.edu/~t/dist/CherryPy-2.1.0-py2.3.egg
easy_install CherryPy-2.1.0-py2.3.egg

And, finally, you can use easy_install to install directly from an untarred distribution or a repository:

darcs get --partial http://darcs.arstecnica.it/tailor
easy_install tailor

easy_install will go grab tailor/setup.py, grok it, and install both the library code and the scripts.

easy_install: it's that easy ;). Note that I have yet to run into any problems with using it on local files; occasionally it fails to find PyPi packages, or does strange things while scanning random Web pages.

So what drawbacks are there to easy_install? I've only run into two problems: one is that automated tests may not work for packages installed as eggs, e.g. 'tailor test' doesn't work. The other is that 'help' apparently doesn't work, either. Neither are big problems for me, because I don't use 'help' much (emacs can browse zip/egg files just fine, and I prefer reading the source code) and I'm not developing on 'tailor'.

I'll describe using the Python API to setuptools to import only a specific version of packages in a bit; haven't found the online docs for that, otherwise I'd just link to it ;).
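In the meantime, here's a rough sketch of the idea using setuptools' pkg_resources module; the package name and version below are placeholders, not real requirements.

```python
# Sketch: activating one specific installed egg version before importing it.
# The package name/version passed in are placeholders.
import pkg_resources

def require_version(dist, version):
    """Try to activate dist==version; return True on success."""
    try:
        pkg_resources.require("%s==%s" % (dist, version))
        return True
    except pkg_resources.ResolutionError:
        return False

# e.g. require_version("twill", "0.8.1") before 'import twill'
```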

Subversion to Darcs Mirroring with Tailor

The back story:

Devoted readers may recall that I complained about having to maintain my own set of patches to Trac. At the time I said I didn't want to have to convert the Trac Subversion archive to darcs (my versioning system of choice) with tailor, and I also didn't want to fork the Trac code.

A few days ago, Stan Seibert e-mailed me to point out that SVK does a fine job of making "private" copies of svn archives, although in further conversation we agreed that it might not be the best way to make your changes available to other people. (I referred to this practice as a "patch stream", and it's something darcs does very well.) Another obstacle was that SVK required a certain amount of Perl-fu, and my development machine wasn't package-managed any more.

Long story short, I decided to go with tailor on my new Debian server. It's written in Python, it's maintained in darcs, and (best of all!) the author uses it to maintain his own patchset for Trac.

Actual information:

First, I installed tailor:


darcs get --partial http://darcs.arstecnica.it/tailor
easy_install tailor

I then upgraded subversion and darcs from Debian stable to Debian testing. You can do this in several ways with Debian; I chose to reset my system-wide preferences to testing and then do an 'apt-get dist-upgrade', but the simplest way to do it is this, I believe:

apt-get install -t testing svn darcs

(Warning: if you don't upgrade subversion, your first tailor attempt will end with the error that --limit is an unknown flag to svn.)

At this point you're pretty much ready to run tailor, believe it or not! The tailor README advises you to specify command-line arguments corresponding to the configuration you want, and then just use the '--verbose' flag to output a configuration file. That's what I did, but you still need to read quite a bit to figure out exactly what options to use. Luckily for you I've already done the reading -- and here's the configuration to pull a remote SVN repository into Darcs.

File 'trac.tailor':

[DEFAULT]
verbose = True
# CTB *4*
encoding-errors-policy = ignore

[project]
target = darcs:target
start-revision = INITIAL
# CTB *1*
root-directory = /tailor/trac
state-file = tailor.state
source = svn:source
subdir = trac

[darcs:target]
# CTB *2*
repository = /tailor/trac/trac

[svn:source]
# CTB *3*
module = /trunk
repository = http://svn.edgewall.com/repos/trac

There are four configuration points in this file that need discussion.

  1. First, at CTB *1*, the root-directory. Tailor puts all of the files, including the repository, in this directory. It must be writeable by the user running tailor.

  2. Second, at CTB *2*, the target repository directory. As far as I can tell, this is entirely ignored.

  3. Third, at CTB *3*, you need to specify the source repository independently of the module that you're importing. The module doesn't need to be a top-level module, either.

  4. Fourth, at CTB *4*, you will get a unicode exception error midway through your import of Trac if you don't tell it to ignore encoding errors. I'm not sure how to fix this.

Once this configuration file is in place, run tailor trac.tailor, and watch & wait. (It took tailor about 30 minutes to pull in the entire Trac repository of ~2000 patches -- not extraordinarily fast, but you only need to do it once.)

At this point, you have a fully-functional darcs repository. I don't plan to modify it directly -- after all, I don't have check-in access to the Trac archive! -- but you can pull the darcs repository as usual:

darcs get /tailor/trac/trac

and work off of the downstream archive.

In summary, tailor "just worked". Try it out yourself!

I'll make my trac/darcs repository (with the WSGI/SCGI patches in it) available soon; there's still some machine configuration to do before I give out the hostname.

cheers,

--titus

More trustiness

On Sunday, I hacked together a quick Python wrapper for raph's 'net_flow' implementation. Another hour or two of hacking has produced a Python implementation of the advogato trust metric (which actually consists of three distinct trust flows).

Steven Rainwater of robots.net graciously gave me access to his actual robots.net configuration file, and I verified that my net_flow Python code reproduces the actual robots.net certifications. So I'm fairly sure that the code functions properly -- this isn't too surprising, because it really is a very simple wrapper around raph's code.

In any case, you can download a tar.gz here; or, visit the README. It's fun to play with... note that the distribution contains HTML-scraped advogato and robots.net certifications from this Monday, so you can play with the actual network yourself. (Please don't scrape the sites yourself without asking raph or Steven; yes, I transgressed with advogato, but that doesn't mean you should ;)

Relative to raph's recent "ranting", I hope this little package inspires people to play with trust metrics. There are a couple of easy hacks people could do with this code:

  • Write Ruby, Perl, etc. wrappings (mebbe with SWIG);
  • Liberate the code from the GLib 2.0 dependency;
  • Look at the actual topology of the advogato.org network in a variety of ways;
etc.

Incidentally, it seems like I really do think best in code. This little exercise has given me a bunch of ideas, most of which only popped up once I got a working Python API and it was clear just how easy it would be to implement them...

--titus

Link Madness

Bill Moyers on this administration, the press, and secrecy.

Torture's Long Shadow by Vladimir Bukovsky. Whoo.

And, finally, as an antidote: waaaay too cute.

Trustiness

A spot of hacking tonight produced gratifying results. In Python,

from net_flow import TrustNetwork

capacities = [ 20, 7, 2, 1 ]
network = TrustNetwork()

network.add_edge("-", "el_seed")
network.add_edge("el_seed", "test1")
network.add_edge("test1", "test2")
network.add_edge("test3", "test4")
network.add_edge("-", "test4")

network.calculate(capacities)

for user in network:
    print user, '\t', network.is_auth(user)

produces

-               True
el_seed         True
test1           True
test2           True
test3           False
test4           True

which looks more-or-less correct to me; c.f. http://www.advogato.org/trust-metric.html.

The net_flow Python module contains a class wrapper around a hand-written wrapper for mod_virgule's net_flow.c. Using net_flow with a scraped download of the current advogato certification network (sorry, raph...) I can reproduce much of the current list of certified masters. I get 602; the actual list contains 766 members. So obviously I'm still doing something wrong, but probably it's just a matter of reading the mod_virgule source code a bit better.

raph, if you happen to read this, could you verify that the capacities:

capacities = [ 800, 200, 200, 50, 12, 4, 2 ]

and the seeds

network.add_edge("-", "raph")
network.add_edge("-", "miguel")
network.add_edge("-", "federico")
network.add_edge("-", "alan")

are what advogato is currently using, please?

thanks,
--titus

p.s. I'll make net_flow available once I clean it up and validate it a bit more.

16 Dec 2005 (updated 16 Dec 2005 at 19:55 UTC) »
Collaborative Agile Test-Driven Development

Yeah, OK, that's a bit buzzwordy, hey? ;)

In addition to doing a bunch of time-critical experiments, applying for several independent positions, finishing up a consulting job, and trying to stay in shape with running, swimming, and ultimate frisbee, I've been working with Grig on an application for our PyCon tutorial.

I'm in a good mood because we just finished the prototype, which does everything that I want, albeit slowly and badly with an ugly interface.

Our first public release is scheduled for January 5th, at which point we will unveil the application & people can check out the development site and tests.

In the spirit of Grig's post, here are a few observations.

  • Trac does rock. The "view milestone ticket progress, by component" is my favorite view.

  • Being able to bounce ideas off of Grig is fantastic.

  • Being able to assign tasks to Grig via the Trac ticketing system is even more fun, even if it's a bit of a guilty pleasure ;).

  • There are lots of types of testing tools out there, and I don't know if anyone has ever sat down and implemented all of the different types on an open-source project. (e.g. not py.unit and nosetest on the same project, but unit tests, Web tests, performance tests, acceptance tests, log tests, pester-style code-rewriting tests, random form-filling tests, etc.) I doubt it's ever been done for a project this small: I don't think it will be much more than 1000 lines of core application code by the time we're done. The test code will easily outweigh the core application by a factor of 2-3, I bet.

  • Prototyping, ticketing, tests, and wikis all work together in an amazingly synergistic fashion.

Of course, now Grig and I have to worry that we'll unveil this app and people will go "ehh" and wonder why we're so excited. O well, that's a risk we'll have to take... those tutorials are non-refundable, right? ;)

Trac and SCGI/WSGI patches -- or, why to use Darcs

It's mildly annoying (and, more to the point, inconvenient to me ;) that the Trac people aren't checking in the WSGI support stuff. Right now I'm running several trac instances off of subversion latest, with the WSGI/SCGI patches in the development directory. I can't check the patches in myself, because I don't have (or want) Trac subversion access; I don't know if it's even possible to set up a patch stream with svn; forking is stupid; and I sure as hell don't want to set up tailor to convert between svn and darcs. What to do? Wait for Trac 2.0, I guess...

(As my advisor would say, "whine, bitch, moan, complain" (add a dismissive hand-wave in your imagination).)

This leads to a separate point: two of the best Python projects of all time, Trac and Mailman, don't use any of the Python Web frameworks, near as I can tell. From this, I conclude that either all Python Web frameworks suck (because we compulsively (re)invent different wheels) or alternatively Python core is 80% of a Web framework in itself (so people can roll their own in less time than it takes to understand an existing one).
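To illustrate the point, here's a toy sketch of how far plain stdlib Python plus a dict of routes gets you; this is hypothetical demonstration code, not Trac's or Mailman's actual implementation.

```python
# Toy sketch: a minimal WSGI-style dispatcher in plain stdlib Python.
# Routes map a path to a handler function returning a string.
def make_app(routes):
    def app(environ, start_response):
        handler = routes.get(environ.get("PATH_INFO", "/"))
        if handler is None:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"not found"]
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [handler(environ).encode("utf-8")]
    return app
```

Hook that up to the stdlib's own HTTP server and you have most of what a small app needs, which is rather the point.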

Discuss amongst yourselves. (Add another dismissive hand-wave ;).

--titus

Laying eggs

I'm now firmly committed to PJE's setuptools/easy_install. It's invaluable as a way to make precompiled Python distributions available.

As part of our agile development tutorial at PyCon, Grig and I are developing an application. (More on the app anon: our first release is due by the end of the month.) The app depends on Durus and CherryPy, neither of which "just work" with easy_install on Windows.

For Durus, the problem is that it has a binary extension; you need a C compiler to build it.

For CherryPy, there's some issue with SourceForge download page breakage, and maybe something problematic with paths on Windows XP. (I haven't isolated the problem.)

So, I built eggs. I found a colleague upstairs, Brandon King, who had just gotten Windows compilation working for Python packages; after patching in

from setuptools import setup

to the top of setup.py for both packages, 'python setup.py bdist_egg' produced nice, functional eggs.

They are available at

http://issola.caltech.edu/~t/dist/

and can be grabbed with


easy_install -f http://issola.caltech.edu/~t/dist/ Durus

or

easy_install -f http://issola.caltech.edu/~t/dist/ CherryPy

I'm still having some issues installing things on Python-naive boxes (that is, Windows boxes with just a standard install of Python) but that will have to wait for the first release. (FWIW, the problem will probably be fixed by building eggs for the pywin32 code.)

PyPi

Now that I've been using PyPi and easy_install for a few weeks, I'd guess that about 80% of packages are directly and immediately installable via easy_install by typing 'easy_install package_name'. I've run across a bunch that aren't, though. Those include Zope3 and CherryPy; zope.testbrowser also had a problem, but I think that was an issue with the '.' in the middle of the name.

I would be very happy if it were possible to install every package on PyPi with easy_install, and it might be a worthwhile project to highlight those that can't, for whatever reason. Hmm, could become part of the Cheesecake project... a list of all the projects that don't work with easy_install, separated into lists of those that are easy_install's fault vs those that are the author's fault. (The latter would, of course, be in bright red with BLINK tags.) Perhaps we could even call the list "Sweaty Cheese" ;).

Another fantastically useful project would be to automatically download and build Windows and Mac OS X eggs for all of the PyPi projects. Hmm, there'd be some security issues, but I bet you could work something out with public keys where only packages authorized by some key authority would be automatically downloaded. Humm.

--titus

7 Dec 2005 (updated 7 Dec 2005 at 19:15 UTC) »
Oblique Strategies

I'm a big fan of Oblique Strategies; so once I found robin parmer's python implementation, I thought why not write a quick Web site for it?

Clearly I need to spend more time on work. But it was so quick and easy... ;)

Speaking of which...

It's just too easy

The discussion on Aaron Swartz's blog about rewriting reddit & web.py illustrates a few amusing points about Python. Apart from the downright absurdity of some of the discussion so far -- general Lisp snarkiness, and Aaron's assertion that all bajillion Python Web frameworks suck (except for his, which isn't available yet...) -- I think a few truths emerge.

The main truth is that it's clearly too easy to write your own Web framework in Python. It's less work to code a few hundred lines of Python than it is to understand someone else's few hundred lines of Python; it's also easier to continue thinking like you already do than it is to adapt your thinking to someone else's API. And, most important of all, a few hundred lines of Python is really all you need for a fully functional Web app framework. Thus, our massive proliferation of Web frameworks. (As Harald Massa writes: Python: the only language with more Web frameworks than keywords.)

Clearly, the only way to cut down on the number of Web frameworks is to make it much harder to write them. If Guido were really going to help resolve this Web frameworks mess, he'd make Python a much uglier and less powerful language. I can't help but feel that that's the wrong way to go ;).

Another truth that I stumbled over yesterday: it's much harder to write good, clean, maintainable, documented, tested code than it is to write a functional Web framework. Partly this is a matter of withstanding the test of time; partly it's a matter of development practices. If there's one thing I'd like to explore for my own projects, it's how to keep tests, documentation, and code all in sync. (Want to know how to do it? Come to our tutorial!)

I think the test for the future will be simple survival; this will be based on things like documentation more than on functionality. For example, Quixote, though powerful, suffers from poor documentation. CherryPy, which enforces a similar coding approach on apps, has an attractive, busy Web site. Getting started with CherryPy is simple; getting started with Quixote is not so simple. This really matters in the long run.

Picking a Web framework

The above thoughts occurred partly in the context of my own choice of frameworks for a new project.

I'm starting a new project for our Agile Development and Testing tutorial at PyCon, and I wanted to try something other than Quixote (just for novelty's sake, ya know?). My rough requirements were: must run on Windows; must not commit me to a database other than Durus, by e.g. requiring SQLObject; shouldn't be a huge framework (I'd like to distribute the entire thing in an egg); and needn't (shouldn't) be a full content-management system. So I took a look at QP, browsed around Django and TurboGears, visited web.py, and settled on CherryPy. (Actually, I started on CherryPy and then discovered that their introductory "hello, world" example didn't work with their latest stable release, 2.1.0, and gave up in frustration. Then I came back to it after striking out with the others.)

So, why CherryPy?

Django and TurboGears are too much. Django is more of a CMS, and TurboGears commits me to too many packages.

QP is largely undocumented and doesn't run on Windows. (The former is a real problem: I spent 15 minutes trying to figure out how to run QP on something other than localhost, and couldn't manage.)

web.py? Not available yet. Maybe it will be, maybe it won't be... but it's awfully hard for me to evaluate a package based on 50 lines of documentation and no code, although I applaud the attitude.

CherryPy is just what I need: lightweight, fairly obvious code on my side, nothing complex required. We'll see how I feel in a week or two ;). I've already broken it once or twice, and I think the internal code is more complex than it needs to be (based solely on ~30 minutes of browsing around to fix the breakage) but I needed to commit to something before Grig keel hauls me... and CherryPy is clearly the best-suited of the things I looked at.

And it's got a really attractive Web site...

In other news...

Commentary

is one of the coolest things to cross my radar screen in a while. (It's an AJAX-y way of adding comments to Web sites.)

I think there's a really good application somewhere in here; something combining Raph's trust metric stuff with Commentary to make a community-wide commenting/annotation system. (I put some work into such a thing earlier this year, but decided to focus on twill first.) I hope someone else writes it so I don't have to ;).

--titus


socal-piggies dinner meet

Anyone in the LA area who is interested in Malaysian food & Python conversation on Wednesday, contact me soon -- I'm confirming reservations for Kuala Lumpur in Pasadena, at 7:30pm. (So far we have 11 people!)

--titus

5 Dec 2005 (updated 5 Dec 2005 at 18:44 UTC) »
Two interesting interviews.

How BioWare Makes Game Communities Work

An Interview with Scott Bakker

The Lone Software Developer's Methodology

Entertaining and interesting rant (with explanation).

When unit testers fail...

At some point over the last week, my nose unit tests for twill started flailing. Yes, not just failing, but flailing. There was so much output that it was tough to figure out exactly what was going on. I spent a few minutes across a few days trying to figure out what had changed; I'm pretty sure it was due to a nose version upgrade, now, but I could never actually figure out what version of nose recognized all my tests *and* still worked.

Whatever the proximal cause, it turns out I designed my unit tests incorrectly. The tests are in modules (separated in individual files), with 'setup', 'test', and 'teardown' functions in each module. The latest version(s) of nose was discovering the 'setup' and 'test' functions and running (as near as I can tell) 'setup' twice, 'test' once, and 'teardown' zero times.

I'm still not sure if this is a bug; I couldn't figure out what the behavior of nose should be, because it's not explicitly documented on the nose page. (Maybe it's on the py.test page?)

Finally, I discovered 'setup_module' and 'teardown_module' and renamed setup and teardown in all my tests. That solved my problems.
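For the record, the working shape looks roughly like this (module-level fixture names as nose expects them; the 'server' here is a made-up stand-in for real test state):

```python
# Module-level fixtures under the names nose recognizes: setup_module
# runs once before any test in the file, teardown_module once after all
# of them. The 'server' is a hypothetical stand-in for real state.
state = {}

def setup_module():
    state["server"] = "started"   # e.g. launch a test HTTP server

def teardown_module():
    state.pop("server", None)     # shut it down exactly once

def test_twill_basic():
    assert state["server"] == "started"
```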

I also learned (from the nose documentation) that you could specify a test collector in setup.py when using setuptools. So now 'python setup.py test' will run all the tests.
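That hookup is essentially one line in setup.py; a sketch, with placeholder package metadata:

```python
# Sketch of a setup.py wired to nose's test collector, so that
# 'python setup.py test' finds and runs the whole suite.
# Name and version are placeholders.
from setuptools import setup

setup(
    name="mypackage",
    version="0.1",
    test_suite="nose.collector",
    tests_require=["nose"],
)
```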

The shocker for me in all of this was how uncomfortable I felt even thinking about modifying twill without the unit test framework. Even though I could execute the tests individually, I knew that I wouldn't run them nearly as frequently as I do when 'nose' is working. Automated tests are a really important security blanket now...

What users want, and what they'll get.

So, it appears that an increasingly common use case for twill is Web browsing.

When I started developing twill, I intended to develop a replacement for PBP. PBP was aimed at Web testing, and that's what I needed myself. Roughly, this meant:

  • a simple scripting language, with an interactive shell;
  • Functional cookies/formfilling/etc, so that interacting with a Web site actually worked;
  • a fair amount of assert stuff, like "code" (to make assertions about return codes) and "find" (to make assertions about presence of specific text/regexps).

Then I made a few design choices: to use python and python's Cmd for the interactive shell, and to use a metaclass to automagically import all twill.commands functions into the interactive shell. This meant that all of the functions accessible from twill were also directly accessible via Python, and the twill language docs functioned equally well as Python language docs.
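The trick can be sketched even without a metaclass, like so (this is an illustration, not twill's actual code; 'go' and 'find' here are dummy stand-ins):

```python
# Illustration (not twill's real code): exposing plain functions as
# interactive shell commands by generating cmd.Cmd 'do_*' methods.
import cmd

def go(url):
    """go <url> -- visit the given URL."""
    print("visiting %s" % url)

def find(pattern):
    """find <pattern> -- assert the page contains pattern."""
    print("checking for %s" % pattern)

def make_do(fn):
    # wrap a plain function as a Cmd handler; share the docstring
    # so the shell's 'help' text matches the Python docs
    def do_cmd(self, arg):
        fn(*arg.split())
    do_cmd.__doc__ = fn.__doc__
    return do_cmd

class Shell(cmd.Cmd):
    prompt = ">> "

for fn in (go, find):
    setattr(Shell, "do_" + fn.__name__, make_do(fn))
```

Because the same functions back both the shell and the Python API, one set of docs serves both, which is the property described above.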

Thus, twill could be used from Python code to do most Web browsing tasks that didn't involve JavaScript.

I was happy with this result, but it was largely unintended. I mostly use twill via the scripting language.

However, the early press on twill was from people like Grig and Michele, who talked glowingly about the ability to use twill from Python. This has led to people wanting more functionality: especially, sensible return values from twill commands. This, in turn, has led to a bit of caviling on my part about this direction, because I haven't really thought it through.

Anyway, the upshot is that I have to rethink my priorities for twill a bit. I was going to focus on recording functionality and extension management for the next release, but it seems like I should also work on simplifying and documenting the browser interaction code. Given the one-to-one mapping between twill.commands and the scripting language, I don't want to add things like return values and Python-specific documentation to the commands; perhaps I can satisfy people with a browser-object interface...

The big surprise for me -- and it really shouldn't have been a surprise -- is that people really seem to want to do Web browsing via a trivially simple interface: go, follow, submit, and get_page contain 90% of the desired functionality. mechanize and mechanoid are serious overkill for this level of Web browsing, and the fact that twill puts a high priority on "just working" (albeit at the expense of customizability) probably helps contribute to users' interest.
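A browser-object interface covering that 90% might look like the sketch below. Everything here is hypothetical -- twill doesn't ship this class, and the canned fetcher stands in for real HTTP machinery -- but it illustrates the go/follow/submit/get_page surface with sensible return values:

```python
class Browser:
    """Hypothetical minimal browsing object; fetch is a callable url -> page
    text, standing in for the real urllib2/mechanize plumbing."""
    def __init__(self, fetch):
        self._fetch = fetch
        self._page = ''
        self.url = None

    def go(self, url):
        self.url = url
        self._page = self._fetch(url)
        return self._page

    def follow(self, link_url):
        # real code would resolve link text against the current page
        return self.go(link_url)

    def submit(self, action_url):
        # real code would serialize the current form's fields
        return self.go(action_url)

    def get_page(self):
        return self._page

# toy fetcher: a dict of canned pages instead of the network
pages = {'/': 'index', '/about': 'about us'}
b = Browser(lambda url: pages[url])
b.go('/')
b.follow('/about')
```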

The simplest route for me to go is probably to work on two applications I've been planning: twill-crawler, and mailman-api. (Presumably the names give away the function ;).) Then I'll have some idea of what people need for Web browsing; right now I'm feeling a bit underplanned, so to speak.

--titus

Review of "Endless Forms Most Beautiful"

A review of "Endless Forms Most Beautiful" by Sean Carroll. (This is a book on evolutionary developmental biology, which is one of the things my lab works on.)

In-process WSGI testing

Reached a stable point with a little side project: wsgi_intercept. This is a broken-out version of my in-process WSGI testing stuff that works for all of the Python Web testing tools I could find. Specifically, I monkey-patched webtest and webunit; provided a urllib2-compatible handler; and subclassed the mechanoid, mechanize, and zope.testbrowser Browser classes.
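The core idea is that you never open a socket: you hand a fabricated WSGI environ straight to the app and capture what it gives back. Here's a minimal sketch of that (the app and helper are stand-ins of my own, not wsgi_intercept's API -- wsgi_intercept's job is to hook the HTTP libraries so that unmodified test tools do this transparently):

```python
def simple_app(environ, start_response):
    """Stand-in WSGI app; any conformant app works the same way."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello from ' + environ['PATH_INFO'].encode('ascii')]

def call_in_process(app, path='/'):
    """Drive a WSGI app without a socket: build an environ by hand, capture
    the status/headers passed to start_response, join the body iterable."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = dict(headers)

    environ = {
        'REQUEST_METHOD': 'GET',
        'PATH_INFO': path,
        'SERVER_NAME': 'localhost',
        'SERVER_PORT': '80',
        'wsgi.url_scheme': 'http',
    }
    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body

status, headers, body = call_in_process(simple_app, '/test')
```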

I ran into some minor but nonetheless annoying problems along the way. For example, Zope (which is needed for parts of zope.testbrowser) cannot be installed with "easy_install zope": the PyPI page doesn't link to a download page, and even when downloaded, Zope names its 'setup.py' 'install.py' instead. This confuses easy_install.

I also couldn't figure out contact info for either Zope or CherryPy (the owners of webtest). I didn't look terribly hard, but the PyPI contact e-mail for Zope is a subscribers-only list, and <team at cherrypy.org> (which is the e-mail address at the top of webtest.py) doesn't exist. CherryPy folks -- someone, please contact me if you want the patches to webtest (or just grab them from the wsgi_intercept code yourself, of course!)

And easy_install seems to be confused by packages with '.' in their names; zope.testbrowser doesn't install with just an 'easy_install zope.testbrowser'. (Spoiled, aren't I, to expect it all to work so easily!)

But these are only minor gripes. On the whole, the packages I downloaded and modified had nice, clean source code. I think there's something about people who write testing tools that leads them to clean up their code ;).

A few quick off-the-cuff opinions, while I'm at it:

  • zope.testbrowser is indeed a simple, clean interface to mechanize. (Python 2.4 only, however.)

  • I like the way mechanoid (a fork of mechanize) has broken out the class structure of mechanize into files.

  • If I needed a minimal Web testing tool, I'd use webtest.

  • funkload (based on webunit) looks pretty neat. (In the Python world, it's probably the major competitor to twill; its authors are focusing a bit more on load-testing, though.)

  • people should use ClientCookie rather than urllib2, I think. It does more, and it's written by the same person ;).

  • One of mechanize's big problems is its retention of backwards compatibility. John seems intent on keeping Python 2.2 and up working in mechanize; I think that complicates the code more than it should.

Anyhoo, g'nite.

--titus
