Testing Web sites
Revisited Cory Dodt's Python Browser Poseur (PBP) today. This is one of those projects that frequently pops into my head as something worth investigating, but I've never actually looked at it seriously. (And the last time I looked, there were still some fairly obvious broken bits that prevented me from making use of it -- but there's a new version...)
PBP is the best (simplest + easiest) way I've seen to test dynamic Web sites. It's based on mechanize, a Python version of WWW::Mechanize, and it provides a simple scripting ability to automate Web site browsing. Even someone without extensive Python experience can write scripts for it, which is an advantage for groups that aren't all programmers. I haven't tried extending it but I doubt it's that difficult; the package code looks clean & is relatively short.
PBP is relatively simple, at least on the surface: here's the example script from their site.
go http://mailinator.com code 200 find "property of Outsc.*me" showform formvalue search email pbp.berlios.be submit search code 200 find "NO MESSAGES"
When executed by pbpscript, this script goes to mailinator.com, searches for the regexp "Outsc.*me" (which matches "...is property of Outscheme, Inc"), and then checks e-mail for the pbp.berlios.be@mailinator.com e-mail address. If there are any messages, the script fails. (Try changing '.be' to '.de' if you want to see -- sorry, I screwed up the example on the Web page by stupidly sending e-mail to pbp.berlios.de@mailinator.com.)
This is cool.
I'm puzzled that neither mechanize nor PBP are better known (as in, I haven't seen it mentioned anywhere but in c.l.p.announce on the occasion of a new release). I don't monitor freshmeat, which is another place it's been posted. Apart from that a google search doesn't turn up much mention. Am I missing a wealth of similar software that is better? What do people use to test Web sites, anyway? A currently-defunct list archive has a reference to HttpUnit, which is a nice-looking Java framework. Unfortunately I doubt it's as Python-extensible as PBP ;). John Lee of mechanize also points out webunit, by Richard Jones (also author of Roundup). I may have to take a look at that. Anything else?
In the interests of exercising PBP a bit, I wrote a simple PBP script (note: transient link) to run through my WSGI adapter interface for the Quixote demo. You can try it out if you want; the site just runs quixote.demo through a CGI-->WSGI -->QWIP bridge. And yes, it's veeeeeeery slow.
I ran into only one real problem with PBP: HTML encoded form values. In the Quixote widget demo, there's a select widget that takes pizza sizes with inch units, e.g. 'Medium (10")'. The mechanize ClientForm is returning this in HTML-encoded form, 'Medium (10")', and PBP demands that it be set to this value. However, Quixote barfs on this because it is expecting 'Medium (10")' -- which is in fact what Quixote sees from browsers. There may be some invisible layers of encoding/decoding going on; Quixote uses cgi.FieldStorage which presumably decodes a properly-encoded string from the browser. I think the appropriate thing to do here is to change mechanize's behavior, but I will ask Cory what he thinks first; I haven't dealt with this aspect of HTML forms before, having been spoiled by nice libraries ;).
Next I'll have to try extending PBP from Python & vice versa. Anon...
--titus
"Consider the situation of two trauma surgeons arriving at an accident scene. The patient is bleeding profusely. If surgeons were like programmers, they'd leave the patient to bleed out in order to have a really satisfying argument over the merits of two different kinds of tourniquet." -- Philip Greenspun.
