Older blog entries for titus (starting at number 181)

6 Jun 2006 (updated 6 Jun 2006 at 16:14 UTC)
Robert Jordan is ill

For some reason I hadn't heard this yet -- Robert Jordan has amyloidosis!

pinocchio oops

Max discovered pinocchio, my package of nose extensions, and then pointed out that I hadn't put any version info or contact info on my page for pinocchio.

Oops. Fixed. Sorry 'bout that!

Code coverage

I've been using Ned Batchelder's code coverage module for a while now, and it's been great. We used a slightly hacked version for the agile testing tutorial, and now I need to do even more hacking on it.

I decided that rather than serially refactoring the code I'd swipe a few of the clever bits and do a complete rewrite. This effectively makes it a complete fork. I decided upon this tack because in my previous hacking I spent a lot of time struggling with the basic design of the module, and while the clever bits are pretty isolated and portable, the rest -- path munging, option handling, etc. -- is what I want to change in the first place.

Of course, immediately after deciding to steal some of the code, I ended up rewriting most of it. Sigh.

One of the main clever bits in coverage.py was the AST traversal code that decided which statements were potentially executable; this section used the compiler module. I'd heard somewhere that this module was deprecated, or unreliable, so I looked for some alternatives.

I put in some work on it last night, and arrived at the following function to extract interesting lines of code using tokenize:

import sets
import token
import tokenize

class _TokeneaterObj:
    def __init__(self):
        self.lines = sets.Set()      # rows on which a statement starts
        self.start_line = None       # first row of the statement in progress
        self.ignore = (tokenize.COMMENT, token.NEWLINE, token.INDENT,
                       token.DEDENT, token.ENDMARKER, tokenize.NL,
                       token.STRING)

    def tokeneater(self, *a):
        token_type, s, (srow, scol), (erow, ecol), logical_line = a

        if token_type == token.NEWLINE:
            # end of a logical line: record where the statement began
            if self.start_line is not None:
                self.lines.add(self.start_line)
                self.start_line = None

        elif token_type not in self.ignore:
            if self.start_line is None:
                self.start_line = srow

def get_lines(fp):
    t = _TokeneaterObj()
    tokenize.tokenize(fp.readline, t.tokeneater)
    return t.lines

I don't know if this will be a good choice, long term. I have to write some tests... Any better ideas? (Let me know.)
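For anyone reading this on Python 3, where the sets module and the callback form of tokenize.tokenize no longer exist, here is a rough sketch of the same idea. It assumes tokenize.generate_tokens and the built-in set type; the sample source string is made up purely for illustration, and this is not the code that ended up in the rewrite.

```python
import io
import token
import tokenize

# Token types that never start an executable statement on their own.
IGNORED = frozenset([tokenize.COMMENT, token.NEWLINE, token.INDENT,
                     token.DEDENT, token.ENDMARKER, tokenize.NL,
                     token.STRING])

def get_lines(fp):
    """Return the set of line numbers on which a logical statement starts."""
    lines = set()
    start_line = None
    for tok in tokenize.generate_tokens(fp.readline):
        if tok.type == token.NEWLINE:
            # End of a logical line: record where the statement began.
            if start_line is not None:
                lines.add(start_line)
                start_line = None
        elif tok.type not in IGNORED and start_line is None:
            start_line = tok.start[0]
    return lines

# Illustrative input: a comment, a blank line, and a two-line statement.
source = "x = 1\n# a comment\n\ny = (x +\n     2)\n"
print(sorted(get_lines(io.StringIO(source))))
```

Because token.STRING is in the ignore list, a bare docstring line is skipped, which matches the behaviour of the original class.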

My goals in this rewrite are a better interface for large projects & simplified filename handling. Switching to using sets and tokenize may be simple side-benefits, or perhaps costly diversions ;).

And then, the eternal dilemma -- what should I call it? Grig, my inane bozo of a friend, suggested 'figleaf'. I like it. (Runners up were 'blanket' (Diane) and 'wet blanket' (me).)

--titus

twill 0.8.5 released

I finally released a new version of twill, the Web scripting & testing language. This one is pretty solid, as far as I can tell; relatively few bug reports over a 2 month period. Between now and the (fairly distant) 0.9 beta release, I expect to make a number of changes to the underlying implementation, but I think the API and command-line usage are pretty stable.

In other twill news, I've moved the twill Web site to twill.idyll.org, and created a Trac site at twill.idyll.org/trac/. Check out the twill 0.9 milestone! The Trac site is intended for Wiki info, tickets, and milestones related to not only twill but also scotch and wsgi_intercept. It's not linked into the source code for the three projects because they're all in separate darcs repositories.

Let them eat Kwalitee!

I'm a fan of Grig's Cheesecake project, not least because we're both SoCal Piggies. The project is one that aims to provide a single score representing how well a Python project is packaged. He's gotten some interestingly negative comments about the project as part of the Google SoC wrangling, and I feel obliged to comment on them.

The two comments that I disagree with the most are these: first, that this will lead to an era of pseudo-fascism on PyPI, with people 'endlessly tweaking' their Python package to get a better Cheesecake score; and second, that unit testing is not applicable to a fairly large subset of the projects out there.

In response to the first comment: I just don't see it happening. I do expect many projects to work a bit to provide a README, a working setup.py, and the various other files. Perhaps they'll even toss in some unit tests. That's all to the good; right now I'm not aware of a single place that documents what should go in a Python package, and if you install a lot of Python software you probably believe that there should be. (Grig?) Nonetheless, just the act of attaching a score to a package isn't going to make people devote an excessive amount of time to raising that score. Perhaps if Grig were giving out ice cream to the top 10 percent of the scores -- but he's not.

In response to the second comment, there's a widespread misunderstanding about unit tests that seems to crop up when people first implement them. Unit tests are not about anything external to your code. They're all about making sure that your code works, and that your code stays working. As soon as you start talking about unit testing graphics, or video, or the Web, or your database API, you're actually shifting to discuss what are known as "functional" or "integration" tests. These can run in a unit test framework, but they are not "unit tests". (If you think I'm trying to redefine "unit test", go read Kent Beck's original writings on this stuff.) So in practice all code is unit testable, and I'd be willing to bet that over 95% of the packages that exist could have useful unit tests.
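To make that distinction concrete, here is a small sketch of what a unit test in a graphics-ish package might look like. The scale_to_fit function and its test case are hypothetical names invented for this example; the point is only that the test exercises pure logic and never touches an image file, a display, or anything else external.

```python
import unittest

def scale_to_fit(width, height, max_side):
    """Pure logic: shrink (width, height) so the longer side is max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    return width * max_side // longest, height * max_side // longest

class ScaleToFitTest(unittest.TestCase):
    def test_small_image_is_untouched(self):
        self.assertEqual(scale_to_fit(80, 60, 100), (80, 60))

    def test_large_image_keeps_aspect_ratio(self):
        self.assertEqual(scale_to_fit(400, 200, 100), (100, 50))
```

Saved in a module, this runs with python -m unittest. Checking that a real image actually gets resized and rendered correctly would be the functional test.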

Anyway, that's my 2 cents -- exactly what my opinion is worth ;).

Miscellany

richdawe mentions an SMTP shell (e.g. twill with an SMTP extension ;), and then there's TestableEmailer. Cool stuff.

cinamod, sounds like you're having too much fun. Let me know if you show up in LA sometime and want to check out the pickup scene here.

--titus

nose

I've spent much of the last week arguing with nose, Jason Pellerin's unit testing framework for Python.

The fruits of that labor?

First, an extended introductory article and associated demo code, introducing, demonstrating, and discussing many nose features. (It's still a bit of a rough draft, folk. Send comments.)

Second, the pinocchio project. Yep, nose extensions. (Aren't I cute?) This adds 'stopwatch' and 'decorator' extensions to nose.

Putting wsgiref in the stdlib

Ever wonder how untested modules with non-standard interfaces and little documentation get into the stdlib?

Wonder no more.

Pursuant to Guido's proposal, I took a look at wsgiref. I found some bugs, and asked a few questions.

Ian Bicking pointed out another issue.

No response from PJE.

Today I noticed that the bug I pointed out had been fixed. Neither of the e-mails was answered.

I posted publicly, I sent a private e-mail. What more should I do? I only got irritated when I saw the checkin fixing the bug without any acknowledgement of the other issues raised. <snark>Well, I guess once you've got Guido's OK, you don't need to listen to anyone else, right?</snark>

I'd be less irritated if the barrier to fixing problems once modules are in the stdlib wasn't so high. wsgiref will become effectively immutable -- overcomplicated constructor and all -- once it's integrated. That's presumably why GvR asked for comments, yeh?

Oh, well. Phillip -- if you actually want any contributions from me for wsgiref, you're going to need to answer my questions. I don't fancy writing documentation for an interface that could change, and I won't exactly enjoy bug testing your code in the future if I'm going to get the silent treatment for having the temerity to ask questions. (I'm not bucking for an apology here -- there's nothing to apologize for. Just be a member of the community, please.)

JohnCompanies rocks

My hosting provider, JohnCompanies, has taken an explicit stand on warrants, spying, etc. They're also opening up offshore rsync.net machines, although no offshore hosting is going to happen yet.

A bunch o' miscellaneous links

Ruby gains a mechanize implementation.

An artist on acid.

No new comments on scotch, but I did locate and write down a bunch of other python-relevant HTTP recorders & proxies. Pound is particularly interesting.

apenwarr has an interesting parable on testing.

Pretty pictures of stuff relevant to my dad's work.

Busking and educating the police about the law. Priceless.

Domain Specific languages rock. Even in Ruby ;) ;).

autotest. Apart from the amusing quote about "not needing to open a Web browser to test" -- well, first of all, neither do I, and second of all, how much do you want to bet your site doesn't actually work the way you think it does? -- this autotest phenom sounds interesting. (py.test supports similar behavior.) Might be time to hack it into nose...

First on the list of things I didn't think would ever work -- crowdsourcing R&D!? Way cool.

--titus

19 May 2006 (updated 19 May 2006 at 17:43 UTC)
haruspex, yeah, I know that the OS X support for network drives is better -- and that may actually be the technical reason behind not supporting FTP that well. Still, it's freakin' annoying when you're trying to set up an anonymous submissions directory; FTP works great on Windows, but is bollocks on OS X for unskilled users.

scotch, a WSGI-based HTTP recording proxy

I finally wrote up some preliminary docs on scotch, a project I first wrote about yesterday. scotch is my solution for recording twill scripts, as well as tracking AJAX Web calls and doing general Web site regression testing. The scotch examples page is probably the place to start, although the front page is more conversational. There are also some simple code recipes that demonstrate the potential. (You can grab scotch at the usual place.)

I had a nice e-mail conversation with Ben Bangert about the possibility of using scotch for more clever twill script making. It's always nice to have people grok the tool you just wrote ;).

Parenthetically, it'd be interesting to see if psiphon functionality could be broken out into WSGI middleware playing on top of scotch.proxy. (Here's an article about psiphon.)

all about twisted

Glyph Lefkowitz posted a link to this interesting paper on twisted. Old paper, but still good, I think.

--titus

A week or so of links

Debugging: Essential Technological Literacy, via CoolTools; a good-looking book that most programmers should read.

Driving Rails from the command line. I'd write something to do this with WSGI, but I think I already have.

So Close, Yet So Far -- (not) publishing well online.

The Top 10 OSS Games You've Never Played

A process diagram for arguing about Intelligent Design.

Mac OS X and FTP

So, let me get this straight -- you can easily mount a WebDAV folder on your OS X desktop, and you can just as easily mount an FTP site on your desktop. The difference, of course, is that the FTP site can only be mounted read-only! Why? I can only guess... it doesn't seem like it'd be that difficult to do technically. Hell, it works in Windows...

In other news, a lot of OS X FTP GUIs suck. FTP Thingy didn't do a very good job of dealing with anonymous uploads... I recommend Cyberduck if you need a drag&drop FTP solution.

(Fugu is the way to go if you need to give an Apple user an SCP/SFTP interface.)

On Web services...

From an excellent ACMQueue interview with Werner Vogels, Amazon's CTO:

Do we see that customers who develop applications using AWS care about REST or SOAP? Absolutely not! A small group of REST evangelists continue to use the Amazon Web Services numbers to drive that distinction, but we find that developers really just want to build their applications using the easiest toolkit they can find. They are not interested in what goes on the wire or how request URLs get constructed; they just want to build their applications.

...and AJAX

An interesting, if a bit overly self-congratulatory article on AJAX-iness by Joel Spolsky.

Musings on the nature of OSS "marketing"

So I've been working on a fairly kick-ass piece of software I'm calling scotch, which contains a Web recorder/player (scotch.recorder) implemented as WSGI middleware and a Web proxy (scotch.proxy) implemented as a WSGI app. This lets you do obviously cool things like set up a recording proxy server, or do regression testing by recording Web sessions and then playing them back and comparing "now" responses with the "then" responses, or grok precisely how asynchronous HTTP transactions (aka AJAX) are being used by a site. (I've already used it for all three.) I've got a primitive twill translator written for it, and I'm thinking about writing a translator for Selenium, too. The architecture I've chosen -- decoupled WSGI objects -- is very nice because it allows me to chain things arbitrarily; I want to eventually add a cookie cleaner, a reverse proxy system, and an anonymizer, all of which could be swapped in and out as WSGI middleware. Heck, you could even use this as the basis for an "offline" browsing cache, or write a Web frontend that lets you browse through the recording as it's happening, or ... well, let's just say that a lot of tomfoolery could happen. It isn't even Python limited, because of the proxy app -- that can interact with anything in any language, as long as it speaketh HTTP.
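The regression-testing idea, replaying recorded requests and comparing "now" against "then", can be sketched in a few lines of plain WSGI. This is not scotch's actual API: call_app, find_regressions, and the two toy apps are hypothetical names made up for the illustration, and it assumes only the standard library's wsgiref.

```python
from wsgiref.util import setup_testing_defaults

def call_app(app, environ):
    """Run one WSGI request and return (status, body)."""
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["status"] = status

    result = app(dict(environ), start_response)
    try:
        body = b"".join(result)
    finally:
        if hasattr(result, "close"):
            result.close()
    return seen["status"], body

def find_regressions(app, recording):
    """Replay each recorded request; return paths whose response changed."""
    changed = []
    for environ, then in recording:
        now = call_app(app, environ)
        if now != then:
            changed.append(environ["PATH_INFO"])
    return changed

def old_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [("page " + environ["PATH_INFO"]).encode("ascii")]

def new_app(environ, start_response):
    # Same as old_app except for one deliberately changed page.
    if environ["PATH_INFO"] == "/about":
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"gone"]
    return old_app(environ, start_response)

def make_environ(path):
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    return environ

# "Then": record responses from the old app. "Now": replay against the new one.
requests = [make_environ("/"), make_environ("/about")]
recording = [(env, call_app(old_app, env)) for env in requests]
print(find_regressions(new_app, recording))
```

A real recording would come from live traffic through a proxy rather than from hand-built environ dicts, and would probably compare headers as well as status and body.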

My current source of cognitive dissonance here is that I just don't know how to start writing it up, or where to target my development. (I also don't seem to have any time these days, but that's not a problem that can be solved by anyone else. ;) How should I best apply my limited time?

Here are a couple of possible strategies:

  • post what I have, and blog about all the neat, nifty, or just plain cool things I'm doing with it. Hope people care.

  • develop out a few specific applications -- regression testing and twill translator are two obvious ones -- and write articles about using scotch to solve real problems. Hope people care.

  • figure out what other people want to use it for, and let them proselytize it while I introvert on the architecture and coding. Assume people care.

  • start a company, and ... naaaaaah, never mind ;)

All of this boils down to thinking about "marketing", albeit in the OSS world rather than in the world of $$. And in that world I have reached a simple conclusion: marketing doesn't matter, at least at the scale I can do. My two biggest OSS "successes" so far are Cartwheel (used by 100s of biologists) and twill. Neither of these projects was really "marketed" by me; they were immediately useful to other people, who picked 'em up and ran with them. In a badly mixed metaphor, I'm just trying to stay on the horse -- and I really can't steer it very well.

Back to scotch. I think my best bet is to do two things up front: document as well as possible, and release a solid if not well-rounded implementation. If people find it useful, they'll use it; if not, not. It's the "market" at work...

--titus

Martin Fowler on Ruby

Evaluating Ruby says nothing hugely new, but does give the whole RoR combo a thumbs up from his ThoughtWorks-y perspective.

Malone

Yesterday, I wrote about tracking bugs across bugzilla. Stephen Thorne pointed me towards Malone, a Canonical system for doing just that.

fight!

fight fightfight! ;)

...and someone else weighs in.

In other advogato news, bi calls advogato "elitist". Uhh... yeah, sure, if by "elitist" you mean "we use an explicitly simplistic ranking system to keep spam out". We have incredibly low levels of spam on advogato, and I have to say I don't find the ranking system all that elitist. More exclusionary, if anything.

--titus

For future reference

The slamd distributed load generation engine.

mocks considered harmful?

robertc asks if mock objects lead to interface skew.

I could probably go off on a long polemic here, but let's see if I can keep it short... Basically, if all you're doing is testing with mocks, then the answer is probably that you're going to screw yourself up.

But let's generalize the question, to see if perhaps there's a deeper truth:

Dear Mr. Touchstone,

I'm using technique X for testing, and I'm only doing unit testing (...or functional testing, or acceptance testing, or integration testing, or regression testing, or UI testing, or smoke testing, or fizzlebar testing...) I'm worried that this is going to cause problems in the future. What should I do?

-- Anxious in Albuquerque

...and my answer:

Dear Anxious,

I think you should consider testing at additional levels. Generally unit testing is not sufficient, because it purposely tests only small, largely independent units of the code. Functional and smoke tests (as in "push button -- does smoke appear?") should be de rigueur for any project; IMO they're as or more important than unit tests.

So that's my answer: there are bigger problems lying in wait! Mock objects are a specific solution to a fairly general problem with unit tests: you can mimic external interfaces that need to interact with your code in fairly stereotyped ways. That stereotyping is a specific weakness of the approach, as well as being the strength that led you there in the first place.

More generally, any time you find yourself relying completely on your unit tests, you're going to be in trouble. Unit tests are a great programmer tool, and they are very useful, but they are not everything.
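Here is a tiny illustration of the interface skew that robertc is worried about. It assumes Python's standard unittest.mock module; MailGateway and notify are hypothetical names invented for the example. The real class has grown a third argument, the caller hasn't caught up, and a permissive mock never notices.

```python
from unittest import mock

class MailGateway:
    """Hypothetical real collaborator; send() recently gained `subject`."""
    def send(self, to, subject, body):
        return "sent"

def notify(gateway, user):
    # Caller still written against the old two-argument interface.
    return gateway.send(user, "hello")

# A plain mock accepts any call, so a mock-only test passes despite the skew.
loose = mock.Mock()
notify(loose, "alice")
loose.send.assert_called_once_with("alice", "hello")

# A mock built from the real class checks the signature and fails loudly.
strict = mock.create_autospec(MailGateway, instance=True)
try:
    notify(strict, "alice")
    skew_detected = False
except TypeError:
    skew_detected = True

print(skew_detected)
```

Spec'd mocks narrow the gap, but only a test that runs against the real collaborator shows whether the pieces actually fit together.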

Someone interviewed me

Pythonthreads just posted a lengthy interview with me. It's been a month or so since I answered the questions, and I'm still fairly happy with my answers!

Bugs

I read an interesting post a few days ago that discussed the dismal situation with bugs in Linux packages: basically, there's been a proliferation of bugzillas, and you can often no longer have the conversation necessary to fix a bug in any one of the bugzilla sites.

It occurs to me that one possible solution is to build a test case that demonstrates the bug; that's sort of the maximally portable bit of problem documentation necessary, no? Then just post that to the most upstream bugzilla...

--titus

GEICO

I went to GEICO's Web site today to pay our car insurance, and apparently hit a bad URL (retrieved from old e-mail). The error page came up with a Tolkien quote: "Not all who wander are lost." How cool is that?

Mango Sauce

The latest in Google's AdSense misadventures: Mango Sauce banned from adsense. Going by this page, what we have here is an arbitrary decision being made by a lower-level functionary at Google. Read the page & send in your protest...

The mango sauce page itself is pretty interesting. Very racy, but quite entertaining.

Webstemmer

This looks cool. Someone should integrate it with twill.

Deleting spam sent to moderated SourceForge mailing lists

Problem: SourceForge runs an old version of mailman that doesn't have the nifty new "discard all" button.

Solution: a twill script to automate the process of logging in & clicking "discard" for each message.

(I've mentioned this before, I think, but it's gotten easier with some of the latest additions to twill.)

so:

...
The PyWX-discuss@lists.sourceforge.net mailing list has 863 request(s)
waiting for your consideration at:

https://lists.sourceforge.net/lists/admindb/pywx-discuss

Please attend to this at your earliest convenience. This notice of pending requests, if any, will be sent out daily. ...

thus:

% ./twill-sh examples/discard-sf-mailman-msgs -- pywx-discuss
>> EXECUTING FILE examples/discard-sf-mailman-msgs
==> at https://lists.sourceforge.net/lists/admindb/pywx-discuss
Enter list password:
Note: submit is using submit button: name="request_login", value="Let me in..."
-- matches 220
set 220 values total
Note: submit is using submit button: name="submit", value="Submit All Data"
--
1 of 1 files SUCCEEDED.

(wipes hands together in a "done" gesture)

I don't grok HTTP very good

I spent a lot of time this weekend hacking together two WSGI middleware apps: first, a recorder/playback system, which lets me both record and play back all traffic into and out of a given WSGI application; and second, a WSGI-based transparent proxy app. The ultimate goal, of course, is to enable recording of all Web traffic much like TCPWatch, but in a nice modular WSGI way.

So, the recorder works: I can now place a middleware wrapper around any app, save & pickle everything that goes into and out of the app, and then play it back. That's pretty neat.
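For the curious, the general shape of such a recording wrapper looks something like the sketch below. This is not my actual (cruddy) code: the Recorder class and hello_app are hypothetical names, the response body is buffered whole for simplicity, and only the string-valued parts of the environ are kept so that the record can be pickled.

```python
import pickle
from wsgiref.util import setup_testing_defaults

class Recorder:
    """Hypothetical recording middleware wrapping any WSGI app."""

    def __init__(self, app):
        self.app = app
        self.records = []

    def __call__(self, environ, start_response):
        captured = {}

        def recording_start_response(status, headers, exc_info=None):
            # Note what the app sends, then pass it along unchanged.
            captured["status"] = status
            captured["headers"] = list(headers)
            return start_response(status, headers, exc_info)

        result = self.app(environ, recording_start_response)
        try:
            body = b"".join(result)  # simplification: buffer the whole body
        finally:
            if hasattr(result, "close"):
                result.close()
        # Keep only string values; file-like objects cannot be pickled.
        request = {k: v for k, v in environ.items() if isinstance(v, str)}
        self.records.append(
            (request, captured["status"], captured["headers"], body))
        return [body]

    def dumps(self):
        return pickle.dumps(self.records)

def hello_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello ", b"world"]

environ = {}
setup_testing_defaults(environ)
recorder = Recorder(hello_app)
body = b"".join(recorder(environ, lambda status, headers, exc_info=None: None))
request, status, headers, recorded = pickle.loads(recorder.dumps())[0]
print(status, recorded)
```

Playback is then a matter of unpickling the records and feeding each saved request back into an app.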

The proxy doesn't work. Or, rather, it sort of works: I can browse Google Search and go to many (but not all) Web sites. There is, however, a bug somewhere: none of my Trac sites work, and AJAX stuff seems broken, too. I'm guessing it's in my broken dealing with HTTP 1.1 features; that, or WSGI and proxying just don't go together. (Also plausible, 'cause I'm breaking a number of PEP rules here.)

Depressing. But I'm sure I'm almost there.

(People interested in wading through some really cruddy code can e-mail me to get a copy. But bring your knee-high waders, 'cause it's bad.)

--titus

Two new articles

I've put up two new articles -- What is WSGI? An Introduction and Testing WSGI applications with twill -- a (very) brief intro. Both were written for today's SoCal PIGgies meeting.

In the article on testing WSGI applications with twill, I have a doctest example that pretty stubbornly won't work. I hereby publicly appeal to Grig to get it to work ;).

Legal issues with "releasing software into the wild"

A useful reference.

And the moral of the story is...

In this very interesting piece on a company moving to Linux, the lesson was: test. The company had the confidence to switch deployment platforms -- and not just once but twice -- largely because they had a complete testing setup. Or at least that's what I took away from it ;).

--titus

Various miscellany...

MIT swipes Caltech cannon; then hits below the belt!

http://web.mit.edu/ec/www/cannoncoeds/:

"""
"I say take 'em back with the cannon.  In fact, forget the cannon,"
remarked senior Jeff Phillips.  Those who have been here for a shorter
period of time reacted differently.  Those quotes are not printable.
"""

Heh.

Fluxus

My sister pointed out this "fluxus" thing. Check it out [pdf]. Not sure what to make of it. To quote,

A piano is lifted by means of a windlass to the height of 2 meters and then
dropped.  This is repeated until the piano or the floor is destroyed.

I think it's a technique that can be used to shatter preconceptions about art and the role of the audience.

Whooooooooooaaaaaaaaaaaaa.

agile-testing

I've been enjoying the agile testing mailing list; here's an interesting post where Michael Bolton (the famous tester, not the famous singer) discusses the so-called 'flat cost curve of change' due to agile methods.

avriettea = ennui+belligerent

avriettea, if I never entered the business because I knew in advance that I wouldn't be happy, does that mean that I'm still a defeatist? (That's why I'm in academics, frankly; it keeps me happier, long term, than programming. Bear in mind that science is as or more difficult than most computer jobs, we're just not paid so well.)

Anyway, it's an interesting idea, but I don't understand why sticking with something that you hate simply because leaving would be "weak" is a good path. (Not having anywhere else to go -- that's a different story.)

And yes, it worries me that you write posts like this and talk about playing with large guns ;).

--titus
