Older blog entries for titus (starting at number 113)


Need to transfer between multiple version control systems? Try Lele Gaifax's tailor. With tailor, you can do things like convert CVS repositories into Subversion or darcs repositories. It's particularly useful for taking source from a non-distributed ("archaic" ;) source control system that you don't control and putting it under darcs. Imagine: your own version of Python, with your own patches kept under darcs control, constantly kept updated with the very latest Python patches. You could swap patches with other people, have different Python distros for different tasks, and lots of other things ... what a world!

simplicidade has good things to say about tailor, too.

epydoc and reStructured Text

While setting up my biolib project, I spent some time trying out code documentation generators. I discovered that epydoc groks reStructuredText, and was sold on it, especially once I figured out that I could link directly to the non-frames version. (The frames version clutters up the screen somethin' fearful...)

Today I added epydoc-generated documentation to session2, our replacement session management system for Quixote 2.x. I'm very happy with epydoc.

I might have used effbot's PythonDoc, which produces attractive and simple-looking Web pages, but it requires more specialized markup. Bleah. O well.

And pydoc and HappyDoc just don't measure up in terms of prettiness, sad to say; I used to use HappyDoc for some projects, but epydoc is muuuch nicer looking.

Quixote fixes

The current widgets.txt hasn't been updated since v1.x, so I did some work to fix the obvious errors. It's a pretty long, nasty document; it might be worth automating aspects of it, or just producing epydoc documentation for it...

Also found & fixed (I think) a bug in PTL's cimport.c that would allow imports of non-existent stuff to work, i.e.

from existing.somewhere import bogus

would set 'bogus' to None rather than raising an ImportError.
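A minimal sketch of the kind of regression check this calls for. It exercises whatever import machinery is active in the running interpreter, so it only covers PTL's cimport if that hook happens to be installed (an assumption; nothing here touches cimport.c directly). The helper name `bogus_import_raises` is made up for illustration.

```python
def bogus_import_raises():
    """Return True if importing a missing name from a real module raises ImportError."""
    try:
        # os.path exists, but it has no attribute called 'bogus'.
        from os.path import bogus  # noqa: F401
    except ImportError:
        return True
    # Reaching this point means the import silently "worked", which is the bug.
    return False
```

With a correct import implementation the function returns True; the buggy behavior described above would make it return False.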

Release: session2 v0.6, flexible persistent session management for Quixote 2.x

w/Mike Orr

session2 is our rewrite of Quixote's session management classes. It contains a new session manager and session class. It also contains persistent session stores for the MySQL, Durus, and PostgreSQL databases, as well as session stores built around files in a directory and shelve databases.

Two major new things with this release, v0.6:

You can download it directly, visit the Cheese Shop entry, or browse the source.


Astropix-of-the-day score

The neutron-star binary APOD references work done by me da.

And, on today's APOD, you can clearly see earthshine, which I worked on for a few years.

(Sadly, my sea urchin-based grad work has less likelihood of being on APOD...)

PRE-Announcement: TurboZCherryPloRails

Mike Watkins, via quixote-users. (I'd have picked a better name if I'd known someone was going to run with it ;)

Seriously, Mike's original post is the single most sober and realistic thing said about the recent proliferation of Web frameworks in Python. Ultimately, you've still got to write the application -- and for me, at least, that's always been the hard part.


p.s. I'm getting caught being snarky too much, I think!

socal-piggies, testing & twill

At our eighth meeting, I asked Grig for some advice on testing twill. (He recently reviewed twill on his blog, so I knew he was familiar with the issues.)

You see, I'd like to write unit tests for twill, but it's hard to think of exactly how to do it. twill is a testing tool, and because it interacts with real live Web sites it's hard to imagine how to thoroughly test it. I do have a simple Quixote application for testing twill's form handling capabilities, and with the most recent release of twill I am using code coverage analysis to ensure that I am testing all of the commands. Beyond that, I've had a tough time figuring out what to do.

Grig's advice was quite simple: build unit tests for those things that are unique to twill, and only those things. Since twill is built primarily on top of mechanize, ClientForm, and ClientCookie, he suggested that I avoid testing anything to actually do with HTTP and HTML handling because that's all contained in those packages. What I should test is logic surrounding those packages -- basically any logic unique to twill.

Duh. Yeah, it was pretty obvious once Grig said it ;).
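Here is a rough sketch of what Grig's advice looks like in practice: swap the HTTP/HTML layer for a stub and test only the wrapper's own logic. Everything in it is hypothetical. `FakeBrowser` and this `go` function are illustrations, not twill's or mechanize's real code, and the "add a default scheme" rule is just an example of wrapper-specific logic.

```python
import unittest


class FakeBrowser:
    """Hypothetical stand-in for the HTTP layer; records calls instead of doing I/O."""

    def __init__(self):
        self.visited = []

    def open(self, url):
        self.visited.append(url)
        return 200  # pretend every request succeeds


def go(browser, url):
    """Hypothetical wrapper command: add a default scheme, then delegate."""
    if "://" not in url:
        url = "http://" + url
    return browser.open(url)


class TestGo(unittest.TestCase):
    def test_adds_default_scheme(self):
        browser = FakeBrowser()
        go(browser, "example.com/")
        self.assertEqual(browser.visited, ["http://example.com/"])

    def test_keeps_explicit_scheme(self):
        browser = FakeBrowser()
        go(browser, "https://example.com/")
        self.assertEqual(browser.visited, ["https://example.com/"])
```

Saved as a module, this runs under `python -m unittest`. No network is involved, so the tests only fail when the wrapper logic itself is wrong.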


I and pretty much everyone I talk to are unhappy with BioPython. I have my usual persnickety concerns to do with general coding style, oddball documentation, funky library APIs, and so on, but oddly enough many local people seem to agree with me this time.

I have several specific issues with BioPython. First, it's stuck back in the 1.6 / 2.0 dialect of Python, so the modules that I use most -- the NCBI interface -- don't make use of things like iterators. Very annoying. Next, it's also got some funky import stuff that doesn't interact well with Quixote. Finally, it inappropriately uses shorthand in the BLAST parser, so that you end up with BLAST output getting parsed into variables named 'sbjct' for some reason. Oh, and the EUtils interface is oddly written... and I think replacing the parsers they have in there with something based on pyparsing is maybe a good idea.

Mostly minor nits, I know, but the "bad code smell" is enough to make me think about starting a new project. Since the mailing list seems to have become a spam collector, and I don't see discussion anywhere else, I have been encouraged...

Anyway, we'll see. I'd like to combine some of the pygr coolness with some simple code from slippy and also motility. A couple local people are interested in joining in, so that's nice... The cool thing is that this would be a project actually directly connected to my research work, for once ;).
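To show what I mean by "making use of iterators", here is a small sketch of an iterator-style sequence reader. It is not BioPython code, nor anything from pygr, slippy, or motility; `iter_fasta` is a made-up name, and it assumes plain FASTA input where each record starts with a `>` line.

```python
import io


def iter_fasta(handle):
    """Yield (name, sequence) pairs one record at a time from a FASTA file object."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header: emit the record collected so far, if any.
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    # Emit the final record once the input is exhausted.
    if name is not None:
        yield name, "".join(chunks)


example = io.StringIO(">seq1\nACGT\nGG\n>seq2\nTT\n")
records = list(iter_fasta(example))
```

Because records are produced lazily, a caller can loop over a very large file without holding every sequence in memory at once.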


twill v0.7.3

I just released a minor update to twill, twill v0.7.3. This update fixes a few minor problems, adds a few new commands, and also contains a script for stress- and performance-testing, 'twill-fork'.

See the announcement for specifics on the release, or read the docs. You can also just download it...


A Users Manual for the Universe

Here. Cute ;).

Sourpuss & unfair gripe of the day

It's hard to take the xerces-c project seriously with build instructions like this.

I particularly like the way they make you set an environment variable, although the "runConfigure" script is pretty astounding, too.

(Not that I like the default autoconf/configure much, but it's pretty standard; I have to re-read the instructions for xerces-c every time I build it, this way.)

WSGI support for Trac

Took the existing patch for adding WSGI support to Trac and added a simple SCGI server and a dumb CGI server to wrap the WSGI app.

Here. Works for me, so far...
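For anyone wondering what "a dumb CGI server to wrap the WSGI app" amounts to, here is a minimal sketch. It is not the code from the patch: it uses `wsgiref.handlers.CGIHandler` from today's Python standard library, and `app` is a hypothetical stand-in application, not Trac.

```python
from wsgiref.handlers import CGIHandler


def app(environ, start_response):
    """Hypothetical stand-in WSGI application (not Trac itself)."""
    body = ("Hello from %s\n" % environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def run_as_cgi(application):
    """Serve a single request using the CGI environment and stdin/stdout."""
    CGIHandler().run(application)
```

A CGI script would simply end with `run_as_cgi(app)`. The same `app` callable can be handed unchanged to an SCGI or other WSGI gateway, which is the whole attraction of WSGI.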

trac, incidentally, is pretty cool. Small, neat, to the point. And pretty.

DoS attack

Finally got the chance to investigate some intermittent Web server downtime. Turns out that it was neither name service nor Apache config problems, but instead a DoS SYN flood. This guide was useful; I got the site back up and running in no time flat. I'm still getting lots of connections, though. If it escalates I'll start using iptables to block.

I'm not sure which site they're after. Perhaps wrightflyer.org?

twill additions

Ed Rahn sent me a nice patch adding run and runfiles functionality to twill. These enable the execution of straight Python statements and other twill scripts from within twill. He also submitted something to enable $variable substitution, which I'm still a bit iffy on; people tend to expect "this $variable" to be substituted, which doesn't work yet. Adding that could be a bit of a mess.

Ed's patch catalyzed a namespace reorganization and the addition of setglobal and setlocal commands, as well.
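For the "this $variable" case, one possible approach (a sketch only, not what twill or Ed's patch actually does) is the standard library's `string.Template`, which expands `$name` references embedded in a longer string. The `substitute` helper is a made-up name.

```python
from string import Template


def substitute(argument, variables):
    """Expand $name references inside an argument; unknown names are left untouched."""
    return Template(argument).safe_substitute(variables)


expanded = substitute("this $variable", {"variable": "works"})
untouched = substitute("$missing stays", {})
```

`safe_substitute` leaves unrecognized `$names` alone instead of raising, which seems like the less surprising behavior for a scripting language, though that is a judgment call.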

All in all, it sounds like a busy weekend -- but it wasn't ;). Lotsa sleep. hooray!


Why I Like Darcs

Gave a presentation on darcs, the distributed revision control system, at the SoCal Python Interest Group meeting yesterday. The presentation is available here.

twill news

Lambda the Ultimate, the Programming Languages Weblog, had a post about twill. I am not worthy. It was cool to see it there, though!

twill is in a nice, happy place. Most of the work I plan to do on it has to do with increasing its applicability to e.g. thread and load testing. Very little functionality needs to be added; if anything, I need to help out with mechanize. twill is, after all, just a thin wrapper around mechanize...

Speaking of mechanize, what's up with mechanoid? It took me about 5 minutes of browsing to figure out that this page was where I wanted to go, not the PyPI-referenced page. I still haven't figured out why the guy forked mechanize, either.

back to the crib,

twill 0.7.2

Released twill 0.7.2. May it live long and prosper.

It's mainly a bugfix+docs release, with one or two minor new features.


Installed Trac for the aforementioned Caltech Software Carpentry class. Pretty cool stuff. A fairly simple, straightforward project management system. It's also pwetty.

Naturally I had to get frustrated with something, so I spent half an hour hacking in SCGI support (it comes with FCGI and CGI support only). I'm happy to send out patches, but I'm not going to submit 'em to the Trac project until I have some more time to make a clean set. (Or somebody asks.) I found a note that Trac is going to add SCGI support with WSGI support sometime around 2.0; right now it's at 0.8.4. Hmm. That might be a worthwhile hour or two of my time...

I also spent an hour or so installing a hacked version of Trac that contains support for a darcs backend instead of subversion. Works really well so far! The tricky bit was noticing that trac actually contains provisions for running directly from its source directory. This is good because it has a number of moving parts which would be difficult to otherwise configure without simply installing it from scratch in some new location.



Spent much of Tuesday trying to get the lirc_streamzap module working. No luck; it segfaults. The situation is better than it was -- LIRC didn't even nominally support the StreamZap until relatively recently -- but it sucks balls that I can't get it working. Ah well, less whining, more hacking!

WinMyth rocks, even if it's still a bit slow & buggy. Good job! (I've been using MythOnMacOsx for a while now; I'm trying to get my wife hooked into our Myth setup with her WinXP box.)

Updated twill, my Web-scripting-for-testing tool, to fix a few problems. The biggest problem was that the extend_with command didn't work to import commands into the shell, although it worked fine for script execution.

And, in other news, working on a Trac install for the Caltech Software Carpentry class. More on that once it's all set up...

All of this is occurring in a general background of sadness for the victims of Katrina. I wasn't nearly this sad about 9/11, for some reason; perhaps because I think the misery was more extended in this case. Not sure. But when I read about 100 people who drowned while waiting for emergency evacuation... I can't help but get pretty upset. sigh. My wife may end up going out there to help the Red Cross with shelter ops, so we're holding off on donations 'til we see if we have to buy a plane trip. I'm feeling moderately useless myself ;(.

peace out,

Fun with docs

Cartwheel passes data to and from 3rd-party binaries. Generally, that data is in the form of sequence. So, when talking about security, I felt it necessary to add this proviso:

...if someone can encode an attack in valid IUPAC DNA or protein sequence, then it may get through...

I wonder if it's possible to write a buffer overflow for BLAST in DNA sequence? That'd be pretty neat. I'm tempted to offer a bounty, but I don't have any money to offer. Hmm. Perhaps I could offer to sequence part of someone's genome? ;)
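The proviso only holds if input really is checked against the IUPAC alphabet before it reaches an external binary. Here is a minimal sketch of such a check. It is an assumption about how the filtering could be done, not Cartwheel's actual code; the function name is hypothetical, and it covers only the DNA codes, not protein.

```python
# The fifteen IUPAC nucleotide codes for DNA: four bases plus ambiguity codes.
IUPAC_DNA = frozenset("ACGTRYSWKMBDHVN")


def is_valid_iupac_dna(sequence):
    """Return True if the sequence is non-empty and uses only IUPAC DNA codes."""
    return bool(sequence) and set(sequence.upper()) <= IUPAC_DNA
```

Anything containing shell metacharacters, whitespace, or digits fails the check, so an attack would indeed have to be spelled entirely in nucleotide codes.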

Anyway, a first stab at docs is complete. Let's hope they're not too rudimentary to be of use.


OK, I'm going to start working on this soon. There's at least one outstanding bug that Michele Simionato has been waiting for me to fix...


29 Aug 2005 (updated 29 Aug 2005 at 18:44 UTC)
Reflections on documentation

Over the last year or two, I've become a convert to the (rather obviously good) practice of documenting my packages. I'm not talking about the source code itself, which is more-or-less documented depending on when I wrote it and the language I wrote it in -- e.g. Python requires fewer docs, C++ more. I'm talking about documenting APIs, developer setup, compilation tricks, and installation procedures. Writing tests is also a form of documentation, and I've been doing that a lot more, too.

Now, when I say "become a convert" I mean that I'm actually doing it now. Given the time involved, this isn't trivial: even writing sketchy documentation is a time-suck. It's also way less fun than hacking on new features. The flip side is that people are more likely to actually interact with your software if you document it. And, for me, that's one of the main points behind writing open source software...

There are some less obvious benefits, though, that add even more value than use. The main one is that writing installation instructions for other people gives me time to reflect upon just how crufty the install actually is for some of my packages. Cruft is usually easy to fix, once identified -- but I tend to work in my own little cocoon of already-installed programs, and it's hard to identify the cruft without leaving that cocoon. Another benefit is that I get to document some of the stupidities that are on my TODO list, and after thinking about them I often reprioritize my development to fix the worst ones first.

I'm sure there are other benefits as well, but those are the two that I'm really noticing.

I bring this up because I spent a few hours writing some Cartwheel install instructions. A guy down at UT San Antonio, Michael Edwards, has spent a week trying to get everything working. His e-mails inspired me to spend some time documenting, and as a bonus he's also found a few out-and-out bugs in the process. The documentation is still a work in progress, but I'm kind of amazed at just how much software I had to get working in the first place... here's a temporary link, just for posterity's sake.

Anyway, back to documenting...

