Older blog entries for titus (starting at number 150)

Specific bitching

  1. I hate automake/autoconf/aclocal/autoscrewitallup. CHECK THE RESULTING CONFIGURE SCRIPT INTO YOUR DAMN SOURCE REPOSITORY. That way I don't have to spend 5 hours tracking down EXACTLY which version of these utilities you used so that I can check your source code out and compile it.

    (With reference to libhid.)

  2. If you are making a library for Python application developers, DON'T USE FEATURES SPECIFIC TO PYTHON 2.4. Python 2.3 is the version available on Mac OS X 10.4, and that's the largest installed base of Python in existence. More to the point, I don't care about your whizbang application library if by using it I penalize all Mac users.

    (With reference to lots of packages.)
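For what it's worth, staying 2.3-friendly is usually cheap. The post doesn't say which 2.4 features were the culprits, so this sketch assumes the usual suspects: decorator syntax, generator expressions, and the `set` and `sorted()` built-ins. The names (`sorted_copy`, `Greeter`, `total_length`) are made up for illustration; the code is written to import on 2.3 and also runs on current Python.

```python
# Idioms that avoid Python 2.4-only features, so a library still imports on 2.3.

# 2.4 added the built-in set type; 2.3 only has the 'sets' module.
try:
    set
except NameError:
    from sets import Set as set


def sorted_copy(seq):
    # 2.4 added sorted(); on 2.3, copy the sequence and sort the copy in place.
    result = list(seq)
    result.sort()
    return result


class Greeter:
    def hello(cls, name):
        return "hello, %s" % name
    # 2.4 added the '@classmethod' decorator syntax; plain rebinding works on 2.3.
    hello = classmethod(hello)


def total_length(words):
    # 2.4 added generator expressions; a list comprehension works on 2.3.
    return sum([len(w) for w in words])
```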

The GUI Testing Tarpit

Brian Marick speaks about his solutions for GUI testing (by which he seems to mean "Web testing"). While I think his approach has some merit, I'm mildly skeptical about it, but I don't really have anything intelligent to say in response. More anon, presumably.

In other news, Jeremy Jones suggested that a Web unit testing framework be included with Python. Here again I'm skeptical, but only because I don't like the general approach.

Even if I did think it was a good idea, which package? twill can't really be considered a unit-testing framework, and it's too sizeable anyway. webtest is nice, as is webunit; I guess I'd go with webtest, because it's really simple.

I do think mechanize should be considered for inclusion in the standard library, however -- or at least ClientForm. They're very useful and (apart from the mechanoid fork) there's nothing else even remotely approaching their level of functionality.

SvnReporter looks cool

http://www.calins.ch/software/SvnReporter.html: to quote,

"""

SvnReporter generates various reports in response to commits happening in a Subversion repository. It is intended to be called from the post-commit hook.

Two types of reports are supported: single-commit and commit list reports. The former generate reports relative to a specific revision only, and are typically used to generate post-commit mails. The latter generate reports relative to a list of commits, e.g. an RSS feed or a web page showing the latest commits.
"""

I'll install it soon and report more.

RadioSHARKage

So, I decided to give my wife a tech solution for our 2nd wedding anniversary. (This would be more dangerous were she not as gadget infested as I...)

The problem: she would like to be able to timeshift and/or download certain radio shows to her iPod. Most of these radio shows (CarTalk and Wait, Wait, Don't Tell Me! are the two big ones) are either for-pay only or aren't actually available at the moment.

My solution: the Griffin radio SHARK.

The Radio Shark first caught my attention when Declan Fleming wrote about using it for radio capture and self-podcasting. Then I noticed that Michael Rolig got it working under Linux. So, I went and bought one, with the ultimate goal of hooking it up to our MythTV box and setting up automatic recordings and exporting them to podcasts.

It worked right out of the box on my iBook, of course: I plugged it in, installed the software, and was immediately listening to live radio.

Linux, as usual, proved a tad trickier. Here's a rough outline (to be gussied up as I play more):

  1. Plug in the Radio Shark and verify that it's seen (using lsusb).

  2. Spend several hours playing around with libhid, only to figure out that Michael Rolig is using the latest Subversion checkout of libhid and I can't figure out how to compile it. (See my complaint about autof*ck, above).

    Ultimately,

    1. grab the latest release of libhid;
    2. patch it with the Debian stuff;
    3. monkey-patch in the interrupt_write function;
    4. rebuild

    (Shockingly, this worked!)

  3. Transliterate shark.c into Python (shark.py).

  4. Install ecasound.

  5. Insert the audio kernel module (linux 2.4.28) so that the Radio Shark is attached to a sound device (in my case, /dev/dsp2).

  6. Use the appropriate ecasound magic to listen:
    ecasound -D -f:s16_le,2ch,48000,inter-leaved -i:/dev/dsp2 -B:nonrt -z:db -b:4096 -o:/dev/dsp
    

  7. Use the appropriate ecasound magic to save:
    ecasound -D -f:s16_le,2ch,48000,inter-leaved -i:/dev/dsp2 -B:nonrt -z:db -b:4096 -o:radio-recording.mp3
    
  8. Scrape the rss-maker.sh and urlencode.sed stuff off of Declan's page. Run them and put the files in the right place.

  9. and... voila, it all works! iTunes recognizes the podcasts and it all Just Works on the client side.
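The feed-generation step could also be done in Python. The following is a rough stand-in, not Declan's rss-maker.sh: it assumes the recordings sit in one web-served directory, and the function name, feed title, URL, and file name are all hypothetical. It builds a minimal RSS 2.0 document with one `<enclosure>` per recording, URL-quoting the file names (the job urlencode.sed presumably does).

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote


def make_podcast_feed(title, base_url, recordings):
    """Build a minimal RSS 2.0 feed with one <enclosure> per recording.

    recordings is a list of (filename, size_in_bytes) pairs; base_url is the
    (hypothetical) web directory the MP3 files are served from.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = title
    for filename, size in recordings:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = filename
        # Spaces and other unsafe characters in file names must be escaped.
        url = base_url + quote(filename)
        ET.SubElement(item, "guid").text = url
        ET.SubElement(item, "enclosure", url=url, length=str(size),
                      type="audio/mpeg")
    return ET.tostring(rss, encoding="unicode")


# Example data only: one made-up recording of about 30 MB.
feed = make_podcast_feed(
    "Radio recordings",
    "http://example.com/radio/",
    [("car talk 2005-12-31.mp3", 31457280)],
)
print(feed)
```

Podcast clients pick up new episodes from the `<enclosure>` elements, so regenerating this file after each recording is all the "publishing" that is needed.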

The final file size is approximately 30 MB/hr, after postprocessing with

lame -S --lowpass 6000 -q 2 --vbr-new -V 8 in.mp3 out.mp3
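For automated recordings, the two commands can be wrapped so a scheduler can call them. This is only a sketch: the flags are copied verbatim from the commands above, the function names are invented, and nothing here actually launches ecasound or lame (hand the lists to `subprocess` for that). The last helper just converts the 30 MB/hr figure into an average bitrate, taking 1 MB as 10^6 bytes.

```python
def record_command(device, outfile):
    """ecasound argument list for saving from `device`; flags as in the post."""
    return [
        "ecasound", "-D",
        "-f:s16_le,2ch,48000,inter-leaved",
        "-i:" + device,
        "-B:nonrt", "-z:db", "-b:4096",
        "-o:" + outfile,
    ]


def shrink_command(infile, outfile):
    """lame argument list for the postprocessing pass; flags as in the post."""
    return ["lame", "-S", "--lowpass", "6000", "-q", "2",
            "--vbr-new", "-V", "8", infile, outfile]


def average_kbps(megabytes_per_hour):
    """Average bitrate in kbit/s for a given file size per hour of audio."""
    return megabytes_per_hour * 8 * 1000 / 3600.0


print(" ".join(record_command("/dev/dsp2", "radio-recording.mp3")))
print(" ".join(shrink_command("in.mp3", "out.mp3")))
print("%.1f kbit/s" % average_kbps(30))
```

So 30 MB/hr works out to an average of roughly 67 kbit/s.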

Ultimately I'd like to start using an official libhid version, but that will have to wait for a new libhid release. I'm also hoping to use libhid to get our Streamzap remote working, but that's waiting on a different bug...

cheers,
--titus

Tutorial

I'm surprised at how popular our tutorial is. Did we hit on some magic buzzwords or something!?

Job posting

Someone contacted me via socal-piggies (I guess "Python los angeles" will find us?) and asked me if I wanted to apply for a job. I ended up getting a complete job description out of her -- I was doubtful that "proficient in Python" was sufficient, although that's all she sent me at first! -- so here it is.

The ideal candidate will have a strong expertise in search technologies,
distributed systems, Unix/Linux architecture with exceptional skills in
python and shell scripting. Individual will also have some experience in
object-oriented web development and system administration.

Requirements:

  • Fluent in Python
  • Fluent in C++ programming language
  • Fluent in shell scripting
  • Fluent in distributed computing and multi-tier application architectures
  • Fluent in UNIX/LINUX architecture and OS principles
  • Experience with ORACLE
  • Bachelor's degree (minimum)

The web site is somewhat non-descriptive: carreraagency.com. Contact sandra.bunn at the obvious domain name if you're interested.

(I pointed her towards the Python jobs board, too.)

--titus

The 30-second Guide to Making Eggs

Courtesy of some prodding from some irish guy.

At the top of your setup.py (you do have one, right?) add:


### ez_setup.py and 'use_setuptools()' will automagically download and
### install setuptools. The downside to this code is that then
### you need to include ez_setup.py in your distribution, too.

try:
    from ez_setup import use_setuptools
    use_setuptools()
except ImportError:
    pass

### this is the critical line
from setuptools import setup  # instead of the 'distutils.core' setup

### also import Extension, etc -- anything else you need -- from setuptools.

then you can make eggs with the bdist_egg command. Try:

% python2.3 setup.py bdist_egg
% python2.4 setup.py bdist_egg

to build eggs for each version of Python you have installed. The eggs will end up in dist/. If you're distributing precompiled binary code, you'll need to make an egg for each platform/Python version.

Anyway, hope that helps!

--titus

p.s. I should probably write a "5 minute guide to distributing Python software properly". Lessee... (1) make setup.py; (2) add it to PyPI; (3) post to c.l.p.a. ;) Oh, and (1a) run cheesecake!

For future reference: BBQ

Educated Guesswork posted a link to a mail order BBQ store. Yummmm.

The first comment on the post has a few more references.

(Oddly, this diary is turning out to be a great place to keep such links: I can always find them, even weeks later. Much better -- and more persistent -- than my own fairly random collection of HTML pages...)

Good Google idea (?)

Wouldn't it be cool if Google could show you when a page last changed significantly? (I was once involved in a company that was doing things like this; obviously it's hard without the proper infrastructure, but Google clearly has that.)

Hmm, I just realized that Google might already be ordering search results that way... although the amount of outdated Linux stuff I google up is large enough that I'd guess they're not.

New package overload

I've spent much of my programming time over the last two weeks getting a fingerhold on a bunch of library modules and packages. In our tutorial application, Grig and I are using CherryPy, Commentary, Durus, AMK's code implementing JWZ's mail threading algorithm, imaplib, and poplib. (And that's not counting the installation stuff and testing packages, like coverage, Selenium, Fitnesse, and soon Buildbot.) We're trying out some of these new packages because I want to play with new technologies (that's why I'm using CherryPy instead of Quixote); for others, like Durus (or some Python embedded db...) and the e-mail stuff, it's a required part of the project.

The last two packages I'm planning to start using are buffet and Cheetah; the time has come to do a proper job of HTML display in our application, and I wanted to try something other than Quixote's PTL.

I have to say that learning this much new stuff all at once is mildly overwhelming, but it's a lot of fun. I'm trying to get it all out of the way by the first release (which was scheduled for last week, but got moved a week because of all of the holidays). The second release will focus on performance, and the third release -- just in time for the tutorial... -- will focus on compleat testing coverage and frilly features. For now, I'm trying to get a sort of tracer bullet implementation of the basic features going. Grig is having a lot of fun playing with the different testing packages, I think, and we're both having a lot of fun assigning tickets to each other in Trac. No fistfights yet; perhaps that's a good feature of remote development ;).

In the process of doing all of this random grokking, I've made some simple sequence-style wrapper interfaces for poplib and imaplib, and fixed up the jwzthreading code a tad. Nothing big, but I guess I'll post them somewhere in the next few weeks...
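To give a flavor of what "sequence-style" means here, this is a guess at the shape of such a wrapper, not the actual code. It assumes only poplib.POP3's `stat()` and `retr()` methods; `POPMailbox` and `FakePOP3` are invented names, and the fake connection exists only so the sketch runs without a mail server. It is written for current Python 3, where poplib returns bytes.

```python
class POPMailbox:
    """Read-only, sequence-style view of a POP3 mailbox.

    `conn` is anything with poplib.POP3's stat() and retr() methods.
    Only integer indexes are handled; slices are not.
    """

    def __init__(self, conn):
        self.conn = conn

    def __len__(self):
        count, _size = self.conn.stat()
        return count

    def __getitem__(self, index):
        n = len(self)
        if index < 0:
            index += n
        if not 0 <= index < n:
            raise IndexError("mailbox index out of range")
        # POP3 message numbers start at 1, Python sequences at 0.
        _resp, lines, _octets = self.conn.retr(index + 1)
        return b"\r\n".join(lines)


class FakePOP3:
    """Stand-in for a poplib.POP3 connection, for trying this out offline."""

    def __init__(self, messages):
        self.messages = messages

    def stat(self):
        return len(self.messages), sum(len(m) for m in self.messages)

    def retr(self, which):
        body = self.messages[which - 1]
        return b"+OK", body.split(b"\r\n"), len(body)


box = POPMailbox(FakePOP3([b"Subject: a\r\n\r\none", b"Subject: b\r\n\r\ntwo"]))
for raw in box:  # iteration works through __getitem__ and IndexError
    print(raw.split(b"\r\n")[0].decode("ascii"))
```

With a real server, `POPMailbox(poplib.POP3(host))` after the usual `user()`/`pass_()` login should behave the same way.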

Oh, one plea: does anyone know of a high-performance lazy-parsing e-mail parser? I'd like something that will retrieve header fields on demand, rather than pre-parsing them, and will also not parse the entire message all at once. The closest thing I've found is this cookbook recipe which talks a bit about the 2.4 email module.
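Short of a real lazy parser, a cheap approximation is to hold on to the raw bytes and defer all parsing until something is asked for. The sketch below does that with the standard library on current Python 3: the first header lookup parses only the header block, and the body is left alone until `full()` is called. `LazyMessage` is a made-up name, and this is deferral rather than the high-performance field-at-a-time parser asked for above.

```python
import email
import re
from email.parser import BytesHeaderParser

# The header block ends at the first blank line (CRLF or bare LF).
_HEADER_END = re.compile(rb"\r?\n\r?\n")


class LazyMessage:
    """Wrap raw message bytes; parse headers and body only on demand."""

    def __init__(self, raw):
        self.raw = raw
        self._headers = None
        self._full = None

    def _header_block(self):
        match = _HEADER_END.search(self.raw)
        return self.raw if match is None else self.raw[:match.end()]

    def __getitem__(self, name):
        # First header access parses the header block, nothing more.
        if self._headers is None:
            self._headers = BytesHeaderParser().parsebytes(self._header_block())
        return self._headers[name]

    def full(self):
        # The complete (MIME-aware) parse happens only if someone asks for it.
        if self._full is None:
            self._full = email.message_from_bytes(self.raw)
        return self._full


msg = LazyMessage(b"From: a@example.com\r\nSubject: hello\r\n\r\nbody text\r\n")
print(msg["Subject"])
```

For a mail-threading application this covers the common case, since threading needs only headers such as Message-ID, References, and In-Reply-To.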

--titus

CherryPy is horrible!

Well, no, it's actually quite nice -- but too many people are happily burbling about TurboGears, so I felt I had to say something mean. ;).

Ivan Krstić's madness: to be revealed soon

Scanning through the PyCon talk schedule, I saw this. So soon we will be able to see what Ivan Krstic was talking about back then.

In other news, enough people have signed up for our PyCon tutorial that I can definitely afford to come. Thanks, all -- now we just have to worry about the presentation...

Jobbing you

Indeed, as haruspex pointed out, Steve Jobs never did graduate from Reed. We still consider him one of ours -- Reed officially defines an alumnus as anyone who has attended one full semester. (...because that way, they can call and ask for $$, I think.)

--titus

Perils of Javaity

Joel Grossberg's comment in Ian Bicking's post on Joel Spolsky's article reminds me: my college, Reed, didn't have a CS department or curriculum. There were two intro programming classes, but that was it.

Despite this lack of formal CS education, Reed graduated Keith Packard (X-anything), Nelson Minar (emacs HTML mode, among other things), and half-a-dozen other excellent programmers. Well, and Steve Jobs, too ;).

Just thought I'd mention it.

The Tom Peters Blog Challenge

Via this blog, John O'Leary asks:

1. What's the most important thing you've done this year?

Attending the Woods Hole Embryology Course was definitely the most important thing I did.

Computationally, I'd have to say that twill is probably the most important thing I did. Sigh, it's the first year in a while that some component of my research programming isn't the most important thing I've done, but I'm avoiding that in an effort to actually graduate.

2. What's the most important thing you'll do in the next year?

Graduate, I hope -- or leave graduate school in some way. I'm stuck in a miserable situation where fairly slow experiments are dictating my future. I'm applying for jobs, but without the experimental evidence I'm going to have a tough time of it.

In addition, our Agile Development & Testing project could turn into something quite nice, but I refuse to depend on that ;).

Also, three posts -- Dave Winer (on RSS), Ian Bicking (with his Python docs stuff), and Jonathan Ellis (on ORM stuff) -- have made me decide that one New Year's resolution (category: "programming") will be to learn to play better with other projects. In particular, I'd like to put together a Trusted Commentary application that uses the Advogato trust metric together with Commentary and a simple central-server-based commenting setup to make it easy to deploy a commenting system on individual sites. I'd also like to see if I can integrate cucumber2 ideas into SQLAlchemy or PyDO. We'll see how well these projects go ;).

--titus

Python quote

Here's an entertainingly odd excerpt from ziggy's post on Joel Spolsky's The Peril of Java Schools:

...

Python strives to have one (obvious) way to do it, which makes all source code "look" the same. Thus, looking at a piece of random Python code, it's difficult to tell at a glance whether it's good or not.

Perl, on the other hand, with its trademark TMTOWTDI, makes it easier to see when code is good or bad. Because it blends aspects of C, Lisp, and OOP, a good Perl programmer will have to think at multiple levels of abstraction to get a complex job done.

...

Elsewhere, Ned Batchelder weighs in with some commentary on the Perils of Java -- the comments are worth reading.

Also worth reading: 'Illusions of Chaos, Illusions of Calm' (via Sean McGrath).

--titus

The Perils of Java Training

JoelOnSoftware takes aim at Java. Hmm, I wonder if you advertised for programmers with experience in Haskell, OCaml, or Scheme, you'd get better interviewees?

Parenthetically, Joel Spolsky is one of the two or three people I find always worth reading. Paul Graham is another, of course, and so is Martin Fowler. (I used to read Philip Greenspun, too, but I feel that he aims for controversy over content and that started to annoy me.)

Our cat is smokin'

Our cat, Muika, is a smoked Egyptian Mau (see the picture titled "Smoke"). He's a dead ringer for the kitty on this page. What's funny is that his original owner got him from a pound! However, based on his near identity to the cats in these photos, I'd guess that he's a purebred.

Google-Fu

I've been noticing that advogato.org has powerful google-juice. This has amusing consequences: for example, whenever I try to find the home page for nosetest or nosetests I find my own diary entries on nose. (I guess I could be the only person blogging about it, too.)

Subversion, Darcs, and Trac, oh my!

I've been adding unit tests to some of my older packages, and revamping them as I have time; now I want to put them on my new development server.

Problem:

  1. Many different projects. (Just focusing on my moderately used bioinformatics analysis stuff, I have: Cartwheel (Python server & Python/C++/Java client stuff); FamilyRelationsII (C++ GUI written with FLTK); motility (C++ toolkit with Python interface); paircomp (C++ toolkit with Python interface); and sister (XML-ish parsing code in C++ and Python).)

  2. Many interdependencies. (FRII uses the C++ Cartwheel client stuff, as well as all C++ code. All of them interact with data produced by Cartwheel, which in turn uses some of the toolkits to produce the data, but this time from the Python code.)

  3. Other users. Not only do I run a Cartwheel server myself, but other people run them and use FRII to interact with them. There are also several other people using the toolkits.

  4. Poor hosting on SourceForge. All of these projects are spread among the Cartwheel and FamilyJewels projects on SF. No subversion or darcs; no trac; lousy bug tracking interface; etc. At this point I'm just using the centralized CVS & the mailing lists.

Possible Solutions:

I'll definitely be using Trac and something a bit newer than CVS. But what? My two best options are:

  • Switch to using Darcs.

    • Pluses: I like Darcs; no complicated setup/maintenance; users (including me) can customize my deployment settings for those projects I deploy in place; users can "fork" at will.

    • Minuses: No subtrees, so I'd have to have a distinct Trac site for each Darcs repository. Maintaining version synch between them might also be annoying. Using darcs on Windows sounds miserable.

  • Switch to using Subversion.
    • Pluses: support for subtrees, e.g. "bio-tools/cartwheel", "bio-tools/motility", "bio-tools/paircomp" can all be distinct check-out-able projects within a single repository and with a single Trac instance.
    • Minuses: centralized, so users cannot fork at will without moderate investment in technology like SVK or tailor; moderately nasty setup/maintenance.

After writing this all down, I think the winner is going to be a moderately complicated hybrid setup.

Proposed solution:

  1. Dump each interdependent group of projects into Subversion, and attach a centralized Trac server.

  2. Develop in this svn repository, and synchronize releases among all projects.

  3. Export trunk & release branches to darcs via tailor.

This answers most of the minuses above, albeit while making my life more miserable with respect to configuration.

  • users can branch off of the darcs stuff without requiring r/w access to my svn repositories;
  • they can use svn if they just want an easy checkout, without investing effort in new technology, or they can use darcs;
  • All of my interdependent projects are maintained in the same repository, so I can synchronize stuff with tags;
  • I get a single Trac instance for all of my interdependent projects.

Re-reading this, I think I might be nuts -- but it's a good kind of nuts ;). I'll think on it; there's no urgency in implementing this.

The only thing that might make my life easier would be a real-time tailor-style svn-to-darcs converter, so that I don't have to maintain separate tailor directories. But that's a minor issue.

--titus

SCGI links

I'm a big fan of SCGI, a FastCGI-like way of running a persistent server in an external process. There are a number of reasons why SCGI is pretty nice: it's clean, simple, has a simple protocol specification, and has a nice library. One reason I like to use it is that it integrates well with Apache: I can easily configure virtual hosts that direct requests to Web apps via SCGI, and the actual Web app can run as whatever user I want.
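To show just how simple the protocol is, here is a sketch of the request encoding as described in the SCGI specification: the headers are NUL-terminated name/value pairs packed into a netstring (length, colon, data, comma), with CONTENT_LENGTH first and an SCGI header set to 1, followed by the raw body. The function name is invented, and this is an illustration for current Python 3, not the scgi package's own code.

```python
def encode_scgi_request(headers, body=b""):
    """Encode an SCGI request.

    headers is a list of (name, value) string pairs; CONTENT_LENGTH and
    SCGI are added here and should not be passed in.
    """
    items = [("CONTENT_LENGTH", str(len(body))), ("SCGI", "1")] + list(headers)
    # Each name and each value is terminated by a NUL byte.
    block = b"".join(
        name.encode("latin-1") + b"\x00" + value.encode("latin-1") + b"\x00"
        for name, value in items
    )
    # Netstring framing: "<length>:<data>," and then the request body.
    return str(len(block)).encode("ascii") + b":" + block + b"," + body


request = encode_scgi_request(
    [("REQUEST_METHOD", "POST"), ("REQUEST_URI", "/deepthought")],
    b"What is the answer to life?",
)
print(request)
```

The server side is just the reverse: read digits up to the colon, read that many bytes, split on NUL, then read CONTENT_LENGTH bytes of body.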

I tend to run a lot of SCGI servers, however, because I deploy a number of Web sites for various people. At the moment, I simply run them in screen, but that's awfully tedious for 15 different Web apps, and it doesn't work well when you have to reboot machines ;).

So, today I googled around a bit for start/stop scripts -- I'm not sure exactly what I'm looking for, but figured a bit of basic research would be a good start -- and noticed two amusing factoids.

First, there's a Wikipedia entry for SCGI! Kinda cool, even if it's just a stub.

Second, some Ruby folk seem to have embraced it: check out Zed Shaw's Ruby On Rails SCGI Runner page. Neat -- I didn't realize SCGI was used outside the Python community. This page also has a long list of reasons why SCGI is particularly neat; it's worth reading.

I also ran across some nice links: flup, and Deploying TurboGears with Lighttpd/SCGI.

--titus

Merry Christmas, Happy Holidays, and an Enthusiastic Kwanzaa!

I probably won't write much over the weekend, so... have a good time over the holidays, folks.

WSGI, Paste, marketing.

I was going to write a long entry on "what is WSGI", partly in response to Ian's post, but I'd rather code than opine. At least for today ;).

Still, here's my personal take on the situation. In general, I'm conservative: I didn't like list comprehension, iterators, and metaclasses when they were first implemented. (I still hate the decorator syntax.) I didn't think WSGI was worth much at first. And I still have a hard time comprehending the structure of Python Paste, if not the intent.

I've changed my mind about list comprehension, iterators, metaclasses, and WSGI. I suspect I will eventually end up changing my mind about Paste; I'm already contemplating implementing stuff with Paste hooks.

My hunch, based on my experience with new Python features and WSGI, is that Ian (like GvR and PJE) is busy solving problems that I will not encounter for quite some time. It may take me a while to figure that out, and I may even end up not liking Ian's choices. But I strongly believe that WSGI and Paste are better long-term bets than Yet Another Python Web Framework -- it's like betting on a decoupled, decentralized content delivery system rather than relying on a few large content providers to make the right technical choices.
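In lieu of the long "what is WSGI" entry, a very short one: an application is a callable taking `environ` and `start_response`, and middleware is a callable that wraps another application. The sketch below, for current Python 3, shows both halves plus a tiny harness that stands in for a server. All three function names are invented for illustration; only `wsgiref.util.setup_testing_defaults` comes from the standard library.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A complete WSGI application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello from a bare WSGI app\n"]


def add_header(app, name, value):
    """Middleware: wrap any WSGI app and append one response header."""
    def wrapped(environ, start_response):
        def custom_start(status, headers, exc_info=None):
            return start_response(status, headers + [(name, value)], exc_info)
        return app(environ, custom_start)
    return wrapped


def call_app(app):
    """Play the server's role once and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a plausible GET request
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["status"] = status
        seen["headers"] = headers

    body = b"".join(app(environ, start_response))
    return seen["status"], seen["headers"], body


print(call_app(add_header(hello_app, "X-Framework", "none")))
```

The decoupling is the point: `add_header` knows nothing about `hello_app`, and neither knows which server is calling them.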

If there's one problem I'd like to solve, it's the marketing problem we seem to have with WSGI and Paste. It's time to change the effin' names. More on that next time I'm feeling creative.

Parenthetically, I've also been thinking on and off about making a proposal to unify Web handling in the Python 3000 std lib with a 'web' module; 'web.interface' for WSGI, 'web.url' for URL handling, 'web.browser' for a mechanize-style browsing interface, 'web.cgi'... etc. Anyone interested?

--titus
