Older blog entries for titus (starting at number 48)

Daily Python-URL no longer feeding to PlanetPython?

This no longer shows up on planetpython, it seems. Intentional?

MythTV, Freevo, and Python

Mentions of free/OSS PVRs showed up in two high-profile places recently. First there was a NY Times article on MythTV, and then a linuxdevcenter article on Freevo showed up on slashdot.

We've been using MythTV for about a year now; it took me a long time to set up, but once I figured out how to configure the Linux kernel appropriately and then found Debian packages for the rest, it was pretty easy. It's completely changed our TV watching habits (albeit not necessarily for the better...) and I use its jukebox & DVD ripping interfaces a fair bit as well. It is truly a fantastic program.

Freevo, however, is less good -- and (from what I can tell) for only one reason: lack of integrated live TV watching/recording. MythTV has a recording daemon that streams live TV from the tuner card directly to a client via UDP. This is both flexible -- it underlies the entire system for TV watching, because you can do the same thing with recorded shows as well -- and nicely client-server. (My iBook has a Myth client on it, so I can watch live or recorded TV via 802.11g.) If Freevo had this feature, I'd switch in a heartbeat. They seem to be aware of this lack, to their credit, but it's probably a reasonably-sized job to implement it.

The Freevo team had some nice things to say about Python:

"The language is one of the best collaborative languages I have ever used. I wonder if we could have reached the point we have without the short learning curve and power of Python and its related libraries."

...but also talked a bit about speed considerations:

"Sometimes Python is too slow for the needed task. Most of the time we can avoid such problems by rethinking the design," says Dirk Meyer, the Freevo project's 28-year-old lead developer from Bremen, Germany.

I'd be interested in hearing more about this, because my usual solution is to recode in C ;).

A friend and I have been batting around the idea of coding up a Web services API for some of the Myth functionality. GUIs can only take you so far... We'll see. It's not like I haven't got enough on my plate.

--titus

WSGI and IIS

Mark Rees posts about an IIS server interface for WSGI. I can't test this, because I don't run IIS at all, but it would be interesting to see if my simple Quixote-WSGI adapter works under it. I don't even know how people run Quixote on Windows, to be honest; I guess SCGI should work with Apache on Windows, right?

So, I asked Mark to try it out, and it turns out that the QWIP adapter successfully runs Quixote under a slightly modified version of his ISAPI. Very cool. I hadn't thought about IIS integration being a real raison d'etre for WSGI, but there it is...

Mark also politely informed me that my wsgiServeFiles class didn't return proper status codes; I was returning int(200) rather than "200 OK". I just fixed that; the fix is available via Darcs or in a nice tarball on my Darcs page.
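For anyone hitting the same thing: the WSGI spec (PEP 333) wants the status passed to start_response as a "code reason" string, not a bare integer. Here is a minimal sketch of the distinction, written for current Python 3; `serve_text` and `call_app` are made-up names for illustration, not the wsgiServeFiles code.

```python
def serve_text(environ, start_response):
    """Tiny WSGI app: the status is a 'code reason' string, not an int."""
    body = b"hello\n"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)   # not start_response(200, headers)
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app by hand and capture what it hands to start_response."""
    captured = {}

    def fake_start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    # Hypothetical, stripped-down environ; a real server supplies many more keys.
    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    body = b"".join(app(environ, fake_start_response))
    return captured["status"], captured["headers"], body
```

Calling `call_app(serve_text)` by hand like this is a cheap way to check the status string without running a server.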

Python urllib2 buggishness

So John Lee ran across my old post asking about new (RFC 2965) style cookies, and answered thusly: mailman correctly uses RFC 2965 cookies, but does so unnecessarily because no one is really paying any attention to them. However, he did say that it's a bug for urllib2 to not correctly handle things that the browsers handle. The fix is to change urllib2 to handle RFC 2965 cookies by default, I guess. I sent him a concise program that demonstrated the issue.
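As a rough illustration of the "by default" part: in today's standard library (http.cookiejar and urllib.request, the Python 3 descendants of cookielib and urllib2), RFC 2965 handling is an opt-in switch on the cookie policy. The sketch below only shows that switch. Whether flipping it is enough for the mailman case is an assumption, and no request is actually made.

```python
import http.cookiejar
import urllib.request

# RFC 2965 (Set-Cookie2) handling is off unless the policy asks for it.
policy = http.cookiejar.DefaultCookiePolicy(rfc2965=True)
jar = http.cookiejar.CookieJar(policy=policy)

# An opener that stores and returns cookies according to that policy.
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

# opener.open(some_url) would now use the jar; nothing is fetched here.
```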

Political diversions

Sometimes I come across things that make me really proud to be an American. Then there's the other stuff we do.

--titus

Darc thoughts

Continued playing around with my Darcs repository. Now it's automatically producing nightly tarballs via the 'darcs dist' command.

I'm still having trouble planning out my actual Darcs repository usage. Since each Darcs working copy is a full repository in its own right, it seems like overkill to do what I do with CVS -- run various versions out of their own working directories, with tags for sets of configuration files. Still, this is what I'm planning (suitably translated into Darcs reality).

Right now I'm envisioning a layout where the repository dependencies look something like this:

                                 ----> installed site1
                                /
master (stable) --> working branch --> installed site2
             \                 
              ----> devel branch   --> devel site
                                \
                                 ----> demo/test site

It's pretty easy to scale this to multiple developers (although at the moment that's not necessary for any of my actual big projects). You can just branch more repositories off of the 'devel'/'working' branches.

The cool bit is that I can do things like patch bugs that are in common between my working & devel branches, and then 'push' and 'pull' them up and down the tree above. It seems much easier to do this in Darcs than it has been in CVS. I hope reality matches perception!

The only snafu so far has been that Darcs is a major pain in the butt to install on non-package-managed machines. I got it running on Debian and FreeBSD with no problems, but my old Redhat machine (where I do all my development) and my iBook (which I live on) are proving more problematic. Maybe I'll write more about what !#%!#@$! annoying piece of software the Glasgow Haskell Compiler ('ghc') is later...

Quixote licensing

David Binger sez:

I'm pleased to announce that Quixote 2.0 will be released with a 
GPL-compatible license.

That's a Good Thing in my book.

Basecamp

It's official: Basecamp is extraordinarily cool. You may recall that Basecamp is written in Ruby using Ruby on Rails. Well, if Kevin Kelly's Cool Tools says it's fantastic, then dangitall it is. (Cool Tools is my go-to place for gifts of all kinds for all ages. Fantastic site.)

My sis-in-law is looking into Basecamp for use in her company. I'll be interested to see how it works out for her.

--titus

4 Feb 2005 (updated 4 Feb 2005 at 09:20 UTC)
Darcs repositories

I'm moving some of my side projects into a darcs repository, at http://darcs.idyll.org/~t/projects/. Hopefully this will solve my dilemma: I want to make the source available, but dislike tarballs and pservers. This way they're even in a version-controlled format, and people can easily send me patches. Hooray!

Right now only my PostgreSQL session stuff for Quixote & my two simple WSGI projects are posted.

I still like Darcs just as much as I did yesterday ;).

--titus

p.s. Ian, I do like your idea of a WSGI reference library. Just didn't have anything to add other than "me too!" I know, you didn't expect restraint from me on a mailing list... sorry ;).

p.p.s. cdfrey, I agree that Web programming is nasty, brutish, and way too time consuming. I never did like mucking around with strings, even once I found languages other than C. Still, it's a necessary evil these days. I think the trick is to find a language & framework that fits your needs and your style, and then build on that.

More Tilting at Windmills

I posted my suggestions regarding the Quixote implementation in the PyWebOff to the quixote-users mailing list, and got a few useful replies.

Eric Floehr pointed out that you could throw PublishError exceptions, which trigger the first available _q_exception_handler function. This function, in turn, can do whatever it likes -- including return a login page or a redirect.

Neil Schemenauer showed off the format_publish_error function new to Quixote 2.

And, finally, Michael Watkins disagreed with my simple namespace reorganization suggestion and said that he used per-function access control because it was more flexible. (Ick. IMO. ;) He also points out that the requirement that each function be explicitly exported via _q_exports makes it less likely that you'll leave a function publicly accessible by mistake.

Charles Brandt also posted an SQLObject-based sessions implementation for Quixote; it's based on my PostgreSQL example. Both are available on the Quixote wiki. Godoy echoed many a previous comment and asked that some sort of SQL-based session-persistence be added to the default distro. It might be nice to build a catch-all library that can be grabbed by people who want some pre-packaged Quixote functionality. Hmm... Maybe tomorrow ;).

Man, though, that mailing list is friendly! Waaaaaay too much useful info flowing around...

Darcs

Started playing with Darcs a few days ago, and am setting up a few repositories. Very interesting; I can't quite fit my CVS/SVN knowledge into the Darcs mold yet, but I suspect I will be quite happy with Darcs for my own projects.

More anon.

--titus

2 Feb 2005 (updated 2 Feb 2005 at 04:51 UTC)
Venting.

Yesterday we hired someone to run our Beowulf cluster, Web servers, database server, and Web sites. He seems like a nice, smart, enthusiastic person and apparently came highly recommended. He's replacing someone who knows our system inside & out and has been working for us for several years.

One catch (or is it five catches?): he has

  • no experience with database administration; he's used Access and SQL Server, but never adminned one.

  • no experience with Linux sysadminning.

  • no experience with Python (all of our code is written in Python).

  • no experience with database-backed Web programming, although he has written CGI scripts in Perl.

He also has no biology background. Since this was the reason given to me for firing our old admin (who came to the job with all of the above computer skills, just no biology), clearly I was lied to about that.

Oh, and I'm the only person in the lab who currently has more than two of the above 5 skills. And I'm hoping to defend & leave soon. I certainly don't want to support our computer system or train someone new.

It is unclear to me what is really going on, but I have four hypotheses. (My fifth hypothesis is that it's all a big April Fool's joke, but I asked someone else & was reassured. So either they got the whole lab involved or...)

My hypotheses are:

  1. He's really, really cheap. (Doubtful - I don't think we were paying the last guy that much more than me, and I'm a graduate student.)
  2. They just wanted a new person and didn't think about it at all.
  3. They wanted a new person before I left, and wanted me to train him, and wanted to maneuver me into this situation.
  4. They just wanted a new person, and thought this guy fit the bill perfectly.

I think the last one is the most frightening for the future of the lab, because it implies active cluelessness.

Oh, and I also heard that my software (90% of our current system) was "idiosyncratic". Well, yes, it is. Unfortunately I don't think they were talking about my particular software development choices; I think they meant "it's not shrinkwrapped, so it's weird and unsupportable".

I guess I should feel glad that my advisor is helping to push me out the door by making the lab an unpleasant place to be.

--titus

p.s. ranting over. My apologies.

p.p.s. one more thing, actually. I wasn't consulted on the firing or the hiring, and upon asking about why we hired someone with no experience, I was told that he had lots of experience and I didn't know what I was talking about. ERGHHHHHH.

30 Jan 2005 (updated 30 Jan 2005 at 09:51 UTC)
Quixote issues

Michelle Levesque built a Quixote app as part of the PyWebOff comparison of Python Web frameworks. One of her last complaints caught my eye. Essentially she couldn't figure out how to do access control the way she wanted.

The two complaints were that

  • (a) an AccessError exception (e.g. as raised by _q_access) couldn't easily be used to redirect/return a login page, and
  • (b) every page has to check permissions explicitly.

Since _q_access is called before every page, it's the right way to check permissions at the namespace level. The two problems can thus be solved in tandem.

First of all, organize the application so that the restricted areas are in a different namespace, e.g.

/             -- contains /login, welcome page, etc.
/restricted/  -- contains restricted pages

Then write a _q_access function in the 'restricted' module that raises a specific exception -- either a subclass of AccessError, or not, doesn't matter. In an application-specific publisher class, catch & handle this exception:

class MyPublisher(SessionPublisher):
    ...

    def try_publish(self, request, path):
        try:
            return SessionPublisher.try_publish(self, request, path)
        except NotLoggedIn:
            # raised by _q_access in the 'restricted' module
            return "you should log in"

In place of the "you should log in", you can return a redirect (which is what I would recommend) or else print out a page with the appropriate login form.
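To make the namespace idea concrete without needing Quixote installed, here is a toy model of the traversal. It is not Quixote's real publisher, just a sketch of the pattern. `Root`, `Restricted`, `traverse` and the dict-as-request are all invented for illustration; only the `_q_access` and `_q_exports` names come from Quixote.

```python
class NotLoggedIn(Exception):
    """Raised by _q_access when the request carries no authenticated user."""


class Restricted:
    """Namespace for /restricted/; only names in _q_exports are published."""
    _q_exports = ["secret"]

    def _q_access(self, request):
        # One check guards every page below this namespace.
        if not request.get("user"):
            raise NotLoggedIn()

    def secret(self, request):
        return "secret page for %s" % request["user"]

    def helper(self, request):
        # Not listed in _q_exports, so never reachable by URL.
        return "internal"


class Root:
    """Namespace for /; holds the login page and the restricted area."""
    _q_exports = ["login", "restricted"]
    restricted = Restricted()

    def login(self, request):
        return "login form"


def traverse(namespace, request, path):
    """Walk the path, running each namespace's _q_access before descending."""
    obj = namespace
    for name in path.strip("/").split("/"):
        access = getattr(obj, "_q_access", None)
        if access is not None:
            access(request)                      # may raise NotLoggedIn
        if name not in getattr(obj, "_q_exports", []):
            return "404 not found"
        obj = getattr(obj, name)
    if not callable(obj):
        return "404 not found"
    return obj(request)


def try_publish(request, path):
    """The single place where NotLoggedIn becomes a response."""
    try:
        return traverse(Root(), request, path)
    except NotLoggedIn:
        return "redirect to /login"
```

With this layout, `try_publish({}, "/restricted/secret")` comes back as the login redirect, while the same path with `{"user": "titus"}` reaches the page; no individual page function checks permissions itself.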

I admit this is neither the most intuitive nor the most obvious solution in the world if you're not familiar with Quixote, but it makes sense to me ;).

One thing that Michelle may have missed (and maybe it needs to be highlighted in the Quixote documentation or something) is that Quixote is all about namespaces. Organize things hierarchically -- either by object or by module -- and your Quixote apps will flow.

That is all.

--titus

WSGI middleware: a simple commenting system

I finally sat down and implemented some simple commenting software today. I wanted something like the old ACS commenting system (now static) but simpler: in particular, I didn't want to have to use an SQL database or the ACS. (I have a co-loc server that only runs Apache/Python, and I don't want the security headaches of keeping up-to-date with anything else.)

Today I realized that I could probably implement this pretty easily in two WSGI components: one (app) component to serve static files, and one (middleware) component to wrap particular path elements in comments.
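The shape of that split, roughly: an inner app that only knows how to return a page, and a wrapper that decorates responses under one path prefix. The sketch below is a simplified Python 3 illustration and not the posted wsgiComment code. `static_app`, `CommentMiddleware` and the in-memory `comments` dict are assumptions, and it ignores details such as the iterable's close() method and the write() callable.

```python
import html


def static_app(environ, start_response):
    """Stand-in for the file-serving app: serves canned pages from a dict."""
    pages = {"/articles/one": b"<p>Article one.</p>"}
    body = pages.get(environ.get("PATH_INFO", ""))
    if body is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found"]
    start_response("200 OK", [("Content-Type", "text/html")])
    return [body]


class CommentMiddleware:
    """Appends stored comments to successful responses under a path prefix."""

    def __init__(self, app, prefix, comments):
        self.app = app
        self.prefix = prefix
        self.comments = comments      # dict: path -> list of comment strings

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        if not path.startswith(self.prefix):
            # Outside the commented area: pass through untouched.
            return self.app(environ, start_response)

        captured = {}

        def capture(status, headers, exc_info=None):
            captured["status"], captured["headers"] = status, headers

        body = b"".join(self.app(environ, capture))
        if captured["status"].startswith("200"):
            for text in self.comments.get(path, []):
                escaped = html.escape(text).encode("utf-8")
                body += b"<blockquote>" + escaped + b"</blockquote>"

        # Replace any Content-Length, since the body may have grown.
        headers = [(k, v) for k, v in captured["headers"]
                   if k.lower() != "content-length"]
        headers.append(("Content-Length", str(len(body))))
        start_response(captured["status"], headers)
        return [body]
```

Because the wrapper only sees a WSGI callable, the inner app could be swapped for anything else that serves pages, which is presumably what makes a component like this fairly generic.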

Ta-da! http://www.idyll.org/~t/articles.cgi/, with source code here. It's only five files (if you count the CGI-WSGI server) and I think the wsgiComment class is fairly generic.

I guess now I should post some articles, ehh? ;)

--titus

wspace, I don't really understand the ranking system, even though I've read raph's posts. (Also, when I go to recentlog.html w/o cookies, I see the posts just fine.) I don't think this stuff affects RSS, though, because my diary entries show up on planetpython.org.

sye, I'm curious -- what other virgule sites do you frequent?

Victory, not Death: Extension module code coverage

My test code for the paircomp C++ library is written in Python, and uses an extension module to access the code. I'd played around with coverage.py in other contexts (nice module, BTW), and I wanted to check coverage of my C code.

A few days ago, I tried to get gcov to work on my paircomp code. I gave up because I couldn't get Python's modules to load properly when I built a statically linked embedding of Python. I cleverly (<-- sarcasm) realized that I should perhaps look at the demo embedding that came with Python. It turns out that there was a handy little flag to ld, --export-dynamic, that I'd previously missed.

With that in place, everything worked. Woot! I can now see exactly what lines of C++ code are executed by my Python test scripts. I'll try to write up a short article for others who want to do the same thing, because it was moderately tricky; I still need to clean it up before exposing it to public view, though!

Sailing the Sargasso Sea

Patrick Logan points to an article on Craig Venter, who is out trawling the oceans for new microbes. Patrick, you're dead right: 90% of the stuff I use in the lab every day is the only mildly re-engineered product of some organism or another. Restriction enzymes, Taq/Vent, reverse transcriptase, plasmids...

The anonymous commentator misses three things when he disses on Craig:

  1. The current genome boom is directly attributable to Venter, who pushed whole-genome shotgun through a skeptical community. The approach has since been validated on numerous organisms & is probably the single most important genomic technology of the last 20 years. Certainly it's in the top 5. (He also pushed EST sequencing, which was immensely valuable.)

  2. All of the data from the Sargasso Sea was made available directly to the public. I've downloaded and used the two new Shewanella genomes myself. Good stuff. There's no way he can even begin to patent 99.9% of the stuff he found, because (a) he found an awful lot and (b) the patenting guidelines have become much stricter. His sequencing is a net gain to the community, even if he gets to skim the 0.1% of the cream off the top.

  3. Given the cost of environmental sequencing on this scale, it's arguable that only a high-profile person like Venter could raise the money to even do the feasibility study, much less carry it through. That's valuable. (Incidentally, he's certainly not the first person to do environmental sequencing.)

So he may be a high-profile jerk-off, but he's inarguably benefitting humanity, too.

People who are interested in the story of the Human Genome Project should read Sulston's A Common Thread. Fantastic book. He does hate Venter's guts, though, which seems to be a common reaction to the guy ;).

--titus

(It's just work avoidance day, I guess...)

More on Ruby

I became curious about Ruby's approach to threads after reading about Cilk, a neat-looking extension of the C language that provides a system for doing multithreaded parallel programming. (Cilk's magic is in the scheduler, it seems.)

One of the persistent nags about CPython is that the global interpreter lock prevents execution of threads on multiple processors. If Ruby handled it differently (& "better"), maybe we could swipe the idea for Python. Long story short: after a bit of investigation, I infer that the Ruby interpreter isn't thread safe [1], [2]. (FWIW, after having written several C extension modules for Python, I think the way Python handles it is very clean and simple.)

Hmm, now that I think of it, I'm a bit surprised that Dr. Dobbs' troll article didn't trigger any (blog) discussion of the GIL in Python...

Extending Python, GvR, and the benefits of dictatorship

Damien Katz wrote a great story about his experience with the Lotus Notes Formula Engine, and I just wanted to share this quote:

Now you might think that I produced a bunch of design documents and specifications and presented them to the various senior engineers and architects, but I didn't. I remember being surprised by this myself. Even Wai Ki [his boss] didn't have much to say about my design or how it should be implemented. The philosophy was that if I did those things, everyone would meddle with the design and nothing would get done. It's truly easier to ask forgiveness than to ask permission, not to mention things get done a lot faster if you just do them.

I think the application in the context of the decorator and static types kerfuffles is pretty obvious. Even if I disagree with some of GvR's decisions, it's clear that sometimes (often? always?) the vision of a strong dictator is preferable to design by the masses ;).

Still, I must have missed something: on the Amazon Web Services Blog, GvR is caught saying "It may take another generation of programmers to get over the prejudice for static typing." I can't find the reference now (google only does so much when you can't remember any !#%!#%! keywords) but there's someone syndicated on PlanetPython who said, in effect, "GvR just wants to put this stuff into Python to demonstrate that static typing doesn't matter". Hmmmmmmmmmmmmmmmmmm...

Backing up Advogato diaries

sye asks how my pull-advogato script differs from using wget to pull down the diary.

Maybe I'm missing something, but here goes:

  • One is programmatic, one is not. Since I was using this to try switching to PyBlosxom, with a slightly different post format, I wanted to do some content modification. (I removed the content modification from the version posted.)

  • One pulls down the entire diary, the other pulls down only new entries. (It can easily be changed to pull down only modified entries, but that was less useful than I thought because of the way PyBlosxom works.)

  • One requires the use of an XML parser to grok the output, the other does not.

He is, of course, 100% accurate about there being no difference for the purpose of making a backup - although I don't think advogato takes its own XML back in, per se, so you couldn't restore directly from the XML.

ORMs are Object-Relational Mappings

I was unnecessarily gnomic the other day when I was thinking aloud about PostgreSQL and cucumber. Still, someone understood me; Jacob Smullyan also uses PostgreSQL's table inheritance to underlay Python class inheritance, but he does it using PyDO ("use the latest code from CVS", he said). Someday soon I hope to meander through PyDO and SQLObject and steal any good ideas for my own code.
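For readers who missed the earlier post, the general idea can be sketched as follows: each Python class declares only the columns it adds, and its table is created with PostgreSQL's INHERITS clause pointing at the parent class's table. This is a hypothetical illustration, not cucumber or PyDO code. `Record`, `Sequence`, `Gene` and `create_table_sql` are invented names, and the sketch only generates DDL strings, it never touches a database.

```python
class Record:
    """Base class; each subclass lists only the columns it adds."""
    columns = {"id": "serial PRIMARY KEY"}

    @classmethod
    def table_name(cls):
        return cls.__name__.lower()

    @classmethod
    def create_table_sql(cls):
        # Look in cls.__dict__ so columns inherited from a parent class are
        # not repeated; PostgreSQL supplies those through INHERITS.
        own = cls.__dict__.get("columns", {})
        cols = ", ".join("%s %s" % item for item in own.items())
        sql = "CREATE TABLE %s (%s)" % (cls.table_name(), cols)
        parent = cls.__bases__[0]          # single inheritance assumed
        if parent is not object:
            sql += " INHERITS (%s)" % parent.table_name()
        return sql + ";"


class Sequence(Record):
    columns = {"name": "text"}


class Gene(Sequence):
    columns = {"symbol": "text"}
```

One caveat worth knowing before leaning on this: PostgreSQL child tables inherit columns and check constraints, but not primary key or unique constraints, so the `id` key in the parent would have to be re-declared per table if uniqueness matters.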

"Testing Darwin"

Discover magazine just published a great article on Avida. Good stuff!

--titus

p.s. mirwin -- thanks!

