Older blog entries for titus (starting at number 51)

Reputation and certification systems

I'm on the hunt for reputation/certification systems, a la advogato/mod_virgule, slashdot, and kuro5hin. I found a simple alternet story, too. And someone somewhere mentioned Aura, which I have yet to read about.

I'm particularly interested in hearing about persistent rep/cert systems that (in your opinion) tend to accurately reflect people's rep. In other words I'd like to find out about ones you find effective ;). I'm most interested in hearing about communities with a small core, not too much turnover in that core, but a lot of "fringe participators".

E-mail me, please! I'll summarise and acknowledge, yada yada. TIA!

PostgreSQL and IBM's patent application for ARC

Haven't seen this show up anywhere else yet; it's a GeneralBits discussion of the patent issues surrounding the "Adaptive Replacement Cache". The ARC algorithm is in PostgreSQL 8.0, but IBM has applied for a patent. What to do? Read on...


Three HST items: writing a Taschen book, surprised he's still alive, and Breakfast with Hunter. The movie looks fantastic...

Old Software

dominion. Didn't know this was still lying around! That's my first collaborative code project... And hey, here's my first publicly posted software project, 'ring'. Sheesh, June 1991...



In an almost completely off-topic response to Graham Fawcett on the quixote-users mailing list, I suggested using memcached to store per-session information in memory. Then I got curious, and wrote a quick example of using this as a persistence store for Quixote user/session information. The result is under my projects page as qx-memcache-example. Kinda neat -- sessions are "persistent" as long as memcached is running ;).
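A minimal sketch of the idea, not the actual qx-memcache-example code: session data is pickled and stored under a per-session key. The names `MemcacheSessionStore` and `FakeMemcacheClient` are hypothetical; the fake client is an in-memory stand-in so the example runs without a server, and the assumption is that a real memcached client exposing `get`/`set`/`delete` could be passed in instead.

```python
import pickle


class FakeMemcacheClient:
    """In-memory stand-in for a memcached client (hypothetical, demo only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value
        return True

    def delete(self, key):
        self._data.pop(key, None)
        return True


class MemcacheSessionStore:
    """Keeps per-session data in the cache, keyed by session id."""

    def __init__(self, client, prefix="session:"):
        self.client = client
        self.prefix = prefix

    def save(self, session_id, data):
        # Pickle explicitly so the client only ever sees bytes.
        self.client.set(self.prefix + session_id, pickle.dumps(data))

    def load(self, session_id):
        raw = self.client.get(self.prefix + session_id)
        return None if raw is None else pickle.loads(raw)

    def drop(self, session_id):
        self.client.delete(self.prefix + session_id)


store = MemcacheSessionStore(FakeMemcacheClient())
store.save("abc123", {"user": "titus"})
print(store.load("abc123"))
```

As with the real thing, the data only lives as long as the cache does: restart the cache and every session is gone.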

I heard about memcached, incidentally, through reading advogato. I think one of the maintainers is on here. Can't find it, though, because my google-fu has failed me.

On Ayn Rand

Via Educated Guesswork, Nora Ephron on Ayn Rand:

Like most of my contemporaries, I first read The Fountainhead when I was 18 years old. I loved it. I too missed the point. I thought it was a book about a strong-willed architect...and his love life. I deliberately skipped over all the passages about egoism and altruism. And I spent the next year hoping I would meet a gaunt, orange-haired architect who would rape me. Or failing that, an architect who would rape me. Or failing that, an architect. I am certain that The Fountainhead did a great deal more for architects than Architectural Forum ever dreamed. Nora Ephron, The New York Times Book Review (1968)


8 Feb 2005 (updated 8 Feb 2005 at 20:22 UTC) »
On multiple projects

I have a serious projects problem. Not only do I have my academic "bread & butter" projects (FamilyRelations/Cartwheel, paircomp, and motility), but my academic side projects ("bogs", a rewrite & extension of the binding site software we published earlier), my old academic projects (at this point limited to semi-supporting old script hacking I did over the last few years so that we can publish the work), some consulting work (collar), my simple OR mapping software (cucumber), a bunch of Python bug/patch fixups, the various random WSGI stuff I've been talking about here, and the support code for various projects (things like PBP hacking & code coverage analysis tools), but I've got a few other things planned, too. Like khmer, a genome-scale k-mer analysis toolkit, and a project I plan to call "annotated python" that I hope will help me explore some ideas about certification & distributed annotation that I plan to apply to genome annotation some day soon.

This may be the real reason why many OSS projects never reach 1.0, Ned... too many projects per person!

Some of these projects are essentially my version of thinking out loud. But all too many of them are being used by other people and need to be supported; all of them (with the exclusion of my old academic scripts and some of the WSGI stuff) are being actively used by me in my research. Waste not, want not...

It's extraordinarily advantageous (for me) to be able to take a break from one project and attack a different set of problems every so often. Moreover, often I can get a good angle of attack on one problem by solving another one. For example, my need to write a flexible conference submission system (collar) led to the design/implementation of my OR mapping (cucumber) which in turn solved some of the problems I was having with my bioinformatics system (Cartwheel). Another example is paircomp & motility: I needed a C++ library to load/manipulate comparative sequence analyses, and a C++ library to search for flexibly-defined sequence motifs. Initially they were used in FamilyRelationsII, my FLTK-based visualization GUI, but they then became very handy for some as-yet unpublished work in comparative sequence analysis & the motif analysis that's part of bogs.

My belatedly defined New Year's goal is to bring paircomp and motility up to 1.0 this year. Both of them are C++/Python toolkits being actively used by at least one group external to my lab, and I'm pretty happy with the APIs. Just gotta get some serious testing into motility.

Oh, and I want to graduate. Soon. No snarky comments about time allocation, please -- it's too easy to make fun of me ;).


Daily Python-URL no longer feeding to PlanetPython?

This no longer shows up on planetpython, it seems. Intentional?

MythTV, Freevo, and Python

Mentions of free/OSS PVRs showed up in two high-profile places recently. First there was a NY Times article on MythTV, and then a linuxdevcenter article on Freevo showed up on slashdot.

We've been using MythTV for about a year now; it took me a long time to set up, but once I figured out how to configure the Linux kernel appropriately and then found Debian packages for the rest, it was pretty easy. It's completely changed our TV watching habits (albeit not necessarily for the better...) and I use its jukebox & DVD ripping interfaces a fair bit as well. It is truly a fantastic program.

Freevo, however, is less good -- and (from what I can tell) for only one reason: lack of integrated live TV watching/recording. MythTV has a recording daemon that streams live TV from the tuner card directly to a client via UDP. This is both flexible -- it underlies the entire system for TV watching, because you can do the same thing with recorded shows as well -- and nicely client-server. (My iBook has a Myth client on it, so I can watch live or recorded TV via 802.11g.) If Freevo had this feature, I'd switch in a heartbeat. They seem to be aware of this lack, to their credit, but it's probably a reasonably-sized job to implement it.

The Freevo team had some nice things to say about Python:

"The language is one of the best collaborative languages I have ever used. I wonder if we could have reached the point we have without the short learning curve and power of Python and its related libraries."

...but also talked a bit about speed considerations:

"Sometimes Python is too slow for the needed task. Most of the time we can avoid such problems by rethinking the design," says Dirk Meyer, the Freevo project's 28-year-old lead developer from Bremen, Germany.

I'd be interested in hearing more about this, because my usual solution is to recode in C ;).

A friend and I have been batting around the idea of coding up a Web services API for some of the Myth functionality. GUIs can only take you so far... We'll see. It's not like I haven't got enough on my plate.



Mark Rees posts about an IIS server interface for WSGI. I can't test this, because I don't run IIS at all, but it would be interesting to see if my simple Quixote-WSGI adapter works under it. I don't even know how people run Quixote on Windows, to be honest; I guess SCGI should work with Apache on Windows, right?

So, I asked Mark to try it out, and it turns out that the QWIP adapter successfully runs Quixote under a slightly modified version of his ISAPI. Very cool. I hadn't thought about IIS integration being a real raison d'etre for WSGI, but there it is...

Mark also politely informed me that my wsgiServeFiles class didn't return proper status codes; I was returning int(200) rather than "200 OK". I just fixed that; the fix is available via Darcs or in a nice tarball on my Darcs page.
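For reference, a WSGI application passes the status to start_response as a string containing the code and reason phrase. The sketch below is not the wsgiServeFiles class; `serve_text` and `call_app` are hypothetical names used only to show the shape of the call.

```python
def serve_text(environ, start_response):
    """Tiny WSGI app: the status is a string like "200 OK", not an int."""
    body = b"hello\n"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]


def call_app(app):
    """Call a WSGI app with a throwaway environ and record what it reports."""
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["status"] = status
        seen["headers"] = headers

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": "/"}
    body = b"".join(app(environ, start_response))
    return seen["status"], body


status, body = call_app(serve_text)
print(status)
```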

Python urllib2 buggishness

So John Lee ran across my old post asking about new (RFC 2965) style cookies, and answered thusly: mailman correctly uses RFC 2965 cookies, but does so unnecessarily because no one is really paying any attention to them. However, he did say that it's a bug for urllib2 to not correctly handle things that the browsers handle. The fix is to change urllib2 to handle RFC 2965 cookies by default, I guess. I sent him a concise program that demonstrated the issue.
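Until the default changes, RFC 2965 handling can be switched on explicitly through the cookie policy. This is a sketch using the current Python 3 module names (`http.cookiejar` and `urllib.request`); at the time of writing the same classes lived in `cookielib` and `urllib2`. It is not the demonstration program mentioned above, and it makes no network request.

```python
import http.cookiejar
import urllib.request

# RFC 2965 cookie handling is off by default; opt in via the policy.
policy = http.cookiejar.DefaultCookiePolicy(rfc2965=True)
jar = http.cookiejar.CookieJar(policy)

# Requests made through this opener would store and send cookies via `jar`.
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
```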

Political diversions

Sometimes I come across things that make me really proud to be an American. Then there's the other stuff we do.


Darc thoughts

Continued playing around with my Darcs repository. Now it's automatically producing nightly tarballs via the 'darcs dist' command.

I'm still having trouble planning out my actual Darcs repository usage. Since each Darcs working copy is a full repository in its own right, it seems like overkill to do what I do with CVS -- run various versions out of their own working directories, with tags for each set of configuration files. Still, this is what I'm planning (suitably translated into Darcs reality).

Right now I'm envisioning a layout where the repository dependencies look something like this:

                                 ----> installed site1
master (stable) --> working branch --> installed site2
              ----> devel branch   --> devel site
                                 ----> demo/test site

It's pretty easy to scale this to multiple developers (although at the moment that's not necessary for any of my actual big projects). You can just branch more repositories off of the 'devel'/'working' branches.

The cool bit is that I can do things like patch bugs that are in common between my working & devel branches, and then 'push' and 'pull' them up and down the tree above. It seems much easier to do this in Darcs than it has been in CVS. I hope reality matches perception!

The only snafu so far has been that Darcs is a major pain in the butt to install on non-package-managed machines. I got it running on Debian and FreeBSD with no problems, but my old Redhat machine (where I do all my development) and my iBook (which I live on) are proving more problematic. Maybe I'll write more about what a !#%!#@$! annoying piece of software the Glasgow Haskell Compiler ('ghc') is later...

Quixote licensing

David Binger sez:

I'm pleased to announce that Quixote 2.0 will be released with a 
GPL-compatible license.

That's a Good Thing in my book.


It's official: Basecamp is extraordinarily cool. You may recall that Basecamp is written in Ruby using Ruby on Rails. Well, if Kevin Kelly's Cool Tools says it's fantastic, then dangitall it is. (Cool Tools is my go-to place for gifts of all kinds for all ages. Fantastic site.)

My sis-in-law is looking into Basecamp for use in her company. I'll be interested to see how it works out for her.


4 Feb 2005 (updated 4 Feb 2005 at 09:20 UTC) »
Darcs repositories

I'm moving some of my side projects into a darcs repository, at http://darcs.idyll.org/~t/projects/. Hopefully this will solve my dilemma: I want to make the source available, but dislike tarballs and pservers. This way they're even in a version-controlled format, and people can easily send me patches. Hooray!

Right now only my PostgreSQL session stuff for Quixote & my two simple WSGI projects are posted.

I still like Darcs just as much as I did yesterday ;).


p.s. Ian, I do like your idea of a WSGI reference library. Just didn't have anything to add other than "me too!" I know, you didn't expect restraint from me on a mailing list... sorry ;).

p.p.s. cdfrey, I agree that Web programming is nasty, brutish, and way too time consuming. I never did like mucking around with strings, even once I found languages other than C. Still, it's a necessary evil these days. I think the trick is to find a language & framework that fits your needs and your style, and then build on that.

More Tilting at Windmills

I posted my suggestions regarding the Quixote implementation in the PyWebOff to the quixote-users mailing list, and got a few useful replies.

Eric Floehr pointed out that you could throw PublishError exceptions, which trigger the first available _q_exception_handler function. This function, in turn, can do whatever it likes -- including return a login page or a redirect.

Neil Schemenauer showed off the format_publish_error function new to Quixote 2.

And, finally, Michael Watkins disagreed with my simple namespace reorganization suggestion and said that he used per-function access control because it was more flexible. (Ick. IMO. ;) He also pointed out that the requirement that each function be explicitly exported via _q_exports makes it less likely that you'll leave a function publicly accessible by mistake.

Charles Brandt also posted an SQLObject-based sessions implementation for Quixote; it's based on my PostgreSQL example. Both are available on the Quixote wiki. Godoy echoed many a previous comment and asked that some sort of SQL-based session-persistence be added to the default distro. It might be nice to build a catch-all library that can be grabbed by people who want some pre-packaged Quixote functionality. Hmm... Maybe tomorrow ;).

Man, though, that mailing list is friendly! Waaaaaay too much useful info flowing around...


Started playing with Darcs a few days ago, and am setting up a few repositories. Very interesting; I can't quite fit my CVS/SVN knowledge into the Darcs mold yet, but I suspect I will be quite happy with Darcs for my own projects.

More anon.


2 Feb 2005 (updated 2 Feb 2005 at 04:51 UTC) »

Yesterday we hired someone to run our Beowulf cluster, Web servers, database server, and Web sites. He seems like a nice, smart, enthusiastic person and apparently came highly recommended. He's replacing someone who knows our system inside & out and has been working for us for several years.

One catch (or is it five catches?): he has

  • no experience with database administration; he's used Access and SQL Server, but never adminned one.

  • no experience with Linux sysadminning.

  • no experience with Python (all of our code is written in Python).

  • no experience with database-backed Web programming, although he has written CGI scripts in Perl.

He also has no biology background. Since this was the reason given to me for firing our old admin (who came to the job with all of the above computer skills, just no biology), clearly I was lied to about that.

Oh, and I'm the only person in the lab who currently has more than two of the above 5 skills. And I'm hoping to defend & leave soon. I certainly don't want to support our computer system or train someone new.

It is unclear to me what is really going on, but I have four hypotheses. (My fifth hypothesis is that it's all a big April Fool's joke, but I asked someone else & was reassured. So either they got the whole lab involved or...)

My hypotheses are:

  1. He's really, really cheap. (Doubtful - I don't think we were paying the last guy that much more than me, and I'm a graduate student.)
  2. They just wanted a new person and didn't think about it at all.
  3. They wanted a new person before I left, and wanted me to train him, and wanted to maneuver me into this situation.
  4. They just wanted a new person, and thought this guy fit the bill perfectly.

I think the last one is the most frightening for the future of the lab, because it implies active cluelessness.

Oh, and I also heard that my software (90% of our current system) was "idiosyncratic". Well, yes, it is. Unfortunately I don't think they were talking about my particular software development choices; I think they meant "it's not shrinkwrapped, so it's weird and unsupportable".

I guess I should feel glad that my advisor is helping to push me out the door by making the lab an unpleasant place to be.


p.s. ranting over. My apologies.

p.p.s. one more thing, actually. I wasn't consulted on the firing or the hiring, and upon asking about why we hired someone with no experience, I was told that he had lots of experience and I didn't know what I was talking about. ERGHHHHHH.

30 Jan 2005 (updated 30 Jan 2005 at 09:51 UTC) »
Quixote issues

Michelle Levesque built a Quixote app as part of the PyWebOff comparison of Python Web frameworks. One of her last complaints caught my eye. Essentially she couldn't figure out how to do access control the way she wanted.

The two complaints were that

  • (a) an AccessError exception (e.g. as raised by _q_access) couldn't easily be used to redirect/return a login page, and
  • (b) every page has to check permissions explicitly.

Since _q_access is called before every page, it's the right way to check permissions at the namespace level. The two problems can thus be solved in tandem.

First of all, organize the application so that the restricted areas are in a different namespace, e.g.

/             -- contains /login, welcome page, etc.
/restricted/  -- contains restricted pages

Then write a _q_access function in the 'restricted' module that raises a specific exception -- either a subclass of AccessError, or not, doesn't matter. In an application-specific publisher class, catch & handle this exception:

class MyPublisher(SessionPublisher):

    def try_publish(self, request, path):
        try:
            return SessionPublisher.try_publish(self, request, path)
        except NotLoggedIn:
            # raised by _q_access in the 'restricted' namespace
            return "you should log in"

In place of the "you should log in", you can return a redirect (which is what I would recommend) or else print out a page with the appropriate login form.
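To make the flow concrete without needing Quixote installed, here is a toy model of the same pattern. `ToyPublisher` and `Restricted` are hypothetical stand-ins, not Quixote classes, and the dispatch is far simpler than what Quixote really does; the only point is that the namespace-level check raises, and the publisher subclass turns that into a login redirect.

```python
class NotLoggedIn(Exception):
    """Raised by the restricted namespace when no user is logged in."""


class Restricted:
    """Stand-in for the 'restricted' module/namespace."""

    def _q_access(self, request):
        # Namespace-level check: runs before any page in this namespace.
        if not request.get("user"):
            raise NotLoggedIn()

    def secret(self, request):
        return "secret page for %s" % request["user"]


class ToyPublisher:
    """Toy dispatcher: runs the namespace's access check, then the page."""

    def __init__(self, namespace):
        self.namespace = namespace

    def try_publish(self, request, path):
        self.namespace._q_access(request)
        return getattr(self.namespace, path)(request)


class MyPublisher(ToyPublisher):
    def try_publish(self, request, path):
        try:
            return ToyPublisher.try_publish(self, request, path)
        except NotLoggedIn:
            # A real app would return an HTTP redirect or a login form here.
            return "redirect: /login"


pub = MyPublisher(Restricted())
print(pub.try_publish({}, "secret"))
print(pub.try_publish({"user": "titus"}, "secret"))
```

Individual pages never check permissions themselves; the check lives in one place per namespace.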

I admit this is neither the most intuitive nor the most obvious solution in the world if you're not familiar with Quixote, but it makes sense to me ;).

One thing that Michelle may have missed (and maybe it needs to be highlighted in the Quixote documentation or something) is that Quixote is all about namespaces. Organize things hierarchically -- either by object or by module -- and your Quixote apps will flow.

That is all.

