18 Sep 2000 raph   » (Master)

Thanks to everyone who's sent congratulations my way. It's making me feel a lot more able to handle what sometimes seems like a very daunting job.

Today was a relaxing family day (as are most Sundays). I spent an hour this afternoon helping Alan put together an ant farm. We don't have the ants yet, which is ironic considering that this place is swarming with ants, but they're of the "too small" persuasion.

Now that I'm focussing my time on Ghostscript, I have to confront the fact that I won't have the time to pursue all my various research projects. It never was the case that I did have the time, but in the past I've been in successful denial about it :)

See, one of my problems (among many) is that I have a lot of ideas. Not all of them are good ideas, but I think I have a somewhat better than average track record. In any case, the rate at which I get ideas far outstrips my ability to actually put them into practice.

One way to deal with this is to choose carefully which ideas to spend time realizing. I had a deep conversation with mjs about this a week or so ago, when we talked about the various styles of projects and which ones are likely to bring happiness. Different people find project happiness in different places, hallelujah! One of the most satisfying areas for me is 2D graphics stuff, for a variety of reasons (I'm tempted to expand on this but will try to stick to my main point).

But not all of the ideas I have are for 2D graphics. My PhD thesis work (which led to Advogato) is one notable counterexample. I've been doing a fair amount of thinking over the months about how to adapt it for different applications. The most intriguing such application (to me, anyway) is a content selection system for file sharing networks, say MP3's.

I'm not going to go into the technical details here, but I will outline how it looks from the outside. The simplest way to implement it is as an Advogato-like website. You go there and register for an account. Then, you identify other people on the site who you think might like the same kinds of songs that you do. You create certificates for them, just like on Advogato.

At this point, the system is able to give you recommendations. You download the recommended songs and listen to them. If you like the recommendations, cool. If not, you put in your own ratings and tell the system what you really think. The rating could be simple, for example a 1-10 scalar. (More elaborate rating systems are certainly imaginable as well.)
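To make the outside view concrete, here is a minimal Python sketch of the register / certify / rate / recommend loop. Everything in it is an assumption made for illustration: the class name `RecommendationSite`, the method names, and especially the scoring rule, which just averages the ratings of people you certified directly. It is a stand-in for the interface only, not the flow-based selection the post has in mind.

```python
from collections import defaultdict


class RecommendationSite:
    """Toy in-memory model of the site (all names here are hypothetical)."""

    def __init__(self):
        self.certs = defaultdict(set)     # user -> users they have certified
        self.ratings = defaultdict(dict)  # user -> {song_id: rating in 1..10}

    def certify(self, issuer, subject):
        """Record that `issuer` thinks `subject` likes similar songs."""
        self.certs[issuer].add(subject)

    def rate(self, user, song_id, rating):
        """Store a 1-10 scalar rating, replacing any earlier one."""
        if not 1 <= rating <= 10:
            raise ValueError("rating must be between 1 and 10")
        self.ratings[user][song_id] = rating

    def recommend(self, user, limit=5):
        """Rank songs the user has not rated by the average rating
        given by the user's directly certified peers (placeholder rule)."""
        already_rated = self.ratings.get(user, {})
        collected = defaultdict(list)
        for peer in self.certs.get(user, ()):
            for song_id, rating in self.ratings.get(peer, {}).items():
                if song_id not in already_rated:
                    collected[song_id].append(rating)
        scored = [(sum(r) / len(r), song_id) for song_id, r in collected.items()]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best first, ties by id
        return [song_id for _, song_id in scored[:limit]]


site = RecommendationSite()
site.certify("alice", "bob")
site.certify("alice", "carol")
site.rate("bob", "song-a", 9)
site.rate("bob", "song-c", 10)
site.rate("carol", "song-a", 7)
site.rate("carol", "song-b", 3)
site.rate("alice", "song-b", 2)   # alice already has an opinion on song-b
print(site.recommend("alice"))    # ['song-c', 'song-a']
```

When alice dislikes a recommendation, she calls `rate` herself; her rating then feeds into the recommendations of whoever has certified her.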

So far, this doesn't sound too different from a lot of rating systems. There are a few interesting properties that I believe are unique, however.

1. Capacity-constrained flow is what makes Advogato resistant to attack. In the case of Advogato, this means not accepting large numbers of people who are not free software developers. In the case of MP3 selection, it means filtering out spam, buggily encoded tracks, and adulterated tracks put in by people trying to break the system.

2. Most content selection schemes suffer from the "top 40" effect, i.e., the most popular items are the only ones that tend to show up. Flow, I think, can be used to help find diamonds in the rough, something that existing content selection systems do very poorly imho.

3. A web site is only one way to implement this idea, and perhaps not the most interesting. The flow algorithm I have in mind uses entirely local computation and communication (i.e., each node only has to talk to its neighbors in the cert graph). Thus, it should be possible to do a completely distributed implementation.
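The attack-resistance claim in point 1 can be illustrated with a small, centralized max-flow sketch. This is not the local algorithm mentioned in point 3, and it should not be read as Advogato's actual code. It only assumes the general shape of a capacity-constrained flow: each account gets a capacity that shrinks with its distance from a trusted seed, one unit of flow "accepts" an account, and the rest may be passed along its certs. The function name `accepted_nodes`, the capacity values, and the example graph are all made up.

```python
from collections import defaultdict, deque


def accepted_nodes(certs, seed, capacities):
    """Return the accounts that receive one unit of flow from `seed`.

    certs: dict mapping an account to the accounts it certifies.
    capacities: node capacity by BFS distance from the seed (assumed
    values, each >= 1); accounts further out than the list get 1.
    """
    # Distance from the seed decides how much flow a node may handle.
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        for v in certs.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)

    def cap(node):
        d = dist[node]
        return capacities[d] if d < len(capacities) else 1

    # Split every node n into ("in", n) and ("out", n): one unit can go
    # to the sink (accepting n), and cap(n) - 1 units can be passed on.
    sink = "sink"
    unlimited = len(dist) + 1  # more than the total flow can ever be
    residual = defaultdict(lambda: defaultdict(int))
    for n in dist:
        residual[("in", n)][sink] += 1
        residual[("in", n)][("out", n)] += max(cap(n) - 1, 0)
        for m in certs.get(n, ()):
            residual[("out", n)][("in", m)] += unlimited
    source = ("in", seed)

    # Edmonds-Karp: repeatedly augment along a shortest residual path.
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in list(residual[u].items()):
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        pushed = min(residual[a][b] for a, b in path)
        for a, b in path:
            residual[a][b] -= pushed
            residual[b][a] += pushed

    # A node is accepted when its edge to the sink is saturated.
    return {n for n in dist if residual[("in", n)][sink] == 0}


# bob mistakenly certifies mallory, who certifies 100 fake accounts.
certs = {
    "seed": ["alice", "bob"],
    "alice": ["carol"],
    "bob": ["mallory"],
    "mallory": [f"fake{i}" for i in range(100)],
}
accepted = accepted_nodes(certs, "seed", capacities=[8, 3, 2, 1])
bad = accepted - {"seed", "alice", "bob", "carol"}
print(len(bad))  # 2: mallory plus a single fake account
```

However many fake accounts mallory creates, only the flow that fits through bob's spare capacity reaches them, so the damage from one bad cert stays bounded. In an MP3 setting the same bound would limit how many spam raters get a say.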

I believe that by far the most interesting file sharing network to use in conjunction with this idea is Mojo Nation. For one, Mojo Nation files are reasonably persistent and have unique id's. This allows you to associate a ranking with a song and have a reasonable chance of downloading that song when you want. With Napster and Gnutella, the songs available at any given time fluctuate depending on who's actually plugged into the network.

In addition, Mojo Nation gives you a fair amount of the infrastructure you'd need to do a completely distributed implementation. Now that there are serious legal challenges to links, having the content selection system completely distributed could turn out to be very important.

In case it isn't obvious by now, I'd like a collaborator, someone who can take the idea (in the form of a pretty good writeup) and make a real implementation. In return for your work, you'd get a pretty good understanding of trust networks, a good chance at being involved with a high-impact project, and coauthorship on a paper, if that sort of thing moves you. In fact, this would get you an Erdos number of 5 :)

Assuming this flow-based system works well for MP3 selection, I believe it can generalize to spam-resistant messaging (i.e., email). That is a somewhat harder problem, largely because you want to try really hard not to drop deliverable mail on the floor. For MP3's, simply having most of the good recommendations come through is good enough.

I'm trying this form of collaboration in one other area right now - dithering for inkjet printers. I've given Thomas Tonino of the Gimp Print project some of my code and notes for making dither matrices, and so far the results seem promising. It seems worth trying, in any case.
