Thanks to everyone who's sent congratulations my way. It's
making me feel a lot more able to handle what sometimes
seems like a very daunting job.
Today was a relaxing family day (as are most Sundays). I
spent an hour this afternoon helping Alan put together an
ant farm. We don't have the ants yet, which is ironic
considering that this place is swarming with ants, but
they're of the "too small" persuasion.
Now that I'm focussing my time on Ghostscript, I have to
confront the fact that I won't have the time to pursue all
my various research projects. It never was the case that I
did have the time, but in the past I've been in successful
denial about it :)
See, one of my problems (among many) is that I have a lot of
ideas. Not all of them are good ideas, but I think I have a
somewhat better than average track record. In any case, the
rate at which I get ideas far outstrips my ability to
actually put them into practice.
One way to deal with this is to choose carefully which ideas
are worth the time to realize. I had a deep conversation
with mjs about this a week or so ago, when
we talked about the various styles of projects and which
ones are likely to bring happiness. Different people find
project happiness in different places, hallelujah! One of
the most satisfying areas for me is 2D graphics stuff, for a
variety of reasons (I'm tempted to expand on this but will
try to stick to my main point).
But not all of the ideas I have are for 2D graphics. My PhD
thesis work (which led to Advogato) is one notable
counterexample. I've been doing a fair amount of thinking
over the months about how to adapt it for different
applications. The most intriguing such application (to me,
anyway) is a content selection system for file sharing
networks, say for MP3's.
I'm not going to go into the technical details here, but I
will outline how it looks from the outside. The simplest way
to implement it is as an Advogato-like website. You go there
and register for an account. Then, you identify other people
on the site who you think might like the same kinds of songs
that you do. You create certificates for them, just like on
Advogato.
At this point, the system is able to give you
recommendations. You download the recommended songs and
listen to them. If you like the recommendations, cool. If
not, you put in your own ratings and tell the system what
you really think. The rating could be simple, for example a
1-10 scalar (more elaborate rating systems are certainly
imaginable as well).
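
To make the outside view concrete, here is a minimal sketch of the data involved: who has certified whom, who has rated what, and a recommendation list derived from the two. It is only an illustration. The names (`certs`, `ratings`, `recommend`, `min_score`) are made up, and the scoring is a plain one-hop average over the people you certified, not the flow-based computation, whose details are left out of this entry.

```python
from collections import defaultdict

def recommend(user, certs, ratings, min_score=7.0):
    """Return (song, mean_score) pairs drawn from peers that `user` certified.

    certs:   dict mapping a user to the set of users they certified
    ratings: dict mapping a user to {song: score on a 1-10 scale}
    """
    already_rated = ratings.get(user, {})
    scores_by_song = defaultdict(list)
    for peer in certs.get(user, ()):
        for song, score in ratings.get(peer, {}).items():
            if song not in already_rated:      # only suggest unheard songs
                scores_by_song[song].append(score)
    ranked = [(song, sum(s) / len(s)) for song, s in scores_by_song.items()]
    ranked = [(song, mean) for song, mean in ranked if mean >= min_score]
    # Best score first; ties broken by song name for a stable order.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical data: alice certifies bob and carol.
certs = {"alice": {"bob", "carol"}}
ratings = {
    "alice": {"song-a": 9},
    "bob": {"song-a": 8, "song-b": 9, "song-c": 3},
    "carol": {"song-b": 7, "song-d": 10},
}
print(recommend("alice", certs, ratings))
```

If alice dislikes a recommendation, she adds her own entry under `ratings["alice"]`, and that song drops out of her list the next time around.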
So far, this doesn't sound too different from a lot of
rating systems. There are a few interesting properties that
I believe are unique, however.
1. Capacity-constrained flow is what makes Advogato
resistant to attack. In the case of Advogato, this means not
accepting large numbers of people who are not free software
developers. In the case of MP3 selection, it means
filtering out spam, buggily encoded tracks, and adulterated
tracks put in by people trying to break the system.
2. Most content selection schemes suffer from the "top 40"
effect, ie the most popular items are the only ones that
tend to show up. Flow, I think, can be used to help find
diamonds in the rough, something that existing content
selection systems do very poorly imho.
3. A web site is only one way to implement this idea, and
perhaps not the most interesting. The flow algorithm I have
in mind uses entirely local computation and communication
(ie, each node only has to talk to its neighbors in the cert
graph). Thus, it should be possible to do a completely
distributed implementation.
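
The attack resistance in point 1 can be illustrated with a small, centralized sketch of capacity-constrained flow. It is loosely modelled on the Advogato trust metric: each account gets a capacity that shrinks with its distance from a trusted seed, it keeps one unit of flow for itself, and it can pass on at most the rest. The capacities, account names, and the function `accepted_nodes` are all assumptions made for the example, and this is an ordinary max-flow computation, not the local, distributed algorithm mentioned in point 3.

```python
from collections import defaultdict, deque

INF = float("inf")

def accepted_nodes(certs, seed, caps_by_distance):
    """Return the accounts that receive a unit of flow from `seed`.

    certs:            dict mapping an account to the accounts it certified
    caps_by_distance: capacity for accounts at distance 0, 1, 2, ... from
                      the seed; accounts further away get capacity 0
    """
    # Breadth-first distances from the seed over the cert graph.
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for peer in certs.get(node, ()):
            if peer not in dist:
                dist[peer] = dist[node] + 1
                queue.append(peer)

    # Split each account into an "in" and an "out" vertex. One unit may go
    # from "in" to the sink (the account is accepted); cap - 1 units may
    # continue to "out" and on to the accounts it certified.
    sink = ("sink",)
    residual = defaultdict(lambda: defaultdict(int))
    for node, d in dist.items():
        cap = caps_by_distance[d] if d < len(caps_by_distance) else 0
        if cap < 1:
            continue
        residual[(node, "in")][sink] = 1
        residual[(node, "in")][(node, "out")] = cap - 1
        for peer in certs.get(node, ()):
            residual[(node, "out")][(peer, "in")] = INF

    # Max flow by shortest augmenting paths. Every path ends on a unit
    # edge into the sink, so each augmentation carries exactly one unit.
    source = (seed, "in")
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= 1
            residual[v][u] += 1
            v = u

    # Flow on an in -> sink edge shows up as residual on the reverse edge.
    return {node for node in dist if residual[sink][(node, "in")] > 0}

# Hypothetical attack: alice mistakenly certifies mallory, who then
# certifies fifty fake accounts.
certs = {
    "seed": ["alice", "bob"],
    "alice": ["mallory"],
    "mallory": [f"fake{i}" for i in range(50)],
}
accepted = accepted_nodes(certs, "seed", [8, 3, 2, 1])
print(sorted(accepted))
```

With these made-up capacities, mallory can pass on only one unit of flow, so only one of the fifty fake accounts is accepted no matter how many she creates. The damage is bounded by the capacity of the one bad cert, not by the size of the attack.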
I believe that by far the most interesting file sharing
network to use in conjunction with this idea is Mojo Nation.
For one, Mojo Nation files are reasonably persistent and
have unique id's. This allows you to associate a ranking
with a song and have a reasonable chance of downloading that
song when you want. With Napster and Gnutella, the songs
available at any given time fluctuate depending on who's
actually plugged into the network.
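
A small sketch of why persistent unique id's matter: if a rating is keyed by an id that is stable for a given file, the rating stays attached to that file regardless of who happens to be serving it. The entry doesn't say how Mojo Nation derives its id's, so the SHA-1 content hash below is only a stand-in, and `content_id`, `rate`, and `mean_rating` are hypothetical names.

```python
import hashlib

def content_id(data: bytes) -> str:
    """Stand-in for a persistent unique id: a hash of the file's bytes."""
    return hashlib.sha1(data).hexdigest()

# Ratings are stored against the id, not against a filename or a host.
ratings_by_id = {}

def rate(data: bytes, score: int) -> None:
    ratings_by_id.setdefault(content_id(data), []).append(score)

def mean_rating(data: bytes):
    """Mean of all ratings for this exact file, or None if it has none."""
    scores = ratings_by_id.get(content_id(data))
    return sum(scores) / len(scores) if scores else None

# Two people rate the same bytes, possibly fetched from different peers.
song = b"pretend these are the bytes of an MP3"
rate(song, 8)
rate(song, 6)
print(mean_rating(song))
```

A filename-based key would not have this property, since the same name can refer to different encodings of a song, or to something else entirely.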
In addition, Mojo Nation gives you a fair amount of the
infrastructure you'd need to do a completely distributed
implementation. Now that there are serious legal challenges
to links, having the content selection system completely
distributed could turn out to be very important.
In case it isn't obvious by now, I'd like a collaborator,
someone who can take the idea (in the form of a pretty good
writeup) and make a real implementation. In return for your
work, you'd get a pretty good understanding of trust
networks, a good chance at being involved with a high-impact
project, and coauthorship on a paper, if that sort of thing
moves you. In fact, this would get you an Erdos number of 5
:)
Assuming this flow-based system works well for MP3
selection, I believe it can generalize to spam-resistant
messaging (ie email). That is a somewhat harder problem,
largely because you want to try really hard not to drop
deliverable mail on the floor. For MP3's, simply having most
of the good recommendations come through is good enough.
I'm trying this form of collaboration in one other area
right now - dithering for inkjet printers. I've given Thomas
Tonino of the Gimp Print project some of my code and notes
for making dither matrices, and so far the results seem
promising. It seems worth trying, in any case.