The recent Xbox hack provides further evidence for a widely held belief in hacker circles: real DRM is technologically impossible, at least without huge improvements in the ability to produce bug-free software. Zooko writes passionately that the bad guys may well be winning anyway.
Indeed, the important distinction here is whether freely accessing digital content is convenient or merely possible. If a DRM scheme makes access inconvenient for most people, then it has largely succeeded in its economic goals.
Technologists, and free software hackers in particular, should be careful not to underestimate the importance of convenience. Over the next few years, I think that one of the most compelling applications for free software is a media player that just works, especially without stumbling over harebrained DRM schemes. The underlying technology is mostly here now, including the ability to efficiently move bulk files around over consumer Internet connections. But unless it's easy enough for your mom to use, it won't have much impact.
Essence of XML
I got my POPL proceedings a few days ago, and found Jerome Simeon and Phil Wadler's The Essence of XML to be one of the most interesting papers. He writes:
So, the essence of XML is this: the problem it solves is not hard, and it does not solve the problem well.
In particular, the paper is about type systems for XML, which include DTD's (widely recognized as underpowered), XML Schema, and other proposals. The paper goes into XML Schema in some detail. A central observation is that, while XML Schema obviously intended to provide for unambiguous schemas, it failed to acheive this goal.
The W3C produces mediocre standards. Not so bad as to be unworkable, but certainly not crisp and beautiful either. In many ways, this is better than anarchy, because at least they are standards, and there is a ton of code out there to deal with them. Lisp S-expressions may be prettier, but there are still plenty of details you have to nail down for compatibility, including choice of charset, dealing with string quoting, and so on.
mod_virgule
By popular demand, I've started hacking up RSS export for diaries. It's a little harder than I thought it would be. There are two bits to get past to make it validate, then perhaps some impedance mismatch. One of the bits is conversion from ISO-8601 dates (which is what mod_virgule uses internally) to RFC 822. Another is conversion of relative URL's within diary entries to absolute. Both of these are SMOP's.
The impedance mismatch is that Advogato diary entries don't have a designated title field, and the "description" can be very long. Many people (myself included) follow the convention of titles in <b> tags, but it seems dangerous to rely on this for structural information.
Probably the thing to do is just export the full entry as the RSS <description> for now, and gradually move to the option of more structural markup. I've been wanting to do something similar to provide summaries of mid-rated diaries in the recentlog anyway.
Thanks to dyork's reminder, I've updated the Wiki intermap. Also, I see that Gary did some code towards displaying multiple entries from a single poster in the recentlog. Hopefully, we'll be able to get at least the minimal amount of maintenance done soon.
The last couple of weeks have been really hard on my productivity. I feel like I've been getting behind on a bunch of things, including design and coding work on Fitz, the IJS 1.0 spec, a command-line version of the trust metric, and other things.
I'm feeling a bit more productive now and hope to catch up over the next few weeks.
Proofs
During times of stress, I find it comforting to muse on proofs. The idea of mathematical certainty, is soothing to me.
Much of my thinking is directed towards a scheme for portable and modular proofs. For one, there are many different axiom systems, of various strengths. Most proof systems simply choose one. The problem with this is that proofs can be ported to a system with a stronger axiom system, but not in general to a weaker one.
Further, if you have a minimalist set of axioms (such as second order arithmetic or Zermelo-Fraenkel set theory), then you want to construct a rich library of objects (many flavors of numbers, sequences, functions, etc.) on top of it. In many cases, there will be more than one viable construction (for example, reals can be infinite binary expansions, Dedekind cuts, or the Harrison's clever HOL construction). Proofs shouldn't depend on the details of the construction used. A proof over the reals should go through exactly the same no matter which construction of the reals undergirds it.
So I've been thinking along the ideas of modules and interfaces. The axioms of complex arithmetic would be one example of an interface. A proof over complex numbers imports this interface. A module representing a construction of complex numbers would import the HOL primitive axiom interface and export the complex number interface.
Each module can be checked individually, to make sure that the exports are justified in terms of the imports. Then, you can check a whole pile of modules, by instantiating the abstract interfaces in each "import" and "export" with concrete replacements. For example, the abstract addition function is replaced with a definition appropriate to the chosen construction. The whole thing checks if each import (after instantiation) is satisfied by a matching export (again, after instantiation) from a previous module (ie, no import cycles allowed).
Thus, you could fairly easily check a proof over complex numbers in any one of the axiom schemes powerful enough to represent them. Just supply the construction appropriate to the primitive axioms.
Not all proofs will check in all axiom systems, of course. In general, a proof module should be conservative in what it imports, so that it will check in the largest range of axiom systems. This principle also ensures that proofs can be ported to the widest range of other systems.
I hope to write about these ideas in more detail, including why it's important. It's obvious to me, but other people seem to need convincing. That sounds quite a bit like Ivan Sutherland's recipe for successful research: do something you think is easy, but everybody else thinks is hard.
28 Mar 2003 (updated 28 Mar 2003 at 06:37 UTC) »
If you want the Mandrake 9.1 ISOs:
Joining the torrent is especially appreciated if you have good bandwidth and are not behind a NAT. In the early phases of the torrent, downloads will be a little slow (20kB/s), but it should pick up in a couple of hours. If you can leave your BitTorrent application open even after the download is complete, that would help even more.
Feel free to spread the word; the more people who join the torrent, the better it goes.
This can be seen as a trial run for the RedHat 9 ISO release.
Red Hat has announced that ISO's will be available to paying customers on March 31, and on their public FTP server a week later. I consider this a fabulous opportunity to bring BitTorrent to the public attention by showing what a good job it can do with the hosting. Relying on public mirrors will be frustrating, tedious, and probably slow. BitTorrent can deliver excellent latency, bandwidth, and reliability. Are you interested in helping Bram set this up?
MPEG2 interlacing
graydon wrote me with a link to the MPEG2 work of Billy Biggs. His observations match what I've seen as well.
Billy observes that the interlacing codes on DVD's often don't seem to make much sense. In particular, source frames sometimes get sliced so that a single frame on the DVD interleaves fields from two frames. The main point of the RFF flag is to give enough slack to the encoder so that source frames boundaries and DVD frame boundaries line up.
There are two ways to look at the RFF flag. You could consider it a form of semantic information, identifying for each "picture" (meaning field or frame) whether it's interlaced or not, and if telecined, where the frame boundaries are. Alternatively, you can see the MPEG2 sequence on a DVD as nothing more than a compressed NTSC video source, with 2 fields in each frame, 29.997 fps. In the latter view, what the RFF flag buys you is a better compression rate. Duplicated fields need only get encoded once, and you don't have to DCT-encode frames with lots of high-spatial-frequency interlace patterns. Both help quite a bit.
Yet, DVD's have enough bits on them that most movies don't need to squeeze every last drop out of the compression. Thus, I'd guess that a lot of DVD's get encoded using heuristics to guess the RFF flags. So what if the heuristic gets a few frames wrong? It still plays fine, and hardly makes a dent in the overall compression ratio.
The problem, of course, is when people use the RFF flags for something else other than plain NTSC out. Examples include progressive-scan TV's (becoming popular now), playback to computer monitors, and of course transcoding. There, incorrect RFF flags can cause serious artifacts. Even so, since most DVD's get them mostly right, it's probably reasonable to use them even in these applications.
However, free tools (at least the ones I've seen) don't even do a reasonable job coping with mixed interlacing patterns.
I have hacked up these tools to provide reasonable pulldown when transcoding to SVCD. I instrumented mpeg2dec to output a file with one byte per frame, containing the RFF and TFF flags. Then, I hacked up mpeg2enc to get its RFF and TFF flags from this file, rather than cycling RFF on:off:on:off as is the standard behavior when the -p (--3-2-pulldown) option is set. The resulting files have good A/V sync and no motion artifacts, but the resulting setup is awkward at best, and when the source contains long runs of 29.997 fps frames, mplex complains of underruns. I set the (compile-time) option to ignore these, though, and the DVD player seems to handle them just fine.
The tools need to get fixed. I'm posting this largely to encourage that. However, it's not easy to just fix the code, as the real problem is in the interface between encoder, decoder, and intermediate processing. These tend to be all separate processes, connected with pipes. If that's going to continue, then the tools need to agree on a way to get pulldown flags into the yuv4mpeg format. The other reasonable approach is to try to knit the modules together as shared libraries rather than pipes. That seems to be the approach taken by OpenShiiva.
Home media
One of the big, huge potential killer apps for free software is to run home media centers, such as the ones that VIA is pushing with their C3 chip. With good support for non-DRM audio and video, and good p2p networking (such as BitTorrent), such a system could be overwhelmingly better than the crippled alternatives put out by mainstream corporations.
To this end, OpenShiiva looks particularly interesting. I haven't tried it yet, but it looks like it's addressing both quality and UI. The free MPEG2 tools I've played with so far have serious deficiencies in both departments.
One specific problem is that no free MPEG2 encoder I've seen can handle video sequences with mixed 29.997 fps and 24 fps 3:2 pulldown. The MPEG2 spec allows such mixing freely, through the "repeat first field" flag, which is independently settable for each frame. If it toggles on:off:on:off, it's 3:2 pulldown. If it's always off, it's 29.997 fps. Many DVD's mix the two, for example splicing a video-source animated logo on the front of a 24 fps movie.
Part of the problem is that yuv4mpeg format (used as the input to mpeg2enc) loses the RFF flag information. Thus, as you dig into the source code to tools such as transcode and mencoder, you tend to see a lot of crude hacks to work around the problem.
I've hacked up my local version of mpeg2enc to preserve the RFF flags from the source stream, with good results, but unfortunately the patches aren't general enough for production use; among other things, mplex'ing the resulting stream can result in underruns depending on the exact frame rate.
I have some notes on the pulldown issue that I'm planning on publishing as a Web page. Are there any MPEG2 encoder hackers who care?
Being nice
Zaitcev: passions are running high, on all sides. I ask you (and all Advogato posters) to be respectful of other people's opinions. There is no question that Bush is responsible for death and destruction on a large scale. Whether it's legal according to international law is one question. Whether it improves the situation for the Iraqi people is another (I honestly hope it does). Reasonable people can, and do, differ on these questions.
21 Mar 2003 (updated 21 Mar 2003 at 04:35 UTC) »
I attended the noon rally at UC Berkeley, followed by a peaceful protest in Sproul Hall, and am proud to report that I have a misdemeanor arrest on my record as a result. It was a disturbing show of force by the University and police.
None of the protestors were in the least bit violent, and we weren't even keeping people from their business in Sproul Hall (we were in the front foyer; people could still access all offices through the north and south entrances). Several speakers talked of treating the police officers with respect (one young woman has many police officers in her family), and were roundly applauded. The nonviolent legacy of Martin Luther King was invoked repeatedly.
Even so, for whatever reason the University saw fit to arrest 117 of us anyway. The protesters were peaceful, but the police pinched people, pried them away, and carried them out. I would find this quite understandable if we were violent (as were some of the protests in San Francisco), or if we were causing disruption any more serious than the lines during the busy season for financial aid, but for police to forcibly haul people away from a public building in a University feels very wrong. I wonder if the decision to make mass arrests might have been partially motivated by a desire to create more publicity for the anti-war cause. It seems more likely to be malice or stupidity, though, given the arrogant and patronizing attitude of Vice Chancellor Horace Mitchell, who briefly addressed us before the police began their action.
The SF Chronicle has a brief story on the event, and UC Berkeley has a press release. I haven't yet managed to find myself in any photos or video footage, but if you see me, let me know :). There are some other arrest photos I found. I managed to record some audio from a message left on Heather's cell phone.
For coverage of the protests in San Francisco, see Kevin Burton's report and photos, and Lisa Rein's incredibly moving blog coverage.
As I posted yesterday, I am taking two days off just to learn what I can about the war, meditate, and resist in whatever way I can. Tomorrow's entry will return to the normal format of mind-numbingly detailed writing about technical things I find interesting. However, I'll probably start up a personal blog so I can write about religion, politics, and other issues without having to worry about whether they're on-topic here.
Kids
We haven't really talked to the kids about the war yet. Alan wrote a blog entry last night. He typed the first three words himself :) Even more exciting, he got an inexpensive used digital camera for his birthday. I'm hoping that he'll want to post a few of his pictures as well.
Max loves playing with digital images on the computer even more than Alan does - he's had a great time exploring the zooming and contrast controls in iPhoto. Oh, and he can peel carrots by himself now too.
19 Mar 2003 (updated 20 Mar 2003 at 01:39 UTC) »
I have given notice that I will be taking two personal days off from work as soon as war begins, and I'll handle my free software community contacts the same way. War looks imminent, if indeed it hasn't started already. I'm prepared to march in San Francisco, and just need to coordinate with my family.
I'm on #war-news on irc.freenode.org, sifting through the reports.
UTC 2204: War has begun.
UTC 2218: There will be a potluck, meeting for worship, and vigil at the Berkeley Friends Church this evening, around 7:00 to 7:30. I expect to be there with the family. Update UTC 0001: This is a "called meeting" of the Quakers, not a general-purpose peace vigil. Non-Quakers are welcome to attend, but do keep in mind that it is a silent meeting. There is a potluck at 6:30, and the meeting begins at 7:30.
UTC 2229: An anti-war protester has died in a fall off the Golden Gate Bridge. NPR confirms air-strikes against surface-to-surface artillery inside the Iraqi border.</a>
UTC 2243: Firefight begins.
UTC 2251: Iraqi helicopters fire on Kurdish village.
It doesn't appear to be full-scale conflict yet, but it's certainly imminent.
UTC 2339: It appears that I may have jumped the gun a bit in asserting that the war has begun in earnest. The bombing in the no-fly zone is actually not that new - similar bombing has been going on for a while.
Indeed, there may actually be hope that the invasion can be stopped. A story in the Independent Online quotes an anonymous State Department official as being willing to make a deal for Saddam's exile. However unlikely this may be, it would save untold lives, be considered a victory for America and Bush, and probably save the life of Saddam Hussein and his family.
In any case, I continue to pray for this to play out with minimal loss of life.
UTC 2353: Kevin Burton has set up a "chump" to archive the URL's we're posting on #war-news.
UTC 0137 I've been listening to NPR and reading stories on the Net for about 3.5 hours, and am tiring of it. Whether in Berkeley or at home, I look forward to spending the evening with my family.
I haven't gotten much response to my last post on factoring codebases into smaller modules, but I have thought about the problem a bit more.
The first item is the desire to have a common runtime discipline that spans more than one module. The main problem here is that the C language doesn't nail down particular runtime decisions. In our case, the main things we need are memory allocation (one thing we need that standard C library malloc/free doesn't give us is a way to constrain total memory usage - for example, so that an allocation when near capacity causes items to be evicted from caches), exceptions, and extremely basic types such as string (C strings are inadequate because in many cases we do need embedded 0's), list, dict (hash table), and atom. A great many languages supply these as part of the language itself or as part of standard runtime, but C is not among them.
Of course, the fact that C doesn't nail down the runtime is in many ways a feature, not a bug. Different applications have different runtime needs, and a single general-purpose runtime is not always optimum. Perhaps more importantly, these richer runtimes tend not to be compatible with each other. In the case of Fitz, we need to bind it into Ghostscript (written in C with its own wonky runtime), Python test frameworks, and hopefully other applications written in a variety of high level languages.
In any case, with regard to the specific question of whether we're going to split our repository and tarballs into lots of small modules or one big one, for now I've decided to go for the latter, but with clear separation of the modules into subdirectories. That should preserve our ability to easily split into separate modules should that turn out to be a clear win, while making life easier for the hapless person just trying to compile the tarballs and get the software to run.
BitTorrent
BitTorrent absolutely rocks. Basically, it gives you a way to host large downloads (either large files, large numbers of downloaders, or both) without chewing up too much of your own bandwidth. Rather, downloaders share blocks with each other.
I think this has killer potential for Linux distributions and the like. I know most servers hosting RH 8.0 were seriously overloaded when that came out. I think that BitTorrent could be a far more effective way to get the ISO's out than standard HTTP/FTP. Of course, Red Hat probably won't push this, because much of their business model is founded on the relative slowness and inconvenience of public FTP servers as opposed to their pay service.
There's also a lot of potential into wiring BitTorrent into package downloaders such as apt and rpm. Some of the folks on #p2p-hackers think that WebRaid might be a better solution, but in any case I can see BT working well.
We're going to try distributing Ghostscript using BitTorrent, and see how it works.
These are legitimate (and very important) uses of BitTorrent, but it's most likely that the next big jump in popularity will come from other quarters. BitTorrent excels at serving up gigabyte-scale files with good performance and robustness, with minimal bandwidth and infrastructure needs. It shouldn't take a genius to figure out what this will get used for. The exciting (and scary) part is that Bram might soon find himself with millions of users.
Tor and I have been working a bit more on the Fitz design. We have to nail down a number of open decisions about coding style and the like. We don't want to cut a lot of code, and then find we need to redo big chunks. Our current draft is on the Wiki under CodingStyle. Many of the decisions are somewhat arbitrary, but even so aesthetics are important. We want to be able to look at the code with pride.
One of the most difficult issues is how to split up the code into modules. What level of granularity is best? The Ghostscript codebase tends to be fairly monolithic, and a large part of our goal is to refactor it into independent modules.
Clearly, a full featured PDF app will use all the modules, but it's also easy to imagine more lightweight clients that just use some of them. Perhaps an instructive example is a PDF validity checking tool (known as "preflight" in the graphic arts world). Such a tool has to parse PDF files and process the PDF streams, but need not actually construct a display tree for rendering.
One obvious approach is to make a giant hairball that contains everything. Clients just link in the library, and use what they need. There's little added complexity in the build and packaging processes, and there's no chance that the individual pieces will get out of sync with each other. However, it's not very elegant.
Another approach is to split everything into the smallest sensible modules. An immediate problem is that many of the modules will want to share infrastructure, particularly having to do with the runtime. For example, one of the things we're hammering out is a "dynamic object" protocol incorporating strings, lists, dicts (hashtables), names (atoms), and numbers. These kinds of objects show up all the time in PostScript and PDF documents, and are a handy way to pass parameters around. If we parse such an object out of a PDF file, and want to pass it as a parameter to the filter library, it would be really nice for the type to match.
So, in the "many small libs" scenario, I think there would be one base library ("magma") containing shared runtime infrastructure: at first, just memory allocation, exception handling, and dynamic objects, but possibly also loading of dynamic plug-ins and maybe threading support. All the other modules will allocate their memory and throw exceptions in a magma context, and pass around magma dynamic objects as needed.
The filter library would be the first such other module. It's small, very well defined, and will probably be quite stable once it's done. The Fitz tree and rendering engine would probably be the biggest, and see intense development over a period of time. Other obvious modules include a low-level (syntactic) PDF parser, and a higher level module that traverses PDF pages and builds Fitz display trees. We'd also need a module for font discovery (unfortunately, quite platform specific), and, eventually, one for text layout as well.
The problem is that support for packaging and versioning of libraries is generally pretty painful. There are lots of opportunities for these libraries to get out of sync, and many more testing permutations, especially if people are trying to use different versions of the same libs at the same time. Also, I worry that the fine-grained factorization might be confusing to users ('how come, in order to display a JPEG image, i use mg_ functions to create the JPEG parameter dictionary, pass that into an sr_ function to create the JPEG decode filter, and plumb the result of that into an fz_ function to draw it?') There are also some fairly difficult decisions about where certain logic should live. A good example is PDF functions. There's a good argument to put them in Fitz, but it's easy to imagine it in the PDF semantics module as well.
A related question is whether language bindings should be shipped as part of a library, or as a separate module. My experience has been that separate language bindings are often very painful to use, because of subtle version mismatches. The bindings tend to lag a bit, and they're much pickier about which exact lib version they're linked against than your average app.
There are other intermediate stages between the two extremes, but it's not yet clear to me whether any are clearly better. One such possibility is to have a single common namespace, but a bunch of smaller lib files so you only link the pieces you need. In the other direction, we could keep the source highly modular, with separated namespaces as above, but mash them all together into a single library as part of the build process (in fact, we'd probably want to do this anyway for Windows targets).
I know a lot of other projects struggle with the same issues. For example, 'ldd nautilus' spits out no less than 57 libraries on my system. Of these, glib corresponds fairly closely to the magma layer above, and is used by many (but not all) of the other libs. Perhaps coincidentally, many users find that building and installing Gnome apps is difficult.
At the other extreme, I've noticed that media player tarballs tend to include codecs and suchlike in the source distributions, often tweaked and customized. Mplayer-0.90pre8 has 11 subdirectories with 'lib' in the name. The advantage is that building mplayer is fairly easy, and that (barring a goof-up by the producer of the tarball) versions of the libraries always match the expectation of the clients. The disadvantage, of course, is that mplayer's libmpeg2 is not shared with transcode's or LiViD's. Also, it's harder to do something like install a new codec on your system that will just work with all the players.
Perhaps, over time, it will become less painful to distribute code as a set of interdependent libraries. In the meantime, we have to strike the right balance between keeping our codebases and development processes modular, and keeping life pleasant for users. I'm not sure the best way to do it, so I'd appreciate hearing the experiences of others.
FOAF updates: Trust rankings are now exported, making the data available to other users and websites. An external FOAF URI has been added, allowing users to link to an additional FOAF file.
Keep up with the latest Advogato features by reading the Advogato status blog.
If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!