Older blog entries for raph (starting at number 195)

Web UI's and robustness

Yesterday, I linked a claim by Dave Winer that users like to use Web browsers as their UI, and a note of frustration by Paul Snively, who lost some work on his blog, and blamed the "Web browser as UI" approach. Paul has since retracted his original post, but I think there's a deeper issue that deserves to be explored. I've lost a fair amount of work posting to Advogato as well, so let that be the whipping boy rather than Radio.

Using the Web browser for the UI has a different set of design tradeoffs than a traditional desktop app. Some things get easier, others get harder. Lots of things can lead to a bad (or good) user experience, including things not foreseen by the software's designer. I know I didn't pay a great deal of attention to robustness when designing Advogato.

I'm not going to try to catalog all the advantages and disadvantages of Web-based UI's - that's a daunting task. Instead, I'll focus on robustness, or the risk of losing work.

I used to lose coding work fairly frequently. Now I don't, because I do lots of things to improve robustness. First, I use an editor with good robustness features. Second, I check my work into a remote CVS server. Third, I back up the repository to CD-R every week. I also frequently copy local working files to another machine on my network, and send patches to mailing lists. As a result, it's been quite a while since I've lost any coding work.

I still lose Advogato posts, though, most recently about a week ago. Why? For one, Netscape 4.7x's text entry box doesn't have any of the paranoia that real editors have about losing work. In fact, pressing Alt-Q (the same command as "format paragraph" in Emacs) is a quick shortcut for "lose all state and quit". This could be fixed in the client, but as the designer of Advogato, I don't have much say about that.

There is more I could do but don't, though. I could make the "Preview" button store the draft in a log, persistent for a day or so and then garbage collected. You could, if you chose, edit in the browser and click "Preview" regularly, much as I regularly press Ctrl-X Ctrl-S in Emacs. In fact, I rather like this idea, as it has other advantages. For example, you could share the draft link with others.

Similarly, I could implement something like version control for diary entries. When you post a new edit, it saves the old version (or, perhaps, the diffs) somewhere. Thus, if you clobber your work (as I did a few days ago), you can get it back. Again, there are other advantages to this approach. This is basically ReversibleChange, one of the "soft security" ideas popular in the Wiki world.
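The core of such a version-control scheme is tiny. Here's a sketch in Python (the class and method names are mine, nothing in Advogato works this way today). It leans on the fact that difflib's ndiff deltas are reversible, so storing one delta per post is enough to step back to the previous revision:

```python
import difflib

class EntryHistory:
    """Keep reversible diffs so a clobbered diary entry can be recovered."""

    def __init__(self):
        self.current = None
        self.deltas = []  # each delta lets us step back one revision

    def post(self, text):
        if self.current is not None:
            # ndiff output is reversible: difflib.restore can rebuild
            # either side of the comparison from the delta alone.
            delta = list(difflib.ndiff(
                self.current.splitlines(keepends=True),
                text.splitlines(keepends=True)))
            self.deltas.append(delta)
        self.current = text

    def previous(self):
        """Recover the version before the most recent post."""
        return ''.join(difflib.restore(self.deltas[-1], 1))
```

Storing deltas rather than full copies keeps the log small, at the cost of replaying diffs if you ever want to walk further back than one revision.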

A very high-tech client might even be able to implement something analogous to the autosave feature of Emacs. It would periodically upload the current state of the text entry to the server for safekeeping. However, this would require some pretty major changes to the way the Web works, so I'm not holding my breath.

In the meantime, there are alternatives. For one, it's possible to use an XML-RPC client instead of a Web browser. Many of these clients encourage you to use your favorite editor, which helps with robustness. The client could also keep a log of everything submitted. Such an approach would be complementary to the server-side tweaks I mentioned above.

In the meantime, I now generally write my diary entries in Emacs, then simply cut-and-paste into the browser. It's not perfect, but it works.


A Quaker friend of ours, Ricki Anne Jones, invited us (Heather, the kids, and me) to today's A's game. She gets a luxury box every year because she's such a loyal fan, and she invites a bunch of her friends. It was fun. Alan and Max had another kid to play with, and they avoided melting down, so this was the first time they'd lasted through the entire game.


Dan Brickley of the W3C showed up on #p2p-hackers tonight. We talked about AaronSw's manifesto, among other things. It was a fairly pleasant conversation, but I still feel that the W3C is pretty badly broken. Dan encouraged me to write up some of my Athshe stuff (I was trying to talk about it online). As a result, my next Athshe blog will be "why Athshe will not be based on XML."

New computer

I ordered my new dual-Athlon from KC Computers. I'll keep a running log, especially in case other people want to follow the same recipe.

The final price was about $1600. Picking the absolute bottom price from Pricewatch, the parts add up to around $1000, not including assembly and shipping. I consider it money well spent, because I figure I have a considerably lower risk of something going wrong and eating up lots of my time.

Even so, I've been quite satisfied every time I've bought something off Pricewatch, even when the prices seem too good to be true. Good reputation information about the sellers (not unlike what ebay does) would seem to decrease the risk even more.

There are lots of things that can go wrong. The seller could turn out to be shady. The parts could turn out to be defective, possibly seconds or returned merchandise. The parts could be completely legitimate, but of low quality (like the infamous IBM Deskstar 75GXP drives). They could be individually ok, but subtly incompatible with each other, apparently a very serious problem with early Athlon platforms. They could be just fine, but not well supported by the operating system. This last problem has a wide range of variability, as it varies depending on the OS flavor. Something like recompiling the kernel to update a driver may be perfectly reasonable for a sophisticated user, but out of reach for others.

In all these cases, good metadata could help. If I knew I was getting parts with a good chance of working well in the system, I'd have no problem with going through Pricewatch-style vendors, and wielding the screwdriver myself.

Such a metadata system could be quite high-tech. For one, it could compute the total cost, including aggregated shipping and sales tax. It could take shipping delay into account, as well. Optimizing this sounds like a dramatically scaled-down version of the ITA fare calculation used on Orbitz. It's appealing because it maps to minimizing labor and energy costs in the real world, not just getting the best outcome of a game.
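At the scale of one PC order, the optimization is simple enough to brute-force. A sketch (all the names here are mine), where the interesting wrinkle is that a vendor's flat shipping fee is charged once no matter how many parts you bundle there, so consolidating an order at one vendor can beat picking each rock-bottom price separately:

```python
from itertools import product

def best_order(parts, prices, shipping):
    """Brute-force the cheapest assignment of parts to vendors.

    prices[part][vendor] is the quoted part price; shipping[vendor] is
    a flat fee charged once per vendor actually used.
    """
    best_total, best_assign = float('inf'), None
    for combo in product(*[list(prices[p]) for p in parts]):
        total = sum(prices[p][v] for p, v in zip(parts, combo))
        # shipping is aggregated: each vendor's fee counts only once
        total += sum(shipping[v] for v in set(combo))
        if total < best_total:
            best_total, best_assign = total, dict(zip(parts, combo))
    return best_total, best_assign
```

Brute force is exponential in the number of parts, of course; a real service with many parts and vendors would need something smarter, which is exactly where the comparison to ITA's fare search comes in.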

You could also do stuff like autogenerating recipes (select this BIOS option, set hdparm to that, etc), and incorporating feedback from others with similar configurations. Even more extreme would be to customize a distribution. Custom kernels, in particular, seem like a win.

A huge part of the value of a brand (such as IBM or Dell) is the QA work they do, essentially creating metadata about reliability and compatibility as part of building and delivering systems. Even so, the assurance is far from absolute. For example, IBM Thinkpad 600's have a defective battery design, causing them to die too early. Metadata from TP600 owners may be more useful input to a buying decision than "IBM is a good brand".

Another reason to believe that a high-tech metadata system is useful is the huge variability in the needs of users who run free software. One size most definitely does not fit all. This, I think, is one reason why companies such as VA Research have had such a difficult time being competitive.

There was a lot of talk about "mass customization" being part of the new economy, but not much follow-through. Most dot-com retailers were little more than mail order outfits that happened to publish their catalog through the Web rather than on paper (in fact, many have since added paper catalogs to their mix).

I'm certainly not going to put this kind of metadata system together myself, but I do think it would make an interesting project for someone. Clearly, this type of service would be worth real money. I'm not alone in believing that metadata is important. Amazon has very high quality metadata about books, and that's why their site is so valuable.

26 May 2002 (updated 26 May 2002 at 08:50 UTC) »
Advice to young standards authors

aaronsw posted The Standards Manifesto a few days ago. In it, he expresses frustration with the W3C. I can certainly identify.

I've had a fair amount of experience with standards processes over the years. I'm sure I'll have more. Most, but not all, of the experiences have been unpleasant. Standards are incredibly important, but not much fun. Standards committees, in particular, tend to be particularly tedious, and not very smart (even if the individual members of the committee are). In fact, standards committees are in many ways the least qualified entities to be writing standards.

Designing good standards is an art, and an underappreciated one to boot. One of the most important quality metrics is complexity. A standard should be as free from unneeded complexity as possible. The cost of a spec goes up by roughly an order of magnitude from spec to prototype, and again from prototype to production-quality implementation. It's too easy for glorified technical writers to come up with a new feature if they don't have to implement it themselves.

Standards reflect the process that creates them. The biggest problem with standards processes is lack of openness. There is a spectrum, ranging from complete transparency (the IETF at its best, where everybody is welcome to show up at meetings, and "rough consensus and working code" carries the day), to evil cartels such as the DVD CCA and the BPDG. In these extreme cases, only members are allowed to participate, only members are allowed to see the spec, and there are NDA's and all kinds of legal restraints to "protect" implementations of the standard. The W3C is somewhere in the middle of this continuum. In general, you have to be a paying member to participate, the deliberations are private (and NDA'd), but the resulting spec is public, comments from the public are solicited, and there are (usually) no patent royalties. The W3C seemed to be headed down a darker path, including promotion of patent licensing, but to their credit they responded to public outcry and backed off.

It is often said that "the great thing about standards is that there are so many to choose from." I propose that the same is true of standards processes. Small, motivated groups (or even individuals) can and should make standards. In fact, their work is often the most important. Examples abound. While there was a standards committee around JPEG, the really important work was done by the IJG (mostly Thomas Lane), which standardized a patent-free subset of JPEG and produced a very high quality free implementation. Sockets, developed by Bill Joy in the early eighties and quick to become the universal API for the Internet, were ignored by standards committees until just a few years ago. Committees tended to favor things like XTI, now mercifully dead.

Standards bodies are reasonably good at codifying existing practice. They suck at doing research. A good process winnows needless complexity from a standard, focussing on the essence of the problem. It's almost a quantitative science, as different approaches to the same problem may differ quite significantly in complexity.

A naive person might assume that building a global information network, scaling from coke machines to supercomputers, would be a harder problem than, say, synchronizing audio and video in multimedia. Yet, the TCP/IP documents (RFC's 793 and 791) weigh in at about 128 pages and are reasonably complete, while SMIL is about 500 pages, and includes lots of other standards by reference.

The economic incentives for closed, proprietary (and complex) standards are powerful. Who would spend grueling hours in a standards committee to help create a beautiful, free, simple standard, out of pure altruism? In fact, much of this work is similar to writing free code, but it tends to be quite a bit less fun.

I think the salvation lies in making the creation of new standards more fun. I'm not sure the best way to do this, but can offer my own experiences. The most fun I've had in a standards process has been IJS. It was in many ways a purposeful experiment in process. I deliberately narrowed the scope (things like UI and i18n are not included), while still trying to solve an important problem (making it easy to create printer drivers decoupled from the rasterization engine). I also acted as a dictator with respect to the spec. I didn't include suggestions from the mailing list until I was convinced that they were necessary, and properly thought through. Another key part of the process was the reference implementation, not merely free but also designed explicitly to adapt easily to other people's drivers rather than impose my own framework.

Also important was the fact that IJS built on the work done by HPIJS, a working protocol and codebase that merely had a few portability issues and some aspects specific to HP printers. I didn't have to take on a whole research project.

IJS is of course not perfect, but I do think it's doing its job. My time working on it is justified in the improved user experience and decreased support load, and it was kinda fun. (see, it wasn't altruism, it was enlightened self-interest :) The next time I get involved in a standards process, it's going to look a lot more like IJS than a W3C working group.

So, my advice to Aaron? First, by all means seek a better process than the W3C's. I have no doubts that such a process can be found. Second, be clear on whether the task is primarily research into what should be standardized, or codifying a well-understood domain. In the former case, it makes sense to just go do stuff. In the latter case, finding consensus is more important. Third, strive for simplicity. Feel especially free to ignore those who wish to add their own pet complication, especially if they don't share your vision.

Last, and perhaps most important, treat it as something that's supposed to be fun. Don't get too emotionally wrapped up, especially over the question of whether the rest of the world is getting on board. If you create something really good, it will have an impact. If the standard is simple, it will be expedient for others to implement.

Selected quotes from other blogs

From Dave Winer's Scripting News:

Anyway, we went far and wide and swung around to desktop websites, a subject near and dear to my heart. He wondered why more Mac developers weren't using the combo of PHP and Apache that comes bundled with every Mac. I think it's just a matter of time before Unix developers get there. Users like apps that run in the browser.

From Paul Snively's Gadfly Redux:

OK, this is twice now that I've had a ton of news queued up, posted some things, and... *poof*. Dozens of news items gone. Thank God I know that three pressing ones are actually comments from Paul Prescod and are safely ensconced with YACCS. But there are stacks of other things I wanted to respond to, and they're gone.

You know, I try to be patient. I try to be reasonable. I get a chuckle out of it when Dave says in best ha-ha-only-serious fashion that they write software that sucks. But Radio has a combination of serious architectural flaws: a browser interface that allows the browser's notion of object identity and the database's to get out of sync, possibly due to browser page caching and navigation using the back and forward buttons; and the lack of transactions in the underlying database. Sooner or later, this combination will result in what I've now seen twice.

I'd be a lot happier if Radio would just be an ordinary desktop application. Editing in the browser isn't a win, especially on the Mac. I'm totally with the local flyweight server idea. It's just that I want a well-integrated, rich client to go with it.

I myself have lost plenty of work editing in a browser. Here, I think, we have the classic quality tradeoff between a universal client, and one optimized for a specific task. This is one of the reasons I'm so happy to see all the client work happening around Advogato :)


Heather got the paper with the gold seal on it. We went out for sushi to celebrate.

New machine

I got a bunch of recommendations for getting a new machine built. Thanks, Zooko, wmf, and Krishnakumar B. Any of the recommendations I got would be a better deal than fiddling with it myself.

Creative Commons

The Creative Commons looks both interesting and useful. I'll probably release some stuff (other than software) under one of their licenses.

Link encryption

For fun, I've been looking at link level encryption techniques. TLS (formerly SSL) is surprisingly not too hairy. The main thing wrong with it is lots of options, and X.509 certs. The latter, of course, are a big turnoff.

One of the (numerous) gotchas is the "side channel" attack against RSA encryption with PKCS #1 padding. Basically, if the server will tell you whether the decryption of an attacker-provided RSA integer has valid padding, an attacker can use that oracle to gradually recover the plaintext of a captured ciphertext (this is Bleichenbacher's attack). The way to avoid trouble is to not return an immediate error indication. Ideally, you just treat an invalidly padded decryption as random, and wait for a future message authentication to fail.
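The countermeasure fits in a few lines. Here's a toy sketch (the function name is mine, and this is not a complete PKCS #1 check; a real implementation also requires at least eight padding bytes, among other things):

```python
import os

def unpad_constant_behavior(block, secret_len):
    """On invalid PKCS #1 v1.5 padding, substitute random bytes instead
    of returning an error.  The handshake then fails later, at the MAC
    check, and the attacker learns nothing from the immediate response.
    """
    # block type 2 padding: 0x00 0x02, nonzero filler, 0x00, secret
    ok = len(block) > 2 and block[0] == 0 and block[1] == 2
    secret = b''
    try:
        sep = block.index(0, 2)  # find the 0x00 separator
        secret = block[sep + 1:]
    except ValueError:
        ok = False
    if not ok or len(secret) != secret_len:
        # never branch visibly on the failure: hand back random bytes
        secret = os.urandom(secret_len)
    return secret
```

The point is that the caller's behavior is the same either way; a padding failure and a padding success are indistinguishable until the later authentication step.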

One of the major issues in link encryption is to get both privacy and authentication (which implies integrity). It's easy to get this wrong. In fact, all of the traditional cipher modes such as CBC, CFB, etc., allow some systematic modification of the decrypted plaintext without necessarily being detected.

Thus, conservative designs (including TLS) generally apply both an encryption step and an authentication (MAC) step. TLS does MAC and then encryption, but a relatively recent paper by Bellare and Namprempre proves that doing encryption first, then MAC, is somewhat stronger.

I really like the paper, because it spells out pretty clearly what's secure. I believe that the traditional rules of thumb for choosing a cipher mode (such as the discussion in Applied Crypto) go out the window. All of the privacy-protecting modes (including CBC, CFB, and counter aka CTR) are equally good. CTR, in particular, has gotten a bad name because it doesn't even try to do integrity protection, but it's nice in many ways: it parallelizes, it's simple, and the proof of security is pretty simple too. This paper makes a nice case for it.

The preferred MAC algorithm these days tends to be HMAC, which is just about as fast as the underlying hash algorithm, and is secure against extension attacks. If you want a conservative recipe, CBC or CTR plus HMAC seem like good choices. Another choice is a CBC-MAC. The advantage is that the security is well understood, and you have one less primitive to rely on. If the encryption is sound, your authentication will also be sound. The disadvantage is that a CBC-MAC based on AES is somewhere around half as fast as SHA1.
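HMAC is in the Python standard library, so the MAC side of the conservative recipe is a one-liner (the key and message here are just placeholders):

```python
import hmac
import hashlib

key = b'a session key from the handshake'
msg = b'a message worth authenticating'

# HMAC-SHA1 runs at roughly the speed of SHA1 itself and resists
# length-extension attacks, unlike a bare SHA1(key || msg).
tag = hmac.new(key, msg, hashlib.sha1).digest()

# Truncating to 96 bits is a common tradeoff (TLS does exactly this).
tag96 = tag[:12]

# Verify with a constant-time comparison to avoid a timing side channel.
expected = hmac.new(key, msg, hashlib.sha1).digest()
assert hmac.compare_digest(tag, expected)
```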

A lot of people are trying to do encryption and authentication in one go, to save the time of doing the separate MAC step. It turns out to be a very hard problem. Quite a few of the proposed modes for AES fall into this category. Of them, IAPM and OCB seem to be the most appealing. Unlike the many failed previous attempts (including the NSA's own DCTR mode), they have proofs of security. In addition, they're parallelizable, unlike traditional MAC's. The big problem is that they're patented.

I can't figure out whether ABC (accumulated block chaining) is any good or not. It claims to do a good job on authentication, but in its simplest form can leak bits about the plaintext. They have a fix (a 1-bit rotation), but I'm not smart enough to figure out whether it's as secure as possible. Another disadvantage is that it's just as unparallelizable as CBC.

It's interesting that most of the academic literature deals with a single message of known size. The link encryption problem is subtly different. Typically, you have a stream of messages, with a length prefixed to each. You want to check the authenticity of each message, and also prevent replay attacks. The patent free "generic composition" mode advocated by Rogaway comes up a bit short. In particular, you can't decrypt a block unless you know whether it's the last block in the message. This could be a problem if you want to allow very short messages. You could of course specify a minimum message length (in which case the first block would never be the last block, and after decrypting it you have the length field so you know where the last block is), or try the decryption both ways. Neither is all that satisfying.

In addition, having a length prefix lets you avoid a lot of the complexity that gets added to avoid extension attacks. If you were trying to do link encryption as simply as possible, patent-free, and with conservative design, I propose the following recipe. Use AES in CTR mode to generate a pseudorandom stream. For each message in the sequence, prepend the length, and then XOR against this stream. This is your ciphertext for the message. Take the SHA1 hash of (the session key) + (a sequence number) + (the ciphertext). This is your MAC tag (actually, you only need the first 96 or 128 bits of it to attain reasonable security). The sequence numbers can be byte counts or message counts; the essential thing is that they never repeat. The sender sends alternating message ciphertexts and MAC tags. The receiver decrypts the length header first, then decrypts the rest of the ciphertext and computes the same MAC tag as the sender. If this tag matches the one sent in the stream, the message is valid.
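Here's a runnable sketch of that recipe (all names are mine). Two liberties, both flagged: the standard library has no AES, so a SHA1-based counter construction stands in for AES-CTR (same shape, much weaker primitive; substitute real AES-CTR in practice), and it derives a fresh keystream per message from the sequence number rather than maintaining one running stream, which preserves the essential never-reuse property:

```python
import hashlib
import struct

def _stream(key, seq, n):
    # Stand-in for AES-CTR: SHA1(key || seq || counter) plays the role
    # of the block cipher here.  (seq, counter) pairs never repeat.
    out = b''
    ctr = 0
    while len(out) < n:
        out += hashlib.sha1(key + struct.pack('>QQ', seq, ctr)).digest()
        ctr += 1
    return out[:n]

def seal(key, seq, msg):
    """Frame one message: length prefix, XOR with keystream, MAC tag."""
    plain = struct.pack('>I', len(msg)) + msg
    ct = bytes(a ^ b for a, b in zip(plain, _stream(key, seq, len(plain))))
    # Tag = first 96 bits of SHA1(key || seq || ciphertext); the length
    # prefix is what lets us get away without full HMAC here.
    tag = hashlib.sha1(key + struct.pack('>Q', seq) + ct).digest()[:12]
    return ct + tag

def open_(key, seq, frame):
    """Check the tag, then decrypt; raises on any tampering."""
    ct, tag = frame[:-12], frame[-12:]
    if hashlib.sha1(key + struct.pack('>Q', seq) + ct).digest()[:12] != tag:
        raise ValueError('authentication failed')
    plain = bytes(a ^ b for a, b in zip(ct, _stream(key, seq, len(ct))))
    (n,) = struct.unpack('>I', plain[:4])
    return plain[4:4 + n]
```

The sequence numbers must never repeat within a session, and sender and receiver count independently, which is what defeats replay.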

Of course, don't take my word for it. In most cases, just using a good TLS implementation is a better idea than trying to handroll your own. But I had fun learning this.

Soapy (in)security

I got this email from Paul Prescod:

Don't be so sanguine about SOAP security. It could have been designed NOT to use port 80 and NOT to look like HTTP traffic. It was clearly designed to camouflage itself as Web traffic. It would have been easy to send it over sockets and another port. You say that any solution would have security issues. Well yes, everything networked has security issues, but SOAP's very generic RPC design encourages problems like the SOAP::Lite dispatch problem. Some people are writing programs to wrap any COM object in SOAP. Standard application protocols like HTTP and SMTP strongly encourage you to build an abstraction layer that maps from the worlds they talk of "resources" and "mailboxes" to the implementation world: "objects". SOAP does not encourage the introduction of that mapping layer. In fact, standard SOAP/WSDL tools encourage you to take existing code and "generate" a SOAP/WSDL interface to it. Three examples: Visual Studio.NET, axis "java2wsdl" and SOAP::Lite.

Well said. People who know something about security are saying that the nature of Soap makes it especially fertile ground for security vulnerabilities.

I said, "I believe Dave when he says that Soap wasn't explicitly designed to go through firewalls." But my point here is that it doesn't matter. The intent of the designers is irrelevant, unless they did careful work to make Soap security as good as it could have been. I see no signs of such work.

New machine

I need to buy a fast machine, in large part to run the Ghostscript regression tests. By far the best bang for the buck appears to be a dual Athlon. A Tyan Tiger MPX motherboard, two Athlon MP 1900+ processors, half a gig of RAM, and a disk costs less than $1000 in parts, as listed on PriceWatch.

It's been a while since I put a machine together myself. I'm a little apprehensive, because I know it will take a nontrivial amount of time, and there's a chance it won't work right when I do get it together. I'd rather pay somebody else to do it. Unfortunately, the big name PC's don't do a good job providing this kind of machine. Is there a smaller shop that can do this? Last time around, I bought a dual Celery from The Computer Underground, which was great, but they're not in the box biz anymore.

Alternatively, is there an Advogatan who can do a good job building such a box and putting Debian on it, and who'd like to make a couple hundred bucks?

Otherwise, I'll just order the parts and hope for the best. It doesn't look that hard, and I can probably get some help on the hardware side locally.

I'm also interested in hearing recommendations and anti-recommendations from people who have gone this route. My friend at the coffee shop suggests a water-cooled case as a way of cutting down noise. Is this reasonable?

Ok, the response to my bit on Soap security was pretty negative. I didn't communicate what I was trying to say very well. Apologies.

Soap is probably not any worse than CGI or other Web things. But my point is that it's not any better, either. I guess it all depends on whether you think that's good enough. (I don't.)

Firewalls are pretty useless. The main security implication of going through port 80 is that it will eliminate the need to explicitly open a Soap-specific port in the firewall, in the off chance it wasn't left open anyway. So talking about Soap and firewalls isn't very enlightening. In any case, I believe Dave when he says that Soap wasn't explicitly designed to go through firewalls.


This afternoon, I helped Heather print out the final paper copies of her dissertation. Tomorrow, she hands it in for binding in the library and microfilming. The entire time I've known her, this has been a goal to work towards. Now it's actually done.

For historical reasons, the thesis is in troff format. I consider myself a smart guy, but fixing things was a usability nightmare. I had to rely on a mix of man pages, "helpful" web tutorials, a book on troff, and the groff s.tmac source. Even then, I ended up hand-tweaking a bunch of stuff. Ugh.

I helped format some Chinese hanzi as illustrations. I feel absolutely terrible about the fact that I used a Japanese font even though the context was Chinese, but didn't want to spend the time to redo all the eps's. Fortunately, all of the hanzi are quite similar visually to the corresponding Japanese kanji, so it's pretty unlikely that anyone will notice, much less care. rillian accuses me of perpetuating Unicode FUD, though. I guess I'll just have to live with that on my conscience.


Alan is resisting doing homework with me, and also resisting reading. I'm not sure why. He has a lot of anxiety, and these days seems to be expressing it a lot more.

Max loves cooking. Tonight he helped me make a stir-fry. He had a little difficulty stirring, and decided that he needed a "big" spatula. When I told him that it was a big spatula, he asked for "giant". I thought it was cute, anyway :)

It's fun to play the "can you say X" game with both kids. For Max, the edge is currently "peanut butter jelly belly". A week or so ago, he had trouble saying this, usually dropping one of the words. Alan has no trouble with "uvulopalatopharyngoplasty". Finding harder words is going to be challenging :)

Expensive infrastructure

I'll try my best not to make David McCusker's head explode tonight.

His IronDoc and my Athshe are both examples of "expensive infrastructure." To do a good job implementing the ideas requires a lot of complex code.

David again makes the crucial distinction between interface complexity and implementation complexity. Ideally, you have a system which is easy to use, but might do very smart, sophisticated things under the hood. In the best case, doing things well at a lower level actually makes things simpler in the higher levels.

Filesystems are the perfect example. The basic interface is pretty simple. In fact, files and directories are the basic concepts exported to the UI, even in systems designed for novice users. Yet, you have implementations such as XFS, which weighs in at 140 kloc or so.

Simpler implementations are possible, of course. In fact, the only thing that could possibly justify this kind of complexity is to be a lot better. And if you compare a modern filesystem implementation against, say, the original Minix filesystem in Linux (about 2000 lines of code), there is no contest. Performance, scalability, and robustness have all improved dramatically over the years.

Yet, that original 2 kloc implementation was important. If you had told Linus that he had to implement a highly scalable journalling filesystem before he could release the Linux kernel, the project would have been dead in the water. As the kernel became more popular, however, the payoff for more sophisticated implementations became clearer. For one, a better filesystem implementation improves matters for all of the applications (totalling many millions of lines of code, no doubt) running on Linux, without making those apps any more complex.

What is the "right level" of complexity for something like IronDoc? I'm not sure. It's not popular enough yet for the amortized payoff argument to work. This is a bit of a Catch-22; it won't become popular until it's reasonably good. The usual way around this dilemma is to build a much simpler prototype first.

I think this kind of "expensive infrastructure" is a good place for highly talented programmers to spend their time. It is a theme which runs through quite a bit of my work, including, today, Ghostscript and the continuing evolution of its graphics library.

Soap and security

Security experts such as Bruce Schneier and Alan Cox are rightfully concerned about the security vulnerabilities of new systems built on protocols such as Soap. At his ETCon talk, Schneier quipped that Soap is "firewall-friendly" in the same way that a bullet is "skull-friendly".

Dave Winer's response is pigheaded. In fact, his new "standard response" says almost nothing about security.

Dave has asserted that getting through firewalls was not an explicit goal of Soap. Could be; doesn't matter. The fact is that it will be used to get through firewalls. More precisely, firewall configurations based on Web assumptions will be used to filter traffic that breaks these assumptions. In general, you have to look pretty deeply inside a Soap message to figure out how much (if any) damage it can do.

Firewalls are only part of the problem. Expect a flowering of attacks that simply request things through Soap unanticipated by the designers of the system. With highly dynamic scripting languages, it's even harder to imagine all the ways that a clever attacker might put together the pieces made available to him; I am sure that the SOAP::Lite security hole was merely the first of its kind to come to light.

In theory, it should be possible to create secure services based on Soap. In practice, it's just not going to happen. As Schneier argued so well in his talk, increasing security is expensive. In addition to the labor and expertise needed for audits, paying attention to security slows down development, curbs features, and generally makes software less competitive in the market. The nature of the software industry guarantees that products will be developed and shipped with security holes.

I'm not saying that Soap is a bad thing. Indeed, for many applications, the advantages surely outweigh the disadvantages. Also, if you try to build those same applications with competing techniques, there will no doubt be security vulnerabilities from those, as well. What I am saying is that people should be conscious of the security problems that history and reflection tell us will occur.

I met Dave briefly at ETCon. I'm sorry we didn't have a chance to chat more.

Trees, again

A general suite of format/protocol/api for trees such as Athshe is a pretty complicated piece of code. McCusker's blob spec alone is daunting in its complexity. I'm trying to get a sense of where the complexity is coming from. Obviously, some of it is optimization, for example trying to maximize leaf occupancy, or minimize the number of mutated nodes in a copy-on-write case.

McCusker asks what I mean about network protocols and file formats. His observation that disk access and network latencies are similar is spot-on. I am not really talking about tuning the optimization. My main concern is concurrency. A network protocol must deal with locking and change notification. Transactions would be nice too.

Consider the read-only case first. Concurrency becomes a non-issue. In fact, a perfectly good protocol is to request blocks from a btree file format. If the block size divided by the bandwidth is comparable to the request latency, then you're keeping the connection reasonably full (I'm not worried about small factors).
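To put rough numbers on the "keeping the connection full" claim, here's a back-of-envelope calculation; all the figures are hypothetical, chosen only for illustration:

```python
# Back-of-envelope check: serial block requests keep the pipe
# reasonably busy when transfer time is comparable to latency.
block_size = 64 * 1024          # 64 KiB blocks (hypothetical)
bandwidth = 1e6                 # 1 MB/s link (hypothetical)
latency = 0.05                  # 50 ms request round-trip (hypothetical)

transfer_time = block_size / bandwidth   # time to move one block
# Fraction of wall-clock time actually spent transferring data
# when requests are issued strictly one at a time:
utilization = transfer_time / (transfer_time + latency)
print(f"{utilization:.0%}")
```

With these numbers the link is busy over half the time even with no pipelining at all, which is the "not worried about small factors" regime.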

If you do care, you can hide the latency in a number of ways too, including prefetching. If the client task is to fetch a complete subtree, it can pipeline several requests, hiding almost all latency without the risk of fetching unneeded blocks. However, this requires asynchrony in the client API, a source of complexity. Such is the tradeoff.
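The pipelining idea can be sketched as a breadth-first walk that keeps a window of requests outstanding; the fetch function and tree layout below are invented stand-ins for the network and the btree:

```python
from collections import deque

# Hypothetical block store: block id -> child block ids.
TREE = {0: [1, 2], 1: [3, 4], 2: [], 3: [], 4: []}

def fetch(block_id):
    """Stand-in for a network round-trip fetching one block."""
    return TREE[block_id]

def fetch_subtree(root, window=4):
    """Walk a subtree keeping up to `window` requests in flight.
    In a real async client the whole batch would be issued before
    waiting on any reply, hiding nearly all of the latency."""
    pending = deque([root])        # requests we would have on the wire
    seen = []
    while pending:
        batch = [pending.popleft()
                 for _ in range(min(window, len(pending)))]
        for block_id in batch:
            seen.append(block_id)
            pending.extend(fetch(block_id))
    return seen

print(fetch_subtree(0))            # visits every block exactly once
```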

I'm not going to go into the details of adding network writes tonight. A simple locking discipline, with caching in the kernel, might work well in the concurrent local disk case. The client drills down from the root every time it accesses the tree, but these requests will usually hit in the cache. If you did something similar over the network, latency could kill you.

One approach is for the server to invalidate blocks (possibly) in the client's cache. The response to a write lock request guarantees that all such invalidations have been sent (we assume message ordering). The bandwidth for the invalidations is trivial compared with the blocks themselves. The main cost is the latency round-trip for the write lock. Also, contention for the write lock could be a serious problem.

A much more sophisticated approach is to optimistically send write requests under the assumption that the relevant blocks are still valid. The server checks, and only commits if so. Invalidations are still useful as a way of improving the chances of a valid commit, but are not needed for correctness. In any case, cleaning up in case the optimism was unwarranted is a challenge.
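The optimistic scheme might look something like the sketch below, where the client ships the versions of the blocks it read alongside its writes, and the server commits only if none of them have changed since. The API and names are invented for illustration:

```python
class Server:
    """Toy server for optimistic writes (invented API)."""
    def __init__(self):
        self.versions = {}      # block id -> version number
        self.data = {}          # block id -> contents

    def write(self, reads, writes):
        """reads:  {block_id: version the client last saw}
           writes: {block_id: new contents}"""
        # Commit only if the client's view is still current.
        if any(self.versions.get(b, 0) != v for b, v in reads.items()):
            return False        # optimism was unwarranted; client retries
        for b, contents in writes.items():
            self.data[b] = contents
            self.versions[b] = self.versions.get(b, 0) + 1
        return True

s = Server()
assert s.write({}, {1: "root"})        # first write always commits
ok = s.write({1: 0}, {1: "stale"})     # based on an old version: refused
print(ok)
```

The "cleaning up" problem the text mentions is exactly what happens in the `False` branch: the client has to refetch the current blocks and redo its mutation.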

Cache invalidation is related to change notification, which is the stronger property of actually notifying the application when a node has changed. Change notification is essential for the Model/View paradigm, as it is what triggers recomputation of the View. Doing it generally and efficiently is a hard problem, but one that has been studied in detail. Many new designs include some ad-hoc form of change notification. Polling is very simple and robust, but is not very efficient, especially if you care about notification latency. Thus, you see crude things such as the <cloud> interface in RSS (and OPML). The prospect of solving the problem well is a good motivation for Athshe.
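A bare-bones version of the Model/View triggering described above; the `Node`/`subscribe` names are my own, not from any of the designs mentioned:

```python
class Node:
    """A model node that notifies watchers when it changes."""
    def __init__(self, value):
        self.value = value
        self._watchers = []

    def subscribe(self, callback):
        self._watchers.append(callback)

    def set(self, value):
        self.value = value
        for cb in self._watchers:      # trigger View recomputation
            cb(self)

rendered = []
model = Node("draft")
model.subscribe(lambda n: rendered.append(n.value))  # the "View"
model.set("final")
print(rendered)   # the view recomputed exactly once, on change
```

Polling would instead re-read `model.value` on a timer, which is simpler and more robust but wastes work and adds notification latency, as noted above.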

I'll write about these ideas in more detail later.

20 May 2002 (updated 20 May 2002 at 20:18 UTC) »
Doctor Mommy

Heather's PhD graduation ceremony was today. She has two of three signatures, and still needs to submit an archival copy, but otherwise is basically done. I am so proud of her.

The valedictory speech by Daniel Hernandez (also known as the editor of the Daily Cal) was very good. Prof. Christ's speech was also good. Prof. Adelman's speech was not bad, but had a somewhat defensive tone regarding the importance of English vs other more ostensibly practical pursuits, including a potshot at computer science.

Heather's father and youngest sister were there, as were the kids and I. Also, Marty and Jan. Jan seemed offended that I shook her hand in parting (as opposed to a hug). Oh well.

Low tech trust vs. high tech trust

I had an interesting discussion with Roger Dingledine and Luke (hacker-name armitage) over lunch at ETCon about the importance of manual input into attack-resistant trust metrics. Advogato uses the explicit,

[shit - this part of my entry got deleted - anyone have a cached version? Add "retrieve previous version" to m/ wishlist.]

Outline follows:

Trust metrics need manual input. Advogato uses manually-entered trust links. Google uses hyperlinks, which are usually manually entered.

"Low tech trust" is a pattern of manual information flow across trust links. Example: links to recommended content, represented by "hot lists" in the early days of the Web, blogrolling today. You can find a well recommended blog by starting with one you know, and clicking blogroll links. Google's PageRank calculates something similar.

Example: personal music recommendations. Automated recommendation systems (Firefly, Amazon) don't work as well as just getting recommendations from friends.

Both examples are attack-resistant, borrowing the analysis from trust metrics. In blog rolls, trust graph is explicit. In music recommendations, trust graph is implicit in flow of recommendations (they only flow between friends).

Low tech trust is good. Automated computation is harder, but opens interesting doors. Example: manual clicking on links is good at finding high quality random site, but Google is good at finding high quality site relevant to search term (and quickly, too). Killer apps for trust metrics will probably follow the same pattern.
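The blogroll pattern above can be simulated: a random walk over a made-up trust graph concentrates its visits on the blog that multiple trusted sources recommend, which is essentially the signal PageRank computes analytically. The graph and names below are invented:

```python
import random

# Toy blogroll graph: who links to whom (all names made up).
BLOGROLL = {
    "you": ["alice", "bob"],
    "alice": ["carol"],
    "bob": ["carol"],
    "carol": ["alice", "bob"],
}

def random_walk(start, steps, seed=0):
    """Click a random blogroll link `steps` times, counting visits."""
    rng = random.Random(seed)
    visits = {}
    node = start
    for _ in range(steps):
        node = rng.choice(BLOGROLL[node])
        visits[node] = visits.get(node, 0) + 1
    return visits

visits = random_walk("you", 1000)
# "carol" is recommended by both alice and bob, so the walk lands
# on her far more often than on anyone else.
print(max(visits, key=visits.get))
```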


Why am I so fascinated by trees?

One of my pie-in-the-sky projects is Athshe, a somewhat nebulous data format for trees. The writeup isn't that good, apologies. In any case, I've been kicking the ideas around in my head for quite some time now, and I still think it's worthwhile.

I'll probably write about some of the more detailed pieces, but I want to get some sense of the large scale across this time.

I think formats for tree-structured data have many intertwingled aspects. It might help to consider XML and filesystems as two special cases.

Abstract tree language

This is basically the language of what's allowed. Obviously, there are trees. The children of a node may be primarily sequential (as in XML), or primarily mapped by name (as in filesystems). What's in a leaf is also important - Unicode text in XML, arbitrary binary in filesystems.

It's good for the abstract tree language to be simple. Often, if you want more complex stuff, you find you can map it into simpler types fairly easily.

Athshe has exactly four types in the language: list, map, bytes, and tag (similar to MIME Content-Type). It's an appealing set, but I'm not sure it's quite right. In particular, I get the feeling that references (perhaps analogous to the # part of URL's) may be important too.
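To make the four types concrete, here is one possible in-memory representation; the tagged-tuple encoding is my own sketch, not anything the Athshe writeup prescribes:

```python
# Athshe's four types as plain Python data: ("kind", payload) pairs.
def bytes_node(b):             return ("bytes", b)
def list_node(*items):         return ("list", list(items))
def map_node(**entries):       return ("map", entries)
def tag_node(mimetype, child): return ("tag", (mimetype, child))

# A tagged document whose body is a map of named leaves:
doc = tag_node("text/plain",
               map_node(title=bytes_node(b"Athshe"),
                        body=bytes_node(b"trees all the way down")))
print(doc[0])   # the outermost node is a tag
```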


Serialization

Once you have a tree in hand, you want to serialize it, so you can store it in a file or move it over the net as one chunk. Also, because parsing is so well understood, it's often more useful to talk about the serialization than the abstract tree language.

For a given tree language, there may be many different serializations. In the case of filesystems, there's tar, pax, zip, and many others. XML is mostly just XML, but there's stuff like wbxml too. ASN.1 is another example of an abstract tree language with many different encodings.
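As a toy illustration of how one tree language admits many serializations, here is a bencode-flavoured encoder for a list/map/bytes/tag language, using ("kind", payload) tuples for nodes. The encoding is invented for this sketch and is not part of any spec:

```python
def dump(node):
    """Serialize a ("kind", payload) tree to bytes, bencode-style."""
    kind, payload = node
    if kind == "bytes":                      # length-prefixed, like bencode
        return b"%d:" % len(payload) + payload
    if kind == "list":
        return b"l" + b"".join(dump(c) for c in payload) + b"e"
    if kind == "map":                        # sorted keys for a canonical form
        return b"d" + b"".join(
            dump(("bytes", k.encode())) + dump(v)
            for k, v in sorted(payload.items())) + b"e"
    if kind == "tag":                        # tag is not in bencode; invented
        mimetype, child = payload
        return b"t" + dump(("bytes", mimetype.encode())) + dump(child) + b"e"
    raise ValueError(kind)

tree = ("list", [("bytes", b"hi"), ("tag", ("text/plain", ("bytes", b"x")))])
print(dump(tree))
```

A second encoder with a different wire format (say, length-prefixed binary chunks) could serialize exactly the same abstract trees, which is the point: the tree language and its serializations vary independently.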

Access API

If you want to actually process trees, you'll need an API to access the tree data. In the case of XML, the API of choice is DOM. In the case of filesystems, it's the traditional Unix open/read/write/close API, or a variation.

The access API has many implications for performance. For one, grain size is important. For another, it's important for nodes to be very, very cheap, but active node references (open file descriptors) can be much less cheap. DOM doesn't quite get this right.

Another important API consideration is how calls map across the process boundary. For files, most calls cross the process boundary, which generally implies that files are not good for very small grains. However, note that fread() and fwrite() manage to cache a goodly fraction of file accesses within the process.
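The fread()/fwrite() point can be demonstrated directly: a user-space buffer turns thousands of tiny application reads into a handful of boundary crossings. The `CountingRaw` class below is a stand-in for a real file descriptor, counting how often the "kernel" is actually asked for data:

```python
import io

class CountingRaw(io.RawIOBase):
    """Fake raw file that counts how often it is actually read."""
    def __init__(self, data):
        self._buf, self.calls = io.BytesIO(data), 0
    def readable(self):
        return True
    def readinto(self, b):
        self.calls += 1            # one simulated "system call"
        return self._buf.readinto(b)

raw = CountingRaw(b"x" * 10_000)
f = io.BufferedReader(raw, buffer_size=4096)
while f.read(1):                   # 10,000 one-byte reads by the app
    pass
print(raw.calls)                   # only a handful of raw reads
```

Ten thousand one-byte reads at the application level cost only a few raw reads underneath, which is exactly the caching fread() does in C.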

This is a good place for complexity to set in. There are all kinds of important issues in real systems only incidentally related to the abstract data language, such as asynchrony (nonblocking I/O), integrity (see fsync), access control, and so on. Applications may also want generalized undo (versioning), copy-on-write, transactions, and more.

Finally, the access API is a reasonably good place for change notification to happen. Unix has been traditionally deficient in this area; it's a standard feature in NT. In the case of XML, DOM has it, but with flaws.


Network protocol

It used to be enough to access a tree on local disk or in memory. Now, you usually want to do it over the network too. For this, you need a protocol.

Protocols for remote access to filesystems are a very rich, fertile, and increasingly mature area. People use this stuff heavily, though not quite yet universally. Performance is important, and there are lots of clever ideas. A major performance goal is to cut down on the number of network boundary crossings.

In the XML world, the most important protocol is HTTP, which basically gives you the entire tree as one chunk. You can also run DOM over CORBA, as described in the DOM spec, but the performance would be horrible. There's no good general way to access a large tree incrementally.

Random access file format

Filesystem trees typically don't get manipulated as tarfiles. Usually, the filesystem implementation stores the tree on disk in some random access format. Only a small part needs to be in RAM, and it's pretty quick to get to an arbitrary place in the tree. Again, this is a mature area. In the case of filesystems, it's almost a solved problem.

XML doesn't really have a random access file format. There are such things as "hierarchical databases", but they are not popular.

The main idea of Athshe is that a simple abstract data language makes the problem of building a high-performance suite of tools more tractable. The major performance theme is scalability. Operations on a small local tree should be almost as fast as a pure in-memory implementation. On the other end of the scale, it should be possible to access huge remote trees efficiently. The chunk size of process and network boundary crossings should match the chunk size of the application; in particular, traversing a complete subtree should cost about the same as transferring it as a serialization.

I find it very difficult to communicate why this is such a good thing. But see a recent thread on wmf's discussion page, which reminded me of my Athshe ideas.


AaronSw's ETCon schedule with weblog links is useful and interesting. One day soon, most conferences will be annotated like this, probably in realish time.
