22 May 2002 raph   » (Master)

Expensive infrastructure

I'll try my best not to make David McCusker's head explode tonight.

His IronDoc and my Athshe are both examples of "expensive infrastructure." To do a good job implementing the ideas requires a lot of complex code.

David again makes the crucial distinction between interface complexity and implementation complexity. Ideally, you have a system which is easy to use, but might do very smart, sophisticated things under the hood. In the best case, doing things well at a lower level actually makes things simpler in the higher levels.

Filesystems are the perfect example. The basic interface is pretty simple. In fact, files and directories are the basic concepts exported to the UI, even in systems designed for novice users. Yet, you have implementations such as XFS, which weighs in at 140 kloc or so.

Simpler implementations are possible, of course. In fact, the only thing that could possibly justify this kind of complexity is to be a lot better. And if you compare a modern filesystem implementation against, say, the original Minix filesystem in Linux (about 2000 lines of code), there is no contest. Performance, scalability, and robustness have all improved dramatically over the years.

Yet, that original 2 kloc implementation was important. If you had told Linus that he had to implement a highly scalable journalling filesystem before he could release the Linux kernel, the project would have been dead in the water. As the kernel became more popular, however, the payoff for more sophisticated implementations became clearer. For one, a better filesystem implementation improves matters for all of the applications (totalling many millions of lines of code, no doubt) running on Linux, without making those apps any more complex.

What is the "right level" of complexity for something like IronDoc? I'm not sure. It's not popular enough yet for the amortized payoff argument to work. This is a bit of a Catch-22; it won't become popular until it's reasonably good. The usual way around this dilemma is to build a much simpler prototype first.

I think this kind of "expensive infrastructure" is a good place for highly talented programmers to spend their time. It is a theme which runs through quite a bit of my work, including, today, Ghostscript and the continuing evolution of its graphics library.

Soap and security

Security experts such as Bruce Schneier and Alan Cox are rightfully concerned about the security vulnerabilites of new systems built on protocols such as Soap. At his ETCon talk, Schneier quipped that Soap is "firewall-friendly" in the same way that a bullet is "skull-friendly".

Dave Winer's response is pigheaded. In fact, his new "standard response" says almost nothing about security.

Dave has asserted that getting through firewalls was not an explicit goal of Soap. Could be; doesn't matter. The fact is that it will be used to get through firewalls. More precisely, firewall configurations based on Web assumptions will be used to filter traffic that breaks these assumptions. In general, you have to look pretty deeply inside a Soap message to figure out how much (if any) damage it can do.

Firewalls are only part of the problem. Expect a flowering of attacks that simply request things through Soap unanticipated by the designers of the system. With highly dynamic scripting languages, it's even harder to imagine all the ways that a clever attacker might put together the pieces made available to him; I am sure that the SOAP::Lite security hole was merely the first of its kind to come to light.

In theory, it should be possible to create secure services based on Soap. In practice, it's just not going to happen. As Schneier argued so well in his talk, increasing security is expensive. In addition to the labor and expertise needed for audits, paying attention to security slows down development, curbs features, and generally makes software less competitive in the market. The nature of the software industry guarantees that products will be developed and shipped with security holes.

I'm not saying that Soap is a bad thing. Indeed, for many applications, the advantages surely outweigh the disadvantages. Also, if you try to build those same applications with competing techniques, there will no doubt be security vulnerabilities from those, as well. What I am saying is that people should be conscious of the security problems that history and reflection tell us will occur.

I met Dave briefly at ETCon. I'm sorry we didn't have a chance to chat more.

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!