Older blog entries for raph (starting at number 355)

Arnold

We, the voters of California, have apparently just lost our friggin' collective minds. I just hope Arnold doesn't do too much damage.

The Great Unraveling

I just read Paul Krugman's book, "The Great Unraveling". It's a great, great book. I'm especially impressed at the way Krugman was able to gain so much insight about the Bush administration simply by looking critically at their economic policy. Highly recommended.

Chris Lydon

Also highly recommended: Chris Lydon's series of audio interviews posted to the Net. These are a great demonstration of how the Net can deliver content completely beyond the reach of the corporate media. I'll be listening, and hope to see more things like it.

QXGA

170 dpi laptop displays haven't reached commodity status, but they're getting closer. Eurocom has a "mobile workstation" with one, priced $600 above the next lowest res, which is actually fairly affordably priced.

My guess is that we'll see more of these. It will be very hard to pass one up when it comes time to replace my laptop.

The Final Solution

You might be an anti-spam kook if you have discoverd the final, ultimate solution to the spam problem (FUSSP). I scored shockingly high on the test. Of course, I realize that using a trust metric to defeat spam, while probably effective, won't be easy.

Electronic voting

Something is seriously rotten in the land of electronic voting. Consider:

  • Rebecca Mercuri was thrown out of a meeting of the IACREOT (International Association of Clerks, Records, Election Officials, and Treasurers) a couple of months ago for voicing criticism of the electronic voting machines being sold.
  • A group of researchers pulished a searing criticism of Diebold's touchscreen voting machines. These machines are a total joke in terms of security - they're based on Microsoft Access, so everything, even the audit logs, can easily be tampered with. Further, their use of crypto is spotty and contains amateurish mistakes such as reusing IV's in CBC mode. Diebold's response is lame, simply ignoring many of the points scored in the original paper.
  • The State of Maryland, on the verge of buying lots of Diebold machines, commissioned an "independent" study of the machines from SAIC (another cog in the military-industrial machine), which identified "several high-risk vulnerabilities" and concludes that the system is not compliant with Maryland's standards. The somewhat unbelievable response from the president of Diebold is: "The thorough system assessment conducted by SAIC verifies that the Diebold voting station provides an unprecedented level of election security."
  • The chief executive of Diebold is also working for the Bush campaign, and, in a recent fund raising letter, wrote that he is "committed to helping Ohio deliver its electoral votes to the president next year."
  • Even though Diebold is emerging as the owner of the most smoking gun, the other election machine vendors aren't coming across as being much better.
  • Diebold successfully takes down blackboxvoting.org by issuing a DMCA notice to their ISP, based solely on links posted at the site.
  • Leaked memos clearly indicate that Diebold routinely violates election guidelines, among other things by using versions of their software other than those certified.
  • In spite of all this, the state of Maryland is going forward with the Diebold contract.

This is a big story, I think. Even the mainstream press is starting to cover it. If there are any people reading this in Maryland who are good smart cards, just put in a million votes for the Green candidates. That ought to wake up the powers that be, and maybe the winner can do some good in the meantime.

It's also clear that we can do some good by raising a stink. The IEEE was all set to approve incredibly weak standards for electronic voting machines, but in response to the EFF's action alert, they actually sent it back to the drawing board.

Diary backlog

It's been a while since my last entry, and there are a lot of things I want to write about. Fortunately, most of them will keep.

Much of the past two weeks was taken up with our Ghostscript staff meeting, our exhibit at the Seybold show, and the various preparation and followup.

One of the nicest things about the time was to have fellow hackers stay with us, first Ralph Giles, then Tor Andersson. The kids got to spend time with both, and I think it was very enriching for them, and I hope enjoyable for our guests.

In fact, this was the first time meeting Tor in person. I'm really enjoying working with him - it's clear we have many goals in common. We had a few days of very intense discussion, covering the good and bad in TeX, the Fitz design (of course), and many meditations on the way software should be built.

Silliness

I've enjoyed drinking from a Sun Java mug since I first became its proud owner, back when Sun was making the original push. A few days ago, Sun announced the "Java Desktop", with all kinds of neat indemnification for its buyers. A few days ago, the handle broke off of my Java mug. I have never dropped it and didn't abuse it in any way, just mostly used it for nuking water for tea. Is there a causal relation between these two events?

I come up with all kinds of funny things in my dreams, but rarely such a good pun as this one: it's clear that discount merchandising needs its own XML language for describing the various flavors of clearance sales, everyday discounts, and so on. Thus, I propose "Markdown Markup language". It's a shame I didn't hook up with Tim Bray at Seybold; we were both there, and I'm sure he would have gotten a chuckle out of it.

Two notes on performant systems

It used to be that performance was a central component of just about every computer project. It had to be; computer time cost so much that wasting it was a real problem. These days, it is so cheap that we tend to be focussed on how to waste it most effectively - should all that power go into interpreter cycles in a high level language (such as Python), into building more robust abstractions for storage (such as SQL databases), or somewhere else entirely? The tradeoff is often stated as computer cycles (cheap) vs programmer cycles (expensive), but I'm not sure that's it entirely; the latter can always be outsourced to China.

So two notes on projects emphasizing performance caught my attention the last couple of days. The first is a criticism of Subversion by Tom Lord, who happens to be the author of Arch, a competing version control system (thanks to jdub for the link). The other is a post by Tim Bray writing about his recent experience writing a performance critical module in C. Both writers have lots of insight and experience, and are worth listening to.

A common element of both posts is how you have to take care to preserve performance in the currently trendy XML world. But one of the interesting things about Tim Bray's experience is that he was still able to get the performance he needed, and not too painfully either.

One of the nice things about XML is that it doesn't force you to work at a high level of abstraction. The spec itself is essentially a bridge between a low-level representation (a sequence of textual bytes in a fairly simple grammar) and a higher-level abstraction (trees of textlike thingies). By contrast, DOM is an abstraction that basically forces you to work at the higher level, with essentially a 1-1 mapping between the things in the abstraction (nodes and so on) and the objects that represent them. If Tim were forced to do his project in DOM, rather than having the choice of using low-level XML tools such as expat and his hand-rolled finite state machine, performance would have suffered unbearably.

The use of XML gives runtime compatibility with tools designed at a higher level of abstraction. In particular, I'm sure Tim could easily describe what his C program does in terms of XML nodes, etc. This used to be central to the craft of programming: take a description of the desired task, and create an efficient implementation of that task. These days, the world is more complex, so trying to figure out what the desired task is takes most of the programmer's time, and, as for efficiency, we can let Moore's Law take care of that.

Designers of abstractions should take this lesson from XML. Being a bridge between two different levels of abstraction is a good thing. A number of my favorite things have that flavor: the Unix process abstraction, the Python/C runtime embedding. Compare the latter with the JVM, which basically forces you to do everything at the higher level of abstraction of the virtual machine.

Lego Bionicle game

The kids stumbled across the Lego Mata Nui Online Game II. It's interesting because it's one of the more intensive uses of 2D graphics I've seen in a while. However, the implementation leaves something to be desired. It's too slow to be playable on the kids' 500MHz iMac, but fine on my dual 1.6GHz Athlon box. Even so, the implementation (in Flash) is a bit flaky.

The storyline is very much reminiscent of those Infocom text-adventure games of the '80s (and of course, today's retro revival), but with prettier graphics, orders of magnitude more computing requirement, and lots more crashing. Other than that, I'm not very knowledgeable about the Myst family of games, but it's probably a direct rip-off^Wtribute.

Handhelds

I've been looking at handheld devices, and have gotten a Sony Clie SJ20 ($100, 160dpi grayscale screen, 33MHz 68000) to play with. This class of machine is just a little too puny to take seriously; int's are 16 bits, you start running into various 64k limitations when you do anything real, and there's no real libc. However, the next generation of Palms is starting to look interesting indeed - these tend to have reasonably fast ARM chips, and the pricing is moving down, likely squeezing out the 68000-based models pretty soon.

High-end Palm 5 devices such as the Sony UX50 are even more interesting, not least because they now have WiFi networking built-in. But the big story for me is the display. The resolution is going up, and they're getting nicer in other ways as well.

Maybe someday, we'll all be running a free environment such as GPE on our handhelds, but in the meantime there's a demand for apps on PalmOS.

I haven't really looked into the build system for PalmOS 5, but it seems a little daunting. In particular, the free tools seem to be lagging the official ones (Mac and Windows only). In an ideal world, building would "just work", but we're certainly not there yet.

Pattern analysis

Here's a little amateur "pattern analysis" of my own. On one side, this quote from a c't interview of SCO's Chris Sontag (translated from German original by "Apogee"):

c't: You are acting fairly belligerent on this forum. You declared war against open source, since it becomes destructive for the software industry. Does the whole movement have to die so that a few software companies can live well?

McBride: Actually, that was more aimed at the GPL, not open source as a whole. There's a lot of very valuable effort in open source. But the extreme interpretation that nobody himself owns anything that he developed himself, that can't remain like this. With this, created value gets destroyed. The GPL must change or it will not survive in the long run. I have discussed with many exponents of the open source side about this already.

On the other side, Bill Gates, in the keynote address of the Microsoft Government Leaders Conference in Seattle, April 2002:

"Then you get to the issue of who is going to be the most innovative. You know, will it be capitalism, or will it be just people working at night? There's always been a free software world. And you should understand Microsoft thinks free software is a great thing. Software written in universities should be free software. But it shouldn't be GPL software. GPL software is like this thing called Linux, where you can never commercialize anything around it; that is, it always has to be free. And, you know, that's just a philosophy. Some said philosophy wasn't around much anymore, but it's still there. And so that's where we part company."

I'm not much of a conspiracy theorist myself, so I'll leave it to the rest of our crack IBM-funded* team of rocket scientists to run the spectral recognition and see what comes out.

* Full disclosure: in point of fact, IBM is a significant customer of my employer. Irony abounds, no?

Anger

I'm find Steven Rainwater's suggestion of an organized counterattack against SCO appealing, but ultimately I think our interests are best served by acting honorably and being careful to tell the truth. That way, the contrast between our approach and SCO's should be most apparent, even to lawyers and judges.

Even so, I want to acknowledge the incredible feeling of anger that is rising in me. It is incredibly unfair that a bunch of opportunists can hire a bunch of unethical* lawyers, stir up a tremendous amount of press for their increasingly outrageous lies, and ultimately profiting through insider stock trades, while those of us who actually create value by making software have to struggle.

Of course, I realize that the world is under no obligation to be fair, but the very institutions which claim to uphold justice and fairness, namely the courts and the press, seem compromised. I'm quite sure that when Judge Kimball finally rules on the case, he will not be kind to SCO. But this could take years. In the mean time, the press and the stock players continue to take SCO quite seriously, just on the basis of having filed a multibillion dollar lawsuit.

In the long term, this case could be very good for Linux and free software in general. It is very clearly a case of good guys vs. bad guys. The SCO execs and lawyers are, in fact, playing the latter role quite admirably. As such, I think the story has a much better chance of playing to the public than a dry philosophical debate over copyright, the public domain, and the public interst. It also has a much better chance of playing to the public than a venture capitalist-fueled hype wave, which, keep in mind, is the last taste the mass public has gotten of the Linux story.

So I think there is something we can do. Most newspapers at least pay lipservice to factual accuracy. Adopt your local paper and hold them to it, at least for articles on the SCO case (of course, no harm is done if this effort spills over into other aspects of free software). When doing so, be very professional. Don't fight FUD with counter-FUD. Concentrate on clearly verifiable inaccuracies, and provide journalist-friendly support for all your claims.

* Among the ethical lapses of Boies, Schiller, and Flexner in the SCO case, the clearest and most egregious are making of frivolous claims, and being a party to all the lying. The firm is not new to ethical controversy, including misrepresentations in the 2000 election case. Indeed, a Bar grievance committee in Tallahassee recently found probable cause that Boies had violated rules against misconduct. I sincerely hope their role in the SCO mess does not go uninvestigated.

8.11

Ralph did most of the actual work getting Ghostscript 8.11 out. It looks like a really good release. I'm pleased.

Email is pretty useless

Our email server has been completely inundated with the latest worm. As a result, email has been pretty much non-functional, and I've had to put in way too much time to nurse it along.

The entire email infrastructure is really decrepit. It's a classic tragedy of the commons situation - nobody is really responsible for keeping it healthy. I'm also not at all impressed by the technical performance of the sendmail + mailman + procmail + spamassassin combo I'm using. It just melts down when you start throwing real load at it. For one, the default configuration of mailman is qrunner (which runs once a minute from a cron job) to process no more than 300 messages. So, if you're getting more than 5 spams a second, prepare to watch the disk fill up. Fix that (set QRUNNER_MAX_MESSAGES higher in /var/lib/mailman/Mailman/mm_cfg.py), and watch the system reach its fork limit. This is really bad engineering.

Between this and the spam problem, email is looking like it has a really poor cost/performance ratio. It's no wonder people are flocking to alternatives wherever possible. A big part of the reason casper is hit so heavily is that it hosts the mailing lists where we do most of our work. I guess we're going to need to start looking carefully at migrating all that to web-boards, for all the disadvantages those entail.

I don't see any encouraging signs that suggest that email is going to get fixed any time soon (after all, this is version F of the worm, which means we've had the benefit of A through E as training exercises). That's sad, but it also means there's a tremendous opportunity for somebody to create something better. It will come as no surprise to my readers that I have some ideas how to do this. I feel tempted to write a more detailed essay, but don't really have the time right now.

Honor at low levels

Thanks to Jesper Louis Anderson for pointing me to Judy. It is damn interesting. There are some good lessons in the code. I've been discussing it a bit with Tor, because there are some parallels with the way I want to lay out Fitz trees in memory.

No honor at all

I'm astonished at the gall of the theiving liars that call themselves SCO. I'm even more astonished at the fact that their stock price seems to be doing fairly well, in spite of all the insider trading and stock manipulation that's going on. The fact that these parasites are making serious money rather than going to jail gives me despair about the way society is run these days.

LWN has had incredible coverage, but in general the mainstream media shows themselves to be utterly useless. By giving these clowns a forum to make their outrageous claims, they're just giving them credebility in the eyes of the sheep readership (which, no doubt, overlaps considerably with the poor shnooks who are taking long positions in the stock).

Clear and informative error messages

For any software which is to be considered mission-critical, one of the top priorities must be to produce clear and informative error messages when something goes wrong. It might be helpful to consider this the primary goal, with production of the correct result a pleasant side effect of the special case of no errors.

Of course, as maintainer of Ghostscript, I bear a great deal of responsibility for violating this principle myself. So, at the risk of the pot calling the kettle black, I humbly present criticisms of some existing free software projects, and suggestions about how to improve matters.

My most recent bad experience with cryptic error messages was a simple permissions problem in Subversion. A log file had 644 permissions, where 664 was needed. However, the actual error report looked something like this:

svn: Couldn't find a repository
svn: No repository found in 'svn+ssh://svn.ghostscript.com/home/subversion/fitz'

Trying to track the problem down, I ran svn locally on the machine hosting the repository, resulting in this error:

svn: Couldn't open a repository.
svn: Unable to open an ra_local session to URL
svn: Unable to open repository 'file:///home/subversion/fitz'
svn: Berkeley DB error
svn: Berkeley DB error while opening environment for filesystem /home/subversion/fitz/db:
DB_RUNRECOVERY: Fatal error, run database recovery

I ended up diagnosing the problem using strace, which did print out a clear and informative error message, once I found it:

open("/home/subversion/fitz/db/log.0000000002", O_RDWR|O_CREAT|O_LARGEFILE, 0666) = -1 EACCES (Permission denied)

How did Subversion succeed in transforming such a clear error condition into such a confusing (and alarming) report? I think it's likely that the main culprit is the use of abstractions which do not support the error reporting goal as stated above. If you have a tower of abstractions, then it is essential for each abstraction in the tower to support it.

Of course, aside from Ghostscript, one of the absolute worst offenders for error reporting is the auto* toolchain. A simple problem such as a missing library often results in cryptic error messages, usually the fallout from incorrect macro substitution.

Macro substitution, while an appealingly powerful abstraction, is absolutely hopeless when it comes to mission-critical error recovery. In a typical scenario, you'd use macro expansion to rewrite your goal (create a suitable configuration file for building a program) into subgoals (such testing whether certain compiler flags work), and so on. However, when something goes unexpectedly wrong in one of the subgoal steps, it's all but impossible to trace that back up to the original goal - the only thing that remains is the expansion. Using procedures to break a goal into subgoals works in much the same way as macro expansion, but doesn't suffer from this inherent problem - when something goes wrong, the caller can look at the error returned by the callee. Of course, it's still the responsibility of the coder to actually check the return code and do something appropriate with it; all too often ignored.

chalst: is this link evidence enough of vendor participation?

see here for yesterday's entry

Life

Life is good. I just got back from a week with the family at the Quaker Yearly Meeting in San Diego, and am feeling refreshed and re-energized. The kids, in particular, had a great time running around with their PYM buddies.

Remarkable stupidity

From Dan Gillmor: Rebecca Mercuri, an extremely knowledgeable critic of electronic voting systems, was kicked out of a conference of election officials in Denver. Their excuse, that she lacked credentials (a professor at Bryn Mawr, fer cryin' out loud), would have been a lot more credible if they kicked out all the shills for the voting machine companies as well.

This kind of thing is merely illustrative of something that's gone deeply wrong with America. Money and power are what's really important in the decision-making process; truth is an annoyance that gets in the way.

High resolution displays

I've been using the term "high resolution" in talking about computer displays with, uhm, higher resolution than the 96 dpi or so that's standard on desktops these days, but I'm not happy with the term, as just about all displays are "high resolution" compared to something.

Thus, I propose the following general terms for classifying display resolutions: "dot matrix" = less than 144 dpi, "near letter quality" is 144 to 195.9 dpi, and "letter quality" is 196 to 383.9 dpi, and "Star Trek quality" is 384 dpi and above.

I've been saying for a long time that "near letter quality" and "letter quality" displays will become important. Now, I think we're really just around the corner, as these displays are becoming available in consumer-priced gadgets.

Sadly, desktop computer users are stuck with dot-matrix resolution for the near future. I did a survey of available LCD's and found that nearly all new panels are in the range of 85-100 dpi. In some ways, this is good news - lower resolution panels (such as 1024x768 17" -> 75 dpi) used to be available. However, there is little or no movement on the upper end of the range (I'm not counting specialty-priced panels such as the IBM T210, T220, and friends).

The laptop situation is a little better; resolutions on high-end models are inching up steadily, and we've just now seen near-letter-quality models (such as the Dell D800 with a 1920x1200 15.4" -> 147 dpi screen) available in the US market at commodity prices (specialty priced laptops such as the NEC Versa P700 have been available in Japan for about a year).

But where higher resolution displays have been really taking off is in smaller portable gadgets. In fact, Sony's current $100 grayscale and $180 color Palms (the SJ20 and SJ22) have 320x320 2.8" -> 160 dpi screens. In the Japanese market, we see even higher resolution devices, such as the Sony U101, with a 1024x768 7.1" -> 180 dpi screen, and the Sharp Zaurus C7xx line with 640x480 3.7" -> 216 dpi (and running a Linux kernel, no less).

There are some good reasons for the popularity of higher res screens. In many cases, the actual angular resolution of these displays is not all that much higher than desktops, because people view them at a much closer distance. Comfortable viewing distances are particuarly small in the red-hot youth market, because young people typically have much better accommodation than oldsters such as myself. Of course, the Japanese are also going to be more into small gadgets with higher resolution (as needed for adequate Kanji display) compared with their SUV-driving American counterparts.

It'll take a few years, but dot-matrix quality LCD's are going to be as obsolete as dot-matrix printers. I hope that a GNU/Linux environment will be able to use near letter quality and letter quality screens effectively, but I yet haven't seen many encouraging developments.

Fonts and hinting

What David Turner said, with a few additions.

First, I'm obviously concerned about displaying PostScript and PDF documents, for which the goals of high-fidelity, accurate rendering and high-contrast, legible text are often in tension. These document formats, for better or worse, are deeply rooted in scalable font technology. Trying to use bitmap fonts, no matter if they're pretty, is not going to work well.

Second, as the resolution of screens goes up, the tradeoff between accuracy and contrast shifts in favor of (unhinted) antialiasing. At 200 dpi, which will be standard in a few years, the contrast of unhinted aa text is plenty good enough for just about everybody. The challenge is how to get there from here. One of the obstacles is the large installed base of software which is incapable of scaling with display resolution. It's a Catch-22: there isn't the pressure to fix the broken software until the displays become cheap, and the motivation isn't there to do high volume manufacturing of the displays until there is software that works with them. Microsoft is in a position to break through that, and if they do, I'll be quite grateful.

By the way, a really good place to start would be to make double-clocked pixel rates on CRT's work. Commodity video cards typically support pixel clocks in the 360MHz range. That'll handily run 2560 x 1024 (in other words, the standard 1280 x 1024 res double-clocked in the X direction) at 95 Hz. Of course, because of the shadow mask or aperture grille, CRT's can't actually display the full resolution. However, you still get the advantages of improved contrast and glyph positioning (spacing) accuracy. It's very easy to play with this - just double all the horizontal numbers in your XFree86 modeline, then run Ghostscript with a resolution such as -r144x72 or -r192x96.

Worth reading

A conversation between Jim Gray and Dave Patterson, via Tim Bray. Linger for a while at Tim's blog; it's one of the best reads out there.

Bullshit continued

I have two quantitative questions about bullshit:

  • How does the bullshit level vary between types of communication fora?

  • How does the bullshit level vary between various topics of otherwise similar intellectual content?

I was thinking about the latter question, especially, when responding to a rant by jwz about gamma correction. Gamma is not all that complicated or difficult, but a lot of people get it wrong, a huge fraction of what you find on the Web is bullshit, and you even see your share of kooks (and see Poynton's refutation).

A quick experiment using Google searches shows that it's a lot easier to find bullshit about gamma correction than, say, the structure of rhodopsin. The query "rhodopsin structure" yielded 9 functioning links, all of which appeared to be high quality and free of bullshit. The same search for "gamma correction" yielded 7 independent links, of which one was an ad for a product, and all of the remaining 6 had problems. The first hit is typical - it suggests that the nonlinearity between voltage and luminance in CRT's is a "problem" that needs to be "corrected", rather than a sound engineering choice for video systems. Their sample images are poorly considered, and reinforce this faulty notion.

Why is gamma correction so cursed? I think the main reason is that it doesn't belong to any discipline which is taught well in school, so there isn't a core of competent, respected people who know what they're talking about. Color science in general suffers from this problem. Even though color is a very basic part of everyday life, it intersects a wide range of academic disciplines, including physics, electrical engineering (particularly video), chemistry (less so these days now that digital cameras are replacing silver), psychology, computer science, and so on.

I use gamma correction as an example of a subject which needs good bullshit discrimination. How well does the web do this? Not very, at least measured by Google. There are some good resources on gamma out there, but they don't make Google's top 10, which presumably means that it's not popular to link to them. Do blogs do a good job? That's harder to answer because my own response skews things, but my sense is no.

Of course, I am thinking about a form of communication that seems to succeed in filtering out much bullshit: peer reviewed scientific publications. There are limitations, largely those of scope; for most important things that people care about, you can't find any scientific literature on the subject. Indeed, it would be very difficult to publish a paper about gamma correction in a prestigious journal, because it's a solved problem (in fact, television engineers got it right a long time ago, and it just took computer people to screw it up). The dollar cost of producing a peer-reviewed publication is also very high, but in many cases could be considered worth it.

PDF: Unfit for Human Consumption

Of course, it's possible that one of the big reasons that Poynter's Color FAQ is not a popular link target is the fact that it's in PDF format. Jakob Nielsen, in the above linked essay, argues that PDF has very serious usability problems as a format for Web pages. It is tempting because you have far more control over the aesthetics (and it works way better for printing), but overall I have to agree with Jakob.

The good news, I think, is that many of these usability problems are not inherent to the PDF file format, but can be fixed. Indeed, many of the complaints Jakob raises have to do with the awkward integration between the PDF viewer and the Web browser. Acrobat has its own UI, but in the free software world, there isn't any viewer whose UI is similarly entrenched. It shouldn't be hard to integrate a PDF engine into a Web browser, so that you can browse fluidly between HTML and PDF formats without caring all that much which is which.

346 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!