Older blog entries for raph (starting at number 349)

Clear and informative error messages

For any software which is to be considered mission-critical, one of the top priorities must be to produce clear and informative error messages when something goes wrong. It might be helpful to consider this the primary goal, with production of the correct result a pleasant side effect of the special case of no errors.

Of course, as maintainer of Ghostscript, I bear a great deal of responsibility for violating this principle myself. So, at the risk of the pot calling the kettle black, I humbly present criticisms of some existing free software projects, and suggestions about how to improve matters.

My most recent bad experience with cryptic error messages was a simple permissions problem in Subversion. A log file had 644 permissions, where 664 was needed. However, the actual error report looked something like this:

svn: Couldn't find a repository
svn: No repository found in 'svn+ssh://svn.ghostscript.com/home/subversion/fitz'

Trying to track the problem down, I ran svn locally on the machine hosting the repository, resulting in this error:

svn: Couldn't open a repository.
svn: Unable to open an ra_local session to URL
svn: Unable to open repository 'file:///home/subversion/fitz'
svn: Berkeley DB error
svn: Berkeley DB error while opening environment for filesystem /home/subversion/fitz/db:
DB_RUNRECOVERY: Fatal error, run database recovery

I ended up diagnosing the problem using strace, which did print out a clear and informative error message, once I found it:

open("/home/subversion/fitz/db/log.0000000002", O_RDWR|O_CREAT|O_LARGEFILE, 0666) = -1 EACCES (Permission denied)

How did Subversion succeed in transforming such a clear error condition into such a confusing (and alarming) report? I think it's likely that the main culprit is the use of abstractions which do not support the error reporting goal as stated above. If you have a tower of abstractions, then it is essential for each abstraction in the tower to support it.

Of course, aside from Ghostscript, one of the absolute worst offenders for error reporting is the auto* toolchain. A simple problem such as a missing library often results in cryptic error messages, usually the fallout from incorrect macro substitution.

Macro substitution, while an appealingly powerful abstraction, is absolutely hopeless when it comes to mission-critical error recovery. In a typical scenario, you'd use macro expansion to rewrite your goal (create a suitable configuration file for building a program) into subgoals (such as testing whether certain compiler flags work), and so on. However, when something goes unexpectedly wrong in one of the subgoal steps, it's all but impossible to trace that back up to the original goal - the only thing that remains is the expansion. Using procedures to break a goal into subgoals works in much the same way as macro expansion, but doesn't suffer from this inherent problem - when something goes wrong, the caller can look at the error returned by the callee. Of course, it's still the responsibility of the coder to actually check the return code and do something appropriate with it, a responsibility that is all too often ignored.
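Here is a minimal sketch in C of the procedure-based approach (the function names and the journal path are purely illustrative, not Subversion's actual code): each layer checks the return code of the layer below and adds its own context, so the final report names both the original goal and the underlying cause.

#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Low-level subgoal: make sure the journal file is writable. */
static int open_journal(const char *path)
{
    FILE *f = fopen(path, "r+");
    if (f == NULL) {
        fprintf(stderr, "can't open journal '%s' for writing: %s\n",
                path, strerror(errno));
        return -1;
    }
    fclose(f);
    return 0;
}

/* Higher-level goal: open the repository. The caller checks the callee's
   return code and adds its own context, so the user sees the whole chain
   from goal down to root cause. */
static int open_repository(const char *repo)
{
    char path[1024];
    snprintf(path, sizeof(path), "%s/db/log.0000000002", repo);
    if (open_journal(path) != 0) {
        fprintf(stderr, "can't open repository '%s'\n", repo);
        return -1;
    }
    return 0;
}

int main(void)
{
    return open_repository("/home/subversion/fitz") != 0;
}

Run against a log file with 644 permissions, this prints both the permission error (with the offending path) and the repository it belongs to - exactly the information the svn output above was missing.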

chalst: is this link evidence enough of vendor participation?

see here for yesterday's entry

Life

Life is good. I just got back from a week with the family at the Quaker Yearly Meeting in San Diego, and am feeling refreshed and re-energized. The kids, in particular, had a great time running around with their PYM buddies.

Remarkable stupidity

From Dan Gillmor: Rebecca Mercuri, an extremely knowledgeable critic of electronic voting systems, was kicked out of a conference of election officials in Denver. Their excuse, that she lacked credentials (a professor at Bryn Mawr, fer cryin' out loud), would have been a lot more credible if they had kicked out all the shills for the voting machine companies as well.

This kind of thing is merely illustrative of something that's gone deeply wrong with America. Money and power are what's really important in the decision-making process; truth is an annoyance that gets in the way.

High resolution displays

I've been using the term "high resolution" in talking about computer displays with, uhm, higher resolution than the 96 dpi or so that's standard on desktops these days, but I'm not happy with the term, as just about all displays are "high resolution" compared to something.

Thus, I propose the following general terms for classifying display resolutions: "dot matrix" is less than 144 dpi, "near letter quality" is 144 to 195.9 dpi, "letter quality" is 196 to 383.9 dpi, and "Star Trek quality" is 384 dpi and above.
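In code, the proposed buckets are just a couple of comparisons. A trivial C helper (the thresholds are simply the ones I'm proposing above, nothing standard):

/* Classify a display by resolution, using the buckets proposed above. */
static const char *resolution_class(double dpi)
{
    if (dpi < 144) return "dot matrix";
    if (dpi < 196) return "near letter quality";
    if (dpi < 384) return "letter quality";
    return "Star Trek quality";
}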

I've been saying for a long time that "near letter quality" and "letter quality" displays will become important. Now, I think we're really just around the corner, as these displays are becoming available in consumer-priced gadgets.

Sadly, desktop computer users are stuck with dot-matrix resolution for the near future. I did a survey of available LCD's and found that nearly all new panels are in the range of 85-100 dpi. In some ways, this is good news - lower resolution panels (such as 1024x768 17" -> 75 dpi) used to be available. However, there is little or no movement on the upper end of the range (I'm not counting specialty-priced panels such as the IBM T210, T220, and friends).

The laptop situation is a little better; resolutions on high-end models are inching up steadily, and we've just now seen near-letter-quality models (such as the Dell D800 with a 1920x1200 15.4" -> 147 dpi screen) available in the US market at commodity prices (specialty priced laptops such as the NEC Versa P700 have been available in Japan for about a year).

But where higher resolution displays have been really taking off is in smaller portable gadgets. In fact, Sony's current $100 grayscale and $180 color Palms (the SJ20 and SJ22) have 320x320 2.8" -> 160 dpi screens. In the Japanese market, we see even higher resolution devices, such as the Sony U101, with a 1024x768 7.1" -> 180 dpi screen, and the Sharp Zaurus C7xx line with 640x480 3.7" -> 216 dpi (and running a Linux kernel, no less).
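All of the dpi figures above come from the same bit of arithmetic: diagonal resolution in pixels divided by diagonal size in inches. A quick C check against the panels mentioned in this entry:

#include <math.h>
#include <stdio.h>

/* dpi = diagonal pixel count / diagonal size in inches */
static double dpi(double xres, double yres, double diag_inches)
{
    return sqrt(xres * xres + yres * yres) / diag_inches;
}

int main(void)
{
    printf("Dell D800 (1920x1200, 15.4\"):  %.0f dpi\n", dpi(1920, 1200, 15.4));
    printf("Sony SJ22 (320x320, 2.8\"):     %.0f dpi\n", dpi(320, 320, 2.8));
    printf("Sony U101 (1024x768, 7.1\"):    %.0f dpi\n", dpi(1024, 768, 7.1));
    printf("Zaurus C7xx (640x480, 3.7\"):   %.0f dpi\n", dpi(640, 480, 3.7));
    return 0;
}

The output matches the figures quoted above to within a dpi or two of rounding.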

There are some good reasons for the popularity of higher res screens. In many cases, the actual angular resolution of these displays is not all that much higher than desktops, because people view them at a much closer distance. Comfortable viewing distances are particularly small in the red-hot youth market, because young people typically have much better accommodation than oldsters such as myself. Of course, the Japanese are also going to be more into small gadgets with higher resolution (as needed for adequate Kanji display) compared with their SUV-driving American counterparts.

It'll take a few years, but dot-matrix quality LCD's are going to be as obsolete as dot-matrix printers. I hope that a GNU/Linux environment will be able to use near letter quality and letter quality screens effectively, but I haven't yet seen many encouraging developments.

Fonts and hinting

What David Turner said, with a few additions.

First, I'm obviously concerned about displaying PostScript and PDF documents, for which the goals of high-fidelity, accurate rendering and high-contrast, legible text are often in tension. These document formats, for better or worse, are deeply rooted in scalable font technology. Trying to use bitmap fonts, no matter how pretty they are, is not going to work well.

Second, as the resolution of screens goes up, the tradeoff between accuracy and contrast shifts in favor of (unhinted) antialiasing. At 200 dpi, which will be standard in a few years, the contrast of unhinted aa text is plenty good enough for just about everybody. The challenge is how to get there from here. One of the obstacles is the large installed base of software which is incapable of scaling with display resolution. It's a Catch-22: there isn't the pressure to fix the broken software until the displays become cheap, and the motivation isn't there to do high volume manufacturing of the displays until there is software that works with them. Microsoft is in a position to break through that, and if they do, I'll be quite grateful.

By the way, a really good place to start would be to make double-clocked pixel rates on CRT's work. Commodity video cards typically support pixel clocks in the 360MHz range. That'll handily run 2560 x 1024 (in other words, the standard 1280 x 1024 res double-clocked in the X direction) at 95 Hz. Of course, because of the shadow mask or aperture grille, CRT's can't actually display the full resolution. However, you still get the advantages of improved contrast and glyph positioning (spacing) accuracy. It's very easy to play with this - just double all the horizontal numbers in your XFree86 modeline, then run Ghostscript with a resolution such as -r144x72 or -r192x96.
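To make that concrete, here's what the doubling looks like starting from the standard VESA 1280x1024 @ 60 Hz timings (your monitor's preferred modeline will differ, but the transformation is the same: double the pixel clock and all four horizontal numbers, and leave the vertical ones alone):

# standard 1280x1024 @ 60 Hz
Modeline "1280x1024" 108.0  1280 1328 1440 1688  1024 1025 1028 1066 +hsync +vsync
# same mode, double-clocked in X
Modeline "2560x1024" 216.0  2560 2656 2880 3376  1024 1025 1028 1066 +hsync +vsync

With the doubled mode running, gs -r144x72 (on a desktop that was at 72 dpi) renders at the doubled horizontal resolution.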

Worth reading

A conversation between Jim Gray and Dave Patterson, via Tim Bray. Linger for a while at Tim's blog; it's one of the best reads out there.

Bullshit continued

I have two quantitative questions about bullshit:

  • How does the bullshit level vary between types of communication fora?

  • How does the bullshit level vary between various topics of otherwise similar intellectual content?

I was thinking about the latter question, especially, when responding to a rant by jwz about gamma correction. Gamma is not all that complicated or difficult, but a lot of people get it wrong, a huge fraction of what you find on the Web is bullshit, and you even see your share of kooks (and see Poynton's refutation).
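For anyone who hasn't run into it: in the usual power-law model, the camera or framebuffer stores a gamma-encoded value V = L^(1/2.2) and the CRT undoes it, L = V^2.2. The encoding is a feature, not a bug - it spends more code values on the dark tones, where the eye is more sensitive. A toy illustration in C (2.2 is the conventional approximation, not a property of any particular monitor):

#include <math.h>
#include <stdio.h>

#define GAMMA 2.2  /* conventional approximation for video systems */

/* Encode linear light (0..1) into a gamma-corrected code value. */
static double gamma_encode(double linear) { return pow(linear, 1.0 / GAMMA); }

/* What the CRT effectively does: code value back to linear light. */
static double gamma_decode(double coded) { return pow(coded, GAMMA); }

int main(void)
{
    double v = gamma_encode(0.18);  /* 18% gray lands near mid-scale */
    printf("encode(0.18) = %.3f, decode back = %.3f\n", v, gamma_decode(v));
    return 0;
}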

A quick experiment using Google searches shows that it's a lot easier to find bullshit about gamma correction than, say, the structure of rhodopsin. The query "rhodopsin structure" yielded 9 functioning links, all of which appeared to be high quality and free of bullshit. A similar search for "gamma correction" yielded 7 independent links, of which one was an ad for a product, and all of the remaining 6 had problems. The first hit is typical - it suggests that the nonlinearity between voltage and luminance in CRT's is a "problem" that needs to be "corrected", rather than a sound engineering choice for video systems. Their sample images are poorly considered, and reinforce this faulty notion.

Why is gamma correction so cursed? I think the main reason is that it doesn't belong to any discipline which is taught well in school, so there isn't a core of competent, respected people who know what they're talking about. Color science in general suffers from this problem. Even though color is a very basic part of everyday life, it intersects a wide range of academic disciplines, including physics, electrical engineering (particularly video), chemistry (less so these days now that digital cameras are replacing silver), psychology, computer science, and so on.

I use gamma correction as an example of a subject which needs good bullshit discrimination. How well does the web do this? Not very, at least measured by Google. There are some good resources on gamma out there, but they don't make Google's top 10, which presumably means that it's not popular to link to them. Do blogs do a good job? That's harder to answer because my own response skews things, but my sense is no.

Of course, I am thinking about a form of communication that seems to succeed in filtering out much bullshit: peer reviewed scientific publications. There are limitations, largely those of scope; for most important things that people care about, you can't find any scientific literature on the subject. Indeed, it would be very difficult to publish a paper about gamma correction in a prestigious journal, because it's a solved problem (in fact, television engineers got it right a long time ago, and it just took computer people to screw it up). The dollar cost of producing a peer-reviewed publication is also very high, but in many cases could be considered worth it.

PDF: Unfit for Human Consumption

Of course, it's possible that one of the big reasons that Poynton's Color FAQ is not a popular link target is the fact that it's in PDF format. Jakob Nielsen, in the above linked essay, argues that PDF has very serious usability problems as a format for Web pages. It is tempting because you have far more control over the aesthetics (and it works way better for printing), but overall I have to agree with Jakob.

The good news, I think, is that many of these usability problems are not inherent to the PDF file format, but can be fixed. Indeed, many of the complaints Jakob raises have to do with the awkward integration between the PDF viewer and the Web browser. Acrobat has its own UI, but in the free software world, there isn't any viewer whose UI is similarly entrenched. It shouldn't be hard to integrate a PDF engine into a Web browser, so that you can browse fluidly between HTML and PDF formats without caring all that much which is which.

13 Jul 2003 (updated 18 Jul 2003 at 20:58 UTC)
LTNB

Why has it been such a long time since I last wrote a diary entry? I'm not totally sure. I guess I've just been more inwardly focussed lately, especially on family issues (drop me an email if you're curious - I just don't want to write on the family's permanent Google record). But I've also been a bit of a hermit - I like it when there's no email or phone calls.

Even so, I have stuff to write about.

Sleep apnea

I've been trying to build a home sleep study so I can determine which factors affect the severity of the apnea (I'm especially interested in weight, even though my BMI is right in the middle of the curve). I'm about done, but it's taken more time and energy than I counted on.

Basically, the ingredients are:

  • A pulse oximeter (available from eBay for about $200-$300). I have the Ohmeda 3740, which I can recommend and which seems to be very popular for sleep studies.

  • Strain gauge belts for measuring "respiratory effort". I have two of the Grass Telefactor 6010, at $60 each.

  • A LabJack and the EI-1040 instrumentation amplifier for getting the signals into the laptop.

Basically, you plug the stuff into the LabJack. The Ohmeda has outputs in the right voltage range; just use 1/8" mono audio cords. The strain gauges need to be amplified - I use a gain of 1000 on the EI-1040. Since the input impedance of this amplifier is so high, you'll need some resistors to provide a return path for the input bias current.

The Linux driver for the LabJack is still very alpha, so for the time being I'm just using the Windows stuff. All I need is to log the data, and the LJlogger makes a very easy-to-use ASCII file (suitable for gnuplot).

I'll probably make a Web page with a more detailed recipe and the results as I find them.

Font rendering

Really high quality text and font rendering is challenging. It's not just a question of there being a "right way" to follow; there seem to be many ways to improve font rendering. Also, what constitutes "good" text is highly subjective. I personally favor a high fidelity reproduction of print fonts, even with some loss of contrast, while others prefer their fonts highly hinted. If you're in the former camp, OS X pretty much nails it; if you're in the latter camp, the RH desktop with Vera fonts and TT hinting enabled does.

But the ultimate goal, of course, is to combine fine typography aesthetics with high contrast rendering. This is a harder problem when the text metrics have to match the source exactly (as is the case for PostScript and PDF viewing), but is still challenging even when they don't. My favorite so far is Adobe Acrobat 5, but it's still not perfect. The big problem is that spacing errors are typically in the half-pixel range, which is not really pretty (the repeated letters emphasize the spacing errors; it's not as easy to see in body text). Also, in this sample you can see that the 'm' lacks symmetry, which bothers me.

Other attempts, in my opinion, don't work as well. In particular, the screenshots I've seen of Longhorn suggest that it'll distort the stroke weights to integers, but still suffer some loss of contrast in the case of subpixel positioning. Of course, they've still got some time to improve it before they ship.

Longhorn may have an even more significant consequence for us: it promises to support very high resolution displays. So far, there's been a bit of a catch-22 situation. High resolution displays are available but expensive, so very few have been shipped, and almost no software supports them, so there isn't the motivation to figure out how to manufacture and sell them cheaply. But if Microsoft puts their weight behind them, it could easily break this cycle.

I haven't seen any of the technology involved, but I'll take a guess. Since high resolution displays are around 200 dpi, it makes sense for non-hires-aware apps that do bitmap drawing to just double the pixels. In most cases, text should be able to go at full res without any software changes - the requirements are rather similar to simple low-res antialiased text.

So this is what I think they'll do. The default graphics context will be set up to double all coordinates before drawing, and zoom bitmaps accordingly. Apps that expect to draw to a 96 dpi screen will look about the same as they do now. Then, there'll be a call to get a hi-res graphics context if available, with 1:1 pixel drawing, and correspondingly higher precision for positioning glyphs. It'll be important to take this path for the Web browser, the word processor, and graphics software, but for a lot of other stuff it won't be as important.
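I have no idea what the actual API will look like, but the two-context idea is easy to sketch in C (everything below is hypothetical, just to make the guess concrete):

#include <stdio.h>

/* A graphics context carries a coordinate scale: 2 for legacy apps that
   assume ~96 dpi, 1 for apps that ask for the hi-res context. */
typedef struct {
    int scale;
} gfx_context;

static gfx_context default_context = { 2 };  /* legacy: coords doubled */
static gfx_context hires_context   = { 1 };  /* hi-res aware: 1:1 pixels */

/* Stand-in for the real device-level fill. */
static void device_fill_rect(int x, int y, int w, int h)
{
    printf("fill %d,%d %dx%d (device pixels)\n", x, y, w, h);
}

/* All drawing goes through the context, so a legacy app's 100x20 button
   covers 200x40 device pixels and keeps its physical size on a 192 dpi
   panel. */
static void fill_rect(gfx_context *gc, int x, int y, int w, int h)
{
    device_fill_rect(x * gc->scale, y * gc->scale,
                     w * gc->scale, h * gc->scale);
}

int main(void)
{
    fill_rect(&default_context, 10, 10, 100, 20); /* legacy app */
    fill_rect(&hires_context,   20, 20, 200, 40); /* hi-res aware app */
    return 0;
}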

The consequences could be dramatic. For one, if Apple isn't working on something similar, they'll face a mass defection of graphic arts types to the MS platform - once high res displays are affordable and really work, people will not want to go back. It'll be like trying to sell a black-and-white only lineup when the rest of the world is moving to color.

Second, I'd expect high res displays to come down to commodity pricing. I'm not an expert on the economics of displays, but I'd expect that the actual cost of manufacturing a high res LCD isn't much higher than a low res. I think that's much less true for CRT's, but they're on their way out anyway.

If the Linux desktop folk have any real vision, they'll start working on support for high res displays now. A lot of what I'm talking about, with the 2x coords and bitmap zooming, could be done at the X level so that legacy apps would work without ridiculous tininess. Then, modern GUI toolkits could grab the higher resolution context and get crisp, accurately positioned text. I'm not holding my breath, though.

If high res displays become widespread, then the need for high-tech font hinting basically goes away, in much the same way that the need for fancy dithering algorithms went away when video cards went from 8 bits to 24. So, long term, I'm not sure it makes sense to invest a lot of work into hinting.

Bullshit

chalst: Your link to the essay on bullshit is most excellent. I've been toying with the idea of doing a blog essay on the theme of lies and lying, but I now see that "bullshit" is the superior concept. It applies so well to so many things I've been thinking about recently: the "evidence" justifying the war against Iraq, the SCO lawsuit, the way Time magazine flogs TW/AOL product on their front cover, and countless more. It's all bullshit.

Now here's a question: is the blog form more or less prone to bullshit than mainstream media? I'd say there's a lot more diversity in blogs, so if you're seeking to cut through the bullshit, you have a much greater chance of success in blogs. But blogs also seem to be pretty good at spreading bullshit.

Hopefully, in this age of high-tech communications gear, the study of bullshit can become both quantitative and prescriptive - to design, and choose, communications fora with the explicit goal of minimizing it. Sign me up!

P2P-Econ Workshop

I spent the day, and will spend tomorrow, at the Peer-to-peer Economics Workshop at UC Berkeley. Most importantly, I'm getting to meet friends such as Bram and Roger, as well as new people such as Tim Moreton and Andrew Twigg, who are taking my stamp-trading ideas forward into new and interesting territory.

I've always been skeptical of economics as an intellectual discipline. It was originally called "the dismal science" because of the pessimism of some of its predictions, but I'm sure the name has stuck because of the quality of the science. Indeed, some of the presentations fit the stereotype perfectly: taking simple questions and muddying them all up with hokey math and fancy sounding theories.

However, the presentation by Hal Varian was a wonder to behold. In his hands, economics feels much less like a science than an art - an art of explanation. He presented a very simple model of how sharing (including both traditional forms such as libraries and new forms such as P2P networks) affects the pricing of media works. In fact, it was a dramatically oversimplified model, but that was ok. The simplicity of the model means that it's easy to understand, but it still captures something about the real world. Impressive.

There's good news and bad news on "reputation systems". The good news is that academics are starting to study the topic seriously. The bad news is that the system they're studying is eBay. There's nothing especially wrong with eBay, but I still find it sad to see directions of academic study driven so obviously by commercial popularity.

Regarding trust metrics, it's obvious that a lot of people who should know about them, don't. That's easy to fix, though, and indeed a workshop such as this one is one of the best ways to do so. And, of course, it reminds me that I really need to write up my ideas in academically citable form, for which finishing my thesis will do!

Long time no blog, again.

Ghostscript

We have new releases out, both GPL and AFPL. I recommend upgrading, if only to get the new security updates.

The 8.10 release has some seriously enhanced font rendering, thanks to Igor Melichev. Try experimenting with -dAlignToPixels=0, which enables subpixel rendering. I think we'll make this the default in a future release.
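A minimal example command line, assuming a build with the x11 display device (the alpha-bits options are the usual way to turn on antialiased rendering; some-document.pdf is just a placeholder):

gs -dAlignToPixels=0 -dTextAlphaBits=4 -dGraphicsAlphaBits=4 some-document.pdf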

I spent two days last week at a customer site. I'm glad we have both paying customers and free users - addressing immediate needs is fun, and the commercial world is often better at expressing appreciation for work than the world of free software.

You probably saw that Ghostscript is leaving the GNU umbrella. This has been brewing for some time, but only last week got publicity. I think we weathered it pretty well, all things told.

Responses to threads on Advogato

nymia: It's interesting that you say DOM is the future, because I was very much of the same opinion four years ago. Then I tried to actually implement stuff with it, and became disillusioned.

The ideas behind DOM are sound, but the spec itself has a lot of bad engineering. Among the highlights: It's virtually impossible to implement DOM in a memory-efficient way. Character encoding is forced to be UTF-16 (a Java-ism). It's difficult to do DOM memory management without garbage collection. The event propagation model is broken, and doesn't support multiple Views in a Model-View pattern.

I hope somebody engineers a better DOM-like tree access API, and that's the future.

yeupou and dobey: yes, I'm also unhappy with the state of viewers based on Ghostscript. They don't have to suck, but they seem to.

At this point, I think the best bet may be GSView, which is scheduled to be released under the GPL around the same time as 8.0, in November. There's a good chance that it will become the viewer of choice.

Meanwhile, I note that xpdf 2.0 is out. The good news is that they've replaced their hand-rolled GUI toolkit with a real one. The bad news is that it's Motif. I've been meaning to contact Derek for a while - perhaps I will soon.

Worth reading

Here are some interesting things I've read recently:

Retroactive Moral Conundrum, a piece by Tim Bray on the Iraq war. I agree with his main point - while getting Saddam out was a good thing, having our administration lie about things constantly, and having nobody really call them on it, is a bad thing. Take the time to browse the rest of Tim's blog while you're there - it's now one of my favorites.

Philip Greenspun on Israel. I'm not sure I agree with what Philip says, but it's very thought-provoking.

Sleep

I just had my second sleep study this morning, and this time they did find some apnea. Next, we'll see what to do about it.

There are a few factors which have changed since the last time, one of which is that I'm 160 lbs now, about 20 more than when I had the first study done. I'm going to experiment with losing weight again, and this time hopefully I'll be able to track any improvements. So, again, I'm interested in putting together a home sleep study. The sound is fairly easy, but it doesn't give a clear indication of breath cessation. I think the least invasive technique for measuring that is a flow meter, but real sleep studies also add an EEG and a pulse oximeter.

Again, if anyone knows a good, inexpensive source for this equipment, or has experience doing something similar, suggestions are greatly appreciated.

Xr, fonts

I talked in depth with Keith Packard a few days ago. We spent a fair amount of time talking about font rendering, and also about the Xr project. Xr is interesting - it overlaps in goals somewhat with Fitz, but with a primary focus on interactive display applications.

One of the other things Xr gets right is that it's cross-platform - earlier Xrender work seemed much more Unix-specific.

I think the question of how to do high quality text rendering on the desktop is still open. One local maximum is the current performance of Xft with the Vera fonts and with TrueType hinting enabled (screenshot). This configuration succeeds in rendering high-contrast, fairly visually uniform text (stroke weight and spacing are quite uniform, but curves and diagonals are softer than vertical and horizontal lines). I personally find the 1-pixel stems to be a bit light, especially on my 132 dpi LCD screen, and in general prefer text that looks a little more like original, unhinted fonts. In particular, I like different sizes of the same font to be roughly consistent in darkness. With Vera, medium-big print is much, much lighter than small (as soon as the stroke weight goes to 2px, then it's darker again).

This kind of rendering is perfectly reasonable for GUI elements and HTML rendering, but not for WYSIWYG viewing (such as PostScript and PDF). For this, I think the tradeoffs shift a bit. Even aside from matters of personal preference, to ensure even spacing you need subpixel positioning. That, in turn, basically forces lower stroke contrast (although not necessarily as soft as completely unhinted rendering, such as OS X). The Xft API doesn't support subpixel positioning, but no doubt a more sophisticated text API for XRender will.

In any case, with all this playing with fonts and font renderers, I've rekindled work on my own font, LeBe, a revival of a font from 1592. Some of the glyphs still need work, but overall I'm pleased with the way it's going.

Formalism

Formal proofs don't mean that mathematics is reduced to no more than the manipulation of strings. A proof reflects the personal style of the person devising it, whether or not the individual proof steps can be formally checked. The need to recount the proof steps in detail is a constraint, just as the moves of chess or Go are a constraint.

Mathematics seems to get along pretty well without strict adherence to formalism, but I think that using mathematical techniques for computer programming is a different story. The mathematical content of most relevant theorems is mind-numbingly tedious, so I think you need a computer to check them, and probably to help generate them, for realistic programs.

Languages

I have a confession to make. I've been designing programming languages since I was nine years old (the very first was syntactic sugar for 8085 assembler). Most of them have never been observed in the wild, but one seems to have escaped and taken on a life of its own. Now, I find that there's an implementation of "99 bottles of beer" in it, as well as an interpreter written in OCaml.

I am no longer excited by Io; continuations seemed a lot more interesting when I was a teenager than now, although it is of course useful to know how to program with inverted flow of control. It is, perhaps, a useful illustration that continuations, like a number of other language primitives, are powerful enough that you can build an entire programming language from them and nothing else.

Also, I enjoyed chromatic's essay What I Hate About Your Programming Language. I tried responding on the comment page, but it wouldn't take my login. I'm sure the login problem is due to the website being written in some dynamic language that encourages messing with stuff, rather than good ol' C, which requires you to think through what you want the code to do first.

Seriously, while chromatic's mention of mod_virgule as a website written in C is gratifying, if I had to do it from scratch, I probably would use Python. I like programming in C (especially deep algorithms), and I like programming in Python (especially prototypes, and gluing things together).
