Older blog entries for raph (starting at number 412)

Lots of stuff

Hi all! I've been fairly inwardly focussed for the past few months, but there's a lot of stuff happening now, and I'm feeling more like reaching out to the world. Usually this time of year I start feeling like I want to hibernate, what with the evenings getting dark and the rain beginning, but this time I seem to have even more energy than usual.

A tough logic puzzle

Do you like difficult puzzles? Wanna show off your brilliance to the rest of the world and make a little money to boot? Take a look at Ghostscript bug 688990. I spent more than a week trying to reverse engineer the imagemask interpolation algorithm used by Adobe PostScript, based on the original Mac implementation from twenty or so years ago, but was only able to come up with an approximate answer.

Feel free to post comments, questions, or requests for more test images to the bug itself. The "bountiable" keyword means that the solution (hopefully implemented as code) gets a check for, I think, $500.


My font releases are moving forward. Inconsolata, in particular, is just about done, and that's now released under the new SIL Open Font License. There are a few other goodies posted on my font pages, for people who haven't seen them in a while.


I met Nathan Hurst about six years ago when I gave a talk at linux.conf.au. We chatted about Libart, and have pretty much gone our separate ways since then.

Libart, as you'll recall, was the graphics library behind Gill, which begat Sodipodi. Sodipodi, in turn, begat Inkscape, which is starting to draw a lot of attention and users. In any case, Inkscape now uses Cairo for the rendering, but the vector-based geometry operations are still somewhat messy and ad-hoc, so Nathan and others have founded the lib2geom project to address those needs.

As it turns out, I have both interest in and need of these kinds of basic computational geometry primitives for my font work, especially stroke offset, intersection (for making nice clean outlines), and conversion to optimized Beziers. I have various prototypes written in Python and so on, and have sent those to Nathan.

With luck, all of this stuff will come together as efficient, robust C++ code, and then my dream of having a good implementation of next-generation font tools will be that much closer. I'm also hopeful that, by joining forces with Nathan and others on the lib2geom project, Inkscape and other vector-based free software projects can benefit.


It looks like the new spam filter here is working swimmingly. I've long felt that the trust metric ideas were sound, but that they needed more time and energy on their implementation than they were getting. Looks like Steve is doing a great job on that, and I hope that the success here inspires other people as well.

One project people might want to take a look at is the Bitchun Society, by Joseph Petviashvili. It basically implements an eigenvalue trust metric similar to the diary rankings here, but as a Jabber bot. I don't really know whether this particular implementation has the mojo to really take off, but the more trust metric toys there are out there to play with and learn from, the better.

Other social connections

I've been busy in lots of other ways too. Last night I had dinner with Till Kamppeter and a hundred or so other Ubuntu developers. We're working toward merging ESP Ghostscript into the main Ghostscript repository, something which our move to GPL-only licensing was meant to enable. We have a few details to iron out, but I'm very hopeful about the improved user experience people should see as a result.


Last Tuesday I worked as an election judge (fancy name for pollworker) at a precinct up the hill in Berkeley. I've become pretty cynical about the political process, and participating in this civic ritual at the neighborhood level was a great anodyne to that cynicism.

I signed up largely out of concern for the mischief potential of all these fancy new voting machines. As it happened, our Sequoia Optech Insight jammed about three hours into the election, so we were back to putting paper ballots into a ballot-box, essentially stone-age technology. Most people seemed happy with that, and I'm pleased to report that our precinct was able to account for all but one of the 800+ pieces of paper we started with, at the end of our 14-hour day.

My faith in democracy is much restored. I can heartily recommend working at the polls to fellow Advogatans. It's a great way to become more involved with your community and your country.

Ghostscript leading edge is now GPL!

I have some great news to report. The leading edge of Ghostscript development is now under GPL license, as is the latest release, Ghostscript 8.54.

By switching to the GPL, we're reaffirming our commitment to the free software world. One big reason for this decision was to reduce the lead time between bugs being fixed in the development tree and users seeing the fixes, especially those users dependent on Linux distributions.

Moving forward, we'd also like to resolve the effective fork with "ESP Ghostscript," so that our development tree is suitable directly for use in Linux distributions without a lot of extra patches. It would be very nice if all the GPL patches could be incorporated into the main tree without any license restrictions (which means that we need copyright assignment), but realistically, we'll still have to implement an apartheid system of some kind, so that a GPL-only subdirectory exists that gets deleted out of our commercial releases.

As Ralph Giles has posted recently, we're looking for a person to oversee this integration work, and to work more closely with the distributions and others in the free software community. Please let either of us know if you're interested. This might also be a good time to remind people of our "bug bounty" program, which pays a nice little bonus for fixing bugs in our tracker marked with the "bountiable" keyword.

We haven't been getting a lot of development work from the free community recently, but we continue to get extremely valuable testing, patching, and other quality assurance. Thanks again to everybody in the community for this - it's much appreciated, and putting our leading edge development branch into GPL is one way of saying "thank you." I'm excited about the potential for working more closely with people in the free software world.


I'm posting this from our booth at WinHEC in Seattle, having just seen the keynote by Bill Gates. There's lots of cool technology and devices, but overall I got the sense of a totalitarian vision, no more so than in the "FlexGo" initiative designed for developing countries, in which people don't buy so much as rent PCs, and rely on a DRM-mode access control system that shuts the computer off if they don't pay.

A lot of the stuff they showed at the keynote has to do with reducing the amount of manual configuration necessary. A lot of Windows Rally seems to be playing catch-up with Bonjour (formerly Rendezvous, and closely related to zeroconf, which is slowly but surely getting implemented in the free space). I think there's a lot of potential in this space, especially for first-principles research digging into the question of how much manual configuration is truly needed, as opposed to piling hack upon hack.

Even though this is a Windows-centric conference, there are some developers who really grok the cross-platform and open source worlds. One app (which has asked not to be named) uses wxWidgets, and they're even considering OCaml. One of the main things holding them back from that would be the wx bindings, which currently only exist in very crude form. That's got me thinking again about choice of languages, and I'll probably be blogging about that. Among other things, I should take another look at wx to see whether it's Good Enough(tm) to build the cross-platform GUI stuff I need, or whether I should keep going with my own very lightweight C abstraction layer (check out the darcs repo if you want to play with it).

High DPI

One of the features promised for Vista is support for high-dpi displays. In the December beta of Vista, I played with setting the dpi to 192, and the results were terrible - in many cases, fonts were scaled doubly, once by being sensitive to the dpi setting, and again by the compatibility-mode scaling. The February beta was a lot better, so it's possible that it will kinda sorta work by the time Vista ships. That said, Samsung is here showing their family of flat panels, and none of their panels push dpi past what was widely available a couple years ago.

Apple has also been making various noises about high-dpi applications, most notably David Hyatt's blog entry on high-dpi web sites. There are all kinds of crufty ways of detecting whether the browser is high-dpi, based on CSS3 selectors and so on, but there's no clean simple way to do it.

David obviously can't say much about Apple's future product plans, but you can probably read between the lines when he mentions his Dell laptop with 1900x1200 (145dpi) resolution, not to mention the fact that he's working on this stuff at all. Apple is in a good position to innovate here - it would fit the pattern they set with 802.11, FireWire, combo drives, and more than a few other things.


I had a great time in the Netherlands - both working and having fun. A highlight of the trip was meeting great people like Dave Crossland (minimal web presence) and Jeroen Janssen.


There are a number of basic algorithms needed for any serious curve application, including stroke offset, intersection, and conversion to lower-level operations for rendering. The standard representation for curves is, of course, piecewise (cubic) Beziers, and in this representation the implementation of all these basic algorithms is reasonably well understood.

However, these problems can't yet be considered solved in the free software world, because there is a lot of software out there that implements them badly (including FontForge, which I'd really like to see improved), and there isn't a really good library out there that you can just call. Rendering, yes, Beziers make that really simple. Stroke offset and intersection, though, are considered pretty difficult in the Bezier formulation. Offset, in particular, has a well-deserved reputation for being numerically tricky when starting from Beziers. See Comparing Offset Curve Approximation Methods for a pretty good survey of the problem.
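
To make "stroke offset" concrete, here is a minimal Python sketch that samples points at a fixed distance to the left of a single cubic Bezier. It is not taken from any of the software mentioned above, and the names `cubic_point`, `cubic_deriv`, and `offset_samples` are made up for illustration. It only produces sample points; the genuinely hard parts (fitting good Beziers to those points, and coping with cusps where the offset folds back on itself) are deliberately left out.

```python
import math

def cubic_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier at parameter t, using the Bernstein form."""
    mt = 1.0 - t
    a, b, c, d = mt**3, 3 * mt * mt * t, 3 * mt * t * t, t**3
    return (a * p0[0] + b * p1[0] + c * p2[0] + d * p3[0],
            a * p0[1] + b * p1[1] + c * p2[1] + d * p3[1])

def cubic_deriv(p0, p1, p2, p3, t):
    """First derivative (tangent vector, not normalized) of the cubic at t."""
    mt = 1.0 - t
    a, b, c = 3 * mt * mt, 6 * mt * t, 3 * t * t
    return (a * (p1[0] - p0[0]) + b * (p2[0] - p1[0]) + c * (p3[0] - p2[0]),
            a * (p1[1] - p0[1]) + b * (p2[1] - p1[1]) + c * (p3[1] - p2[1]))

def offset_samples(ctrl, dist, n=16):
    """Sample n+1 points at signed distance dist to the left of the curve.

    ctrl is a sequence of four (x, y) control points. Samples where the
    tangent vanishes are skipped, because the normal is undefined there.
    """
    pts = []
    for i in range(n + 1):
        t = i / n
        x, y = cubic_point(*ctrl, t)
        dx, dy = cubic_deriv(*ctrl, t)
        norm = math.hypot(dx, dy)
        if norm == 0.0:
            continue
        # The left-hand unit normal of tangent (dx, dy) is (-dy, dx) / norm.
        pts.append((x - dist * dy / norm, y + dist * dx / norm))
    return pts
```

For a straight segment such as `offset_samples(((0, 0), (1, 0), (2, 0), (3, 0)), 1.0)` every sample lands on the line y = 1. For a genuinely curved segment the true offset is in general not a cubic Bezier, which is why approximation methods are needed at all.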

I've been spending lots of time with other curve representations, including the clothoid (spiral of Cornu). My main motivation has been to make a better UI for editing curves, but I'm starting to get the sense that they may be better for the under-the-hood tasks as well. While in the Netherlands, I worked out a closed-form equation (in Cesàro form) for offset curves of the Cornu spiral, and am inclined to believe that it's both simpler to code and likely to give better results (speed, robustness, accuracy) than previous methods.
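
For readers wondering why a closed form is even plausible here, the following is a standard differential-geometry sketch. It is not necessarily the equation referred to above. The symbols $d$ (offset distance, measured to the left of the direction of travel), $s_d$ and $\kappa_d$ (arc length and curvature of the offset curve), and the normalization $\kappa(s) = s$ for the Cornu spiral are assumptions chosen for illustration.

```latex
% Curve p(s) parameterized by arc length, with unit tangent T, left normal N,
% and curvature \kappa(s), so that T' = \kappa N and N' = -\kappa T.
% Offset curve at signed distance d:
\[ q(s) = p(s) + d\,N(s), \qquad q'(s) = \bigl(1 - d\,\kappa(s)\bigr)\,T(s) \]
% Hence, wherever 1 - d\,\kappa(s) > 0:
\[ \frac{ds_d}{ds} = 1 - d\,\kappa(s), \qquad
   \kappa_d = \frac{\kappa(s)}{1 - d\,\kappa(s)} \]
% Cornu spiral normalized so that \kappa(s) = s, with s \ge 0 and 1 - d\,s > 0:
\[ s_d = s - \tfrac{1}{2}\,d\,s^2
   \;\Longrightarrow\; 1 - d\,s = \sqrt{1 - 2\,d\,s_d} \]
\[ \kappa_d(s_d) = \frac{1 - \sqrt{1 - 2\,d\,s_d}}{d\,\sqrt{1 - 2\,d\,s_d}} \]
```

As $d \to 0$ this reduces to $\kappa_d = s_d$, the original spiral, and for $d > 0$ it blows up at $s_d = 1/(2d)$, which is where the offset curve develops a cusp.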

I'm blogging this partly to test the waters for a collaboration. I can see that happening in a few different ways. Maybe there's someone out there who really needs a solution to problems like stroke offset, and is willing to consider a new approach rather than a rehash of existing techniques. Alternatively, there might be a bright student or two who really want to stretch their numerical and computational geometry skills, and want to work with a mentor who's put a lot of thought into the problem. Either way, the result is likely to be a journal paper and a codebase published under a nice free software license.


Dave showed me screenshots and so on from Xara, for which the source code has just been released. I tried building it on Ubuntu Breezy, but ran into just enough make problems to run out of patience. Even so, it looks very interesting. I was getting something of a "too good to be true" vibe from the preannouncements, but now the code is out there, and the people behind it are showing up at free software events like the Libre Graphics Meeting. This project looks like it may well transform the landscape for free 2D graphics tools.


I'm citing this 1744 book by Leonhard Euler in my thesis chapter on the elastica, and need the following bit of Latin translated into English. Anyone out there who can handily read this, or recommend someone else who can?

ut, inter omnes curvas ojusdem longitudinus, qu\ae\ non solum per puncta A \& B transeant, sed etiam in his punctis a rectis positione datis tangantur, definatur ea in qua sit valor hujus expressionis $\int {ds \over RR}$ minimus.

I have a pretty good idea what it says, but don't trust my own ability to get all the cases and so on correct. And some of the words don't seem entirely standard to me. "ojusdem"?


I'll probably be spending the first couple of weeks of April in Venlo, the Netherlands, visiting a customer site. It might be cool to meet up with some free software hackers and font people in the area.

To follow up on either (or both) of these, my best email address is <firstname>.<lastname>@gmail.com.


I have really fallen out of the habit of blogging, but I haven't exactly been a hermit as in many of my other blogging lulls. I've been meeting up with quite a few people who have come through town (tor is here now), and generally keeping quite busy. Work, in particular, is hopping right now.

I come not only to bury auto*, but to praise it

I'm not at all surprised by the defenses of auto* in response to my rather harsh criticism.

Dom Lachowicz writes: "I've yet to see a build system that attempted to fill auto*'s niche and fill it as well as auto* currently does." I agree completely, and perhaps my praise was simply too faint. The goal of making software building Just Work on a wide variety of Unix-like systems is extremely noble, and until auto*, it wasn't even obvious that it could be done.

I'd like to amplify even more. A lot of good free software is inspired by the existence of good proprietary software, in the sense that Gimp was inspired by Photoshop. If nothing else, the proprietary software represents an existence proof that it is possible to attain those goals.

I think this story applies somewhat to version control systems. We've had consensus for a very long time that CVS needed improvement and probably replacement, but it wasn't really until BitKeeper came along that the lightweight distributed version control systems (such as arch, darcs, and mercurial) started coming out of the woodwork.

Now, in the proprietary platform space, build systems are very slick, but none of them give a rat's ass about portability to other platforms. To the contrary, the nicer an IDE is to work in, the less likely the developer is to escape the golden handcuffs. Lock-in is the highest goal. If we're going to create a much better build system, we have to look to ourselves for the inspiration, because we're not going to find it anywhere else. auto* was the first great existence proof, and I think it is high time for others.

Andy Tai and others call for incremental improvement to auto*, including a gradual phase-out of M4, but, with David Turner, I'm not sure that's really feasible. I believe a program of incremental improvement to auto* will never really be able to reduce the overall system complexity. And I do believe that a much simpler system is possible, especially without the demands of adhering to M4, least-common-denominator make, and least-common-denominator shell.

I admit I did overstate some of my original points for the sake of rhetoric. There are, indeed, good reasons to use other compilers than the GNU toolchain. Ralph Giles takes me to task for not acknowledging the importance of Solaris, but for the applications I'm personally most interested in building (font editors and the like), these vendor Unices are vastly less important than native Win32 support.

Dom writes: "Regarding auto*'s tendency to work around deficiencies in ld/cc/nm/etc..., all I can counter with is 'we don't control the horizontal and the vertical'." In response, I ask: Who does? Bill Gates? Maybe after he figures it out we can try to clone it?

I'm not calling for violent overthrow of the auto* hegemony. I am calling for:

  • A profusion of prototypes of new autoconfiguring build systems, much like the distributed version control systems we've seen come out in the last couple years or so.

  • A careful look at which aspects of make/ld/package managers/etc are holding us back, and clear goals enunciated about how they might be fixed.

  • A more quantitative approach to thinking about building, perhaps empirically measured in challenges, where students are forced to use the tools to build and package a trivial app for Linux, Mac, and Windows platforms, and entries are scored based on time taken, defects in the results, and so on.

I've had a strong enough long-term interest in this field that I am likely to make one such prototype myself. One reason I'm blogging about this now is to test the waters, to figure out whether there are other people thinking along similar lines, or whether I'm pretty much just pissing into the wind as far as the broader free software community is concerned.

auto* delenda est

David Turner (freetype)'s recent post in response to titus reminded me of my own auto* aversion. In sum, I think auto* represents everything that is bad about a free software project.

Don't get me wrong, auto* was (and still is) a tremendous improvement over the bad old days of hand-editing makefiles just to have a chance of having your software build. But it is well past time to have designed, implemented, and deployed a better alternative, and I don't see too many good signs of that.

What's wrong with it? Let me enumerate the ways:

1. It's way too complicated. Good software and free software, not to mention good free software, run on simplicity. auto* does not have this quality.

2. It's implemented in bad languages. One bad language would be enough, but M4, (portable) make, and portable shell? There's a good reason nobody else has even attempted writing an app in that combination of languages.

3. Original goals are no longer very relevant. In the bad old days, there were lots of vendor Unices and other strange build environments. Today, in the *nix world, there is just the GNU toolchain. The amount of actual diversity that needs to be configured around is minimal.

4. It doesn't solve real-world portability problems. For many users, getting programs to build on Windows is at least as important as compiling on an ancient MIPS running Ultrix, yet auto* isn't much help with the former.

5. Bad error reporting. In "configuration science", one of the overriding goals should be production of clear and meaningful error messages.

6. Lack of overall systems thinking. Much of what auto* does is work around limitations in tools such as sh, make, ld, package managers, and the like. If some of these other components are better places to solve configuration and build problems, let's do it there rather than twisting ourselves into pretzels trying to work around them. Apple had the guts to extend ld in several important ways, including two-level namespace support. Why are we still stuck with the clunky late-'80s approach copied from old vendor Unices?

It's been clear for a long time that CVS needed replacing, and now we have a variety of great alternatives, some of which exhibit that classic, simple, do-one-thing-and-do-it-well free software philosophy. We should have something similar for the problems that auto* solves.


I find myself posting from Japan once again. Why is it that I'm more likely to find a free moment here than back at home in Berkeley? Anyway, it's nice and cold, and I even got to see some snow up north in Matsumoto.

The happiest baby on earth

I took a picture of Alan when he was a baby, and for a while it was one of the first-page hits on Google Image search for the keyword "happy". Over time, several people have asked to use the picture.

Most recently, it graces the front page of UC Riverside, where it is used to illustrate a research study on the nature of happiness. That makes it official: when he was a baby, he was the happiest on earth.

The funny thing is, the day I took that picture, he was also most unhappy. We were packing for a move, and he was quite cranky that we were paying attention to all these boxes and things instead of him. I took a break for a few minutes, and he was soooo happy, I decided to take a picture. He was still happy to be the focus of attention, and I think the pic shows that.

Nokia 770

rillian brought his Nokia 770 when visiting here, and it seems really cool. The kids liked it, as well - Alan surfed to the neopets site and was able to log in, and Max made a drawing with the sketch (primitive paint) app.

My take is that the form factor is a winner, but I think I'll wait until the second generation to actually get one. The CPU is quite pokey by modern standards, and memory is tight.

It does run Ghostscript right out of the box, though! It seriously looks like it's a lot easier to develop for than your usual handheld.

More trust metric

I got a gratifying response to my trust metric rant in the last post - a couple of emails, some blog comments. It's clear now that I need to do a more detailed writeup of exactly how to implement the eigenvector-based trust metric in the context of a large Wiki.

Pete Zaitcev writes: "One half is spam and abuse, and other half is that conventional, highly credible and trusted wisdom is simply wrong." I'm not sure exactly what he means by this, but it may have something to do with the fact that, from the perspective of approximately one half of the population of this country, approximately the other half is under the spell of a mass psychosis in which the usual rules of reality simply don't apply anymore.

It's not clear to me how a large wiki should handle this situation. One intriguing possibility is that the subgraphs of sane people and deluded people both form cliques, so that when a sane or deluded person is logged in and the trust metric is computed from their node, they see a version of the page that is factual and objective, or conforms to the parameters of their delusion, respectively.

The Clever search engine from IBM research has an interesting take on this issue. While PageRank and the Advogato trust metrics compute only the principal eigenvector, the Clever researchers also compute some of the others, resulting in "clusters". They report, for example, that the second eigenvector of the link graph for webpages on abortion neatly separates pro-life from pro-choice. Indeed, this very eigenvector is likely to correlate very strongly with the sane/deluded distinction described above. The sign of this correlation is, of course, left as an exercise for the reader.
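
As a rough illustration of what "computing the principal eigenvector" means in practice, here is a small power-iteration sketch in Python. It is a generic PageRank-style computation, not the Advogato or Clever code. The function name `trust_ranks`, the `damping` value, and the `seed` parameter (which restarts the random walk at one user's node, in the spirit of computing the metric "from their node") are all assumptions made for the example.

```python
def trust_ranks(edges, seed=None, damping=0.85, iters=100):
    """Approximate the principal eigenvector of a damped link matrix.

    edges maps each node to the list of nodes it certifies (links to).
    With seed=None the walk restarts uniformly, giving one global ranking.
    With a seed node the walk always restarts there, giving a ranking as
    seen from that node.
    """
    nodes = sorted(set(edges) | {v for targets in edges.values() for v in targets})
    n = len(nodes)
    if seed is None:
        restart = {u: 1.0 / n for u in nodes}
    else:
        restart = {u: (1.0 if u == seed else 0.0) for u in nodes}

    rank = dict(restart)
    for _ in range(iters):
        # Every node receives its share of the restart probability...
        nxt = {u: (1.0 - damping) * restart[u] for u in nodes}
        for u in nodes:
            out = edges.get(u, [])
            if out:
                # ...plus an equal share of the rank of each node linking to it.
                share = damping * rank[u] / len(out)
                for v in out:
                    nxt[v] += share
            else:
                # A node with no outgoing links hands its rank back to the
                # restart distribution, so the total stays at 1.
                for v in nodes:
                    nxt[v] += damping * rank[u] * restart[v]
        rank = nxt
    return rank
```

With two disconnected cliques, say `{"a": ["b"], "b": ["a"], "c": ["d"], "d": ["c"]}`, calling `trust_ranks(graph, seed="a")` leaves "c" and "d" at exactly zero, which is the toy version of each group seeing only its own side of the graph.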


My teeth are a bit sore. Turned out the cavity I was to have filled today had more decay than expected, so I get to have a crown instead of a filling. Could have been worse, it didn't go into the nerve, so I don't need a root canal.

Time for a website for free font development?

With all the recent activity in free font land, I decided to set my ideas down "on paper", and posted a thread over on typophile. I chose to post it there rather than here because I want the input from people in many different communities, especially type wonks. If you're interested in free fonts, whether as a user or as a developer, head on over and add your 2 {(euro )?cents|pence|yen|whatever}.

Time for a trust metric enabled wikipedia?

I see that Wikipedia is having some well-publicized troubles with vandalism and the like. This will be a somewhat bittersweet response.

The success of wikis has taken a lot of people by surprise, but I think I get it now. The essence of wiki nature is to lower the barrier to making improvements to content. The vast majority of access-controlled systems out there err strongly on the side of making it too hard. The idea of a wiki is to err on the side of making it too easy, and to lessen the pain (somewhat) of undoing the damage when that turns out to be a mistake. In cases where that doesn't work out, I think the solution is to make the decision process of whether to grant write access a bit more precise, so you can still err on the side of trusting too much, but you don't have to err quite as often or as badly.

In that regard, the trust metrics designed and implemented for Advogato are a near-perfect match for the needs of a Wikipedia-like project, but for the most part, nobody is paying much attention to my ideas. Yes, I am bitter about that. I've written them up in a howto and some draft papers, arguably not as polished a presentation as the ideas deserve, but still comprehensible to somebody motivated to understand them. I've implemented them and released the code under GPL. That implementation is too tied to the somewhat quirky mod_virgule design, but adapting and modifying is what free software is all about, no?

So I haven't exactly gift-wrapped the trust metrics and presented them to the world on a silver platter, but they're not sitting at the bottom of a locked file cabinet in the basement of the local planning commission either. With Google now worth a brazillion dollars, due in large part to the success of their eigenvector-based trust metric, and with the problems of spam and abuse showing few signs of just going away on their own, you'd think there'd be more interest in creative, high-tech solutions to the problem.

Let's say for the sake of argument that there's a 50% chance that I'm a raving moron when it comes to this stuff, that my belief that a trust metric would go a long way to solving problems such as Wikipedia's is just plain wrong. Say there's also a 50% chance that there are practical problems I don't foresee, so, while the basic ideas might be valid, they just won't work on a project like Wikipedia. Of course, you can dispute the exact numbers, but that leaves something like a 25% chance that it really would be worthwhile for someone to invest the time and energy into making it happen. How much is Wikipedia worth to people? How much is the idea of decentralized collaboration, especially so that you don't have to rely on "content serfdom" to get the good stuff?

Of course, the Free Software Way(TM) would be for me to pick up a shovel, dig in, and implement a trust metric enabled wiki myself. Well, pardon me for ranting, but in this case I believe the FSW is just plain dysfunctional. A large part of the reason I'm reluctant to invest much more of my own time and energy is the tepid reaction to the work I've put in so far. How is it that a community can generate dozens of IRC clients, me-too distributions, window managers, and PHP bbs engines, and yet leave the development and implementation of the Advogato trust metrics almost completely ignored?

Wow, even I am amazed at the intensity of that rant. I did say this post would be "bittersweet", but so far it's been pretty much all bitter. The sweet part is basically that I have faith that, in time, the Advogato trust metrics will be understood and implemented as widely as deserved based on their ability to resist abuse. Free software development, in particular, operates on a pretty slow clock. My last post contains a striking example - the lag of many years between my release of a prototype watercolor simulator and the inclusion of the ideas in an actively developed app.

And already, I see some tentative signs of that. The Wikipedia development boards have some discussion of "trust metrics," although I don't see much evidence they actually understand the power of Advogato's. Additionally, there is some academic work starting to build on my own, including Paolo Massa's evaluations of the various extant trust metrics, and Daniel Stewart's "Social Status in an Open-Source Community", published very recently in the American Sociological Review.

And who knows, maybe even this post, despite the bittersweet tone, will inspire someone to take another look at my trust metric ideas. Hopefully somebody who has the technical ability to implement something a little more sophisticated than the usual PHP hash, and whose idealism about free culture and individual-centered web content has not become quite as jaded as my own. If someone out there were to do a nice job implementing an attack-resistant wiki, that would do wonders for reinforcing my faith in the community.
