I spent most of the evening reading "Stupid White Men", by Michael Moore. It's a good book.

language design

I'm not sure why I've been thinking so much about language design. Paul Graham says a lot of interesting things that challenge the conventional wisdom. I guess that's the answer -- having read through his essays a couple of weeks ago, they have provoked thought.

One of the things that Paul Graham says is that libraries are really important. Most "language designers", especially academics, fail to take this into account. Graham raises the possibility that, in the future, libraries will be somewhat independent of the language. Currently, the choice of language and libraries is very closely bound together.

Of course, there are already a bunch of things that move us toward this goal. One such is CORBA, which most people know as a tool for making crappy network protocols. The goal of CORBA is noble: to allow systems to be built from components in lots of different languages with a magic tool called an ORB. In-process, the main function of an ORB is to translate between different object protocols.

However, in actual implementation, CORBA has earned a reputation for being painful, bloated, and inefficient. ORBs themselves tend to grow very complicated, for reasons that are still not entirely clear to me. I think a lot of the problems have to do with people trying to use CORBA to build crappy network protocols. That's a much harder problem, and impossible to do right within the CORBA framework, so people keep piling on more layers in the attempt.

Another very interesting thing is SWIG, which has basically the same goals as CORBA, but without the crappy network protocol part, and with a focus on dynamic languages rather than C++. Quite a few people use SWIG, apparently, but I am yet to be won over. I think my uneasiness rests with the fact that SWIG tries to make language bindings easy. I don't really care about that. What I do care about is making language bindings that are really good.

I find myself liking the Python object protocol (which is the central part of the Python/C API). It's not a tool for creating N^2 adapters between N object protocols of different languages. It's just a single object protocol. What makes it interesting is that it's a particularly good object protocol. It's not hideously complicated. It's not especially painful to write for it, although trying to do objects in C always seems to result in a lot of typing. It's reasonably lightweight - the object header overhead is typically 8 bytes. It seems to be quite powerful - just about everything you'd want to express in Python can be done in the object protocol. In fact, that's pretty much the way Python is implemented. A special delight is that it's debuggable, using standard tools like gdb. This is something that very few dynamic languages get right.

In short, it's everything an object protocol should be, except for one thing: it's pretty darned slow. For many applications, this simply doesn't matter. Either you don't care about execution speed at all, or you do but the object protocol overhead isn't a major factor, probably because you're using the library to do the "real work" with a relatively coarse grain. However, if you want to use the protocol for fine-grained object invocations, performance will suffer.

It's not at all clear to me how to fix this. Trying to optimize the thing for speed is hard, and will certainly increase complexity. For one, you start needing a static type system to be able to say things like "this is an unboxed integer". Python (and its object protocol) is very purely dynamically typed, and this is a big part of what makes it so simple. Maybe the answer is to accept that the object protocol itself is slow, and just keep making more and better libraries, so that you amortize the overhead of the object protocol over larger and larger grains of computation.
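To make the "larger grains" idea concrete, here is a minimal sketch in Python. The two function names are made up for illustration. Both compute the same sum, but the first goes through the interpreter once per element, while the second hands the whole loop to a built-in written in C. No timings are claimed here, since the ratio depends heavily on the interpreter version and the machine.

```python
def fine_grained_sum(values):
    """Sum one element at a time: every addition is dispatched by the interpreter."""
    total = 0
    for v in values:
        total = total + v      # one dynamic, boxed integer add per iteration
    return total


def coarse_grained_sum(values):
    """Make a single call and let the built-in run the per-element loop in C."""
    return sum(values)


data = list(range(100_000))
same_answer = fine_grained_sum(data) == coarse_grained_sum(data)
```

The coarse version still touches every object, but the per-call interpreter overhead is paid once rather than 100,000 times, which is the amortization described above.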

I don't know how practical it is to use the Python object protocol without the Python language. The two are pretty tightly coupled, and for good reason. But it's an interesting concept.

One of Paul Graham's quotes (from this) is that "Python is a watered-down Lisp with infix syntax and no macros". I see what he's trying to say, but also feel that he's missing something important. I think the Python object protocol is a big part of what he's missing. Lisp was never good at interfacing with other languages, in particular C, and the result is that it has sucky libraries, especially for things like I/O. Python fosters high quality wrappers for C code, which basically means that Python gets to participate actively in the world of C-language libraries. That, I think, is pretty damned cool, and more important than most people think.


I finally got off my ass and did some very basic maintenance on Advogato. In particular, the intermittent sluggish performance should be fixed now. Also thanks to Gary Benson for a memory leak patch - I'll be applying his XML-RPC patch as soon as I've had a chance to review it, which should be Wednesday.

I'm going to put aside a tiny but steady amount of time for Advogato improvements. This means, of course, that I'll need to prioritize the things on my wishlist. DV's suggestion to make interdiary links bidirectional seems really nice - it sounds like it will add considerable richness without disrupting the existing structure.

Another change I'd like to make soonish is to render real names in most contexts. The nicknames are cute, but I think they don't scale well. Of course, this kind of change is not rocket science, but those things are important too.

I want to do some kind of "rooms" thing, to make it less intimidating to post articles. Badvogato does this already. I'm thinking rooms for news, entertainment-type things (books, movies, etc), and so on. It could work nicely with custom views.

On the rocket science front, I think one of the most interesting things to do would be to run a principal eigenvector-based trust metric over the data, in addition to the current network flows. The attack resistance would be about the same, but the result would be a real-valued ranking rather than the current boolean yes/no (hacked up to be four-valued by repeating the runs). The main advantage is that these rankings would be deterministic and stable, which would solve one of the big user complaints about Advogato's trust metric. People don't like it when their color suddenly fades for no apparent reason.
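For a feel of what a principal eigenvector-based metric computes, here is a small power-iteration sketch in Python. It is not Advogato's code or Google's. The certification graph, the names in it, and the `pagerank` helper are all invented for illustration, and the 0.85 damping factor is simply the complement of the 0.15 constant from the PageRank paper.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power iteration over a dict mapping each node to the nodes it certifies."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every node receives an equal share of the (1 - damping) restart mass.
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = links.get(v, [])
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # A node with no outgoing links spreads its rank uniformly.
                for t in nodes:
                    nxt[t] += damping * rank[v] / n
        rank = nxt
    return rank


# Hypothetical certifications: who vouches for whom.
certs = {
    "alice": ["bob", "carol"],
    "bob": ["carol"],
    "carol": ["alice"],
    "dave": ["carol"],
}
ranks = pagerank(certs)
ordering = sorted(ranks, key=ranks.get, reverse=True)
```

The output is one real number per account instead of a yes/no, and running it twice on the same graph gives the same numbers, which is the property that matters for the fading-color complaint.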

On the flip side, it would be quite fascinating to run the Advogato trust metric on the Google data. This project would, I believe, make an excellent submission to the Google programming contest. The API to the trust metric engine is actually quite easy to understand. I think this is a doable project even for someone with modest programming skill. I'm also quite willing to release tmetric.c under terms compatible with the contest. Hint, hint.

All of my trust metric ideas are public domain. Obviously, this is not the case for a lot of the work in this field. In particular, I wouldn't be surprised if Google felt that principal eigenvector-based trust metrics would infringe their patent. Even if so, it's likely that the research exemption would apply for Advogato itself.

In any case, it's a moot question for now, because I really don't have time to code any of this stuff up.

More on trust metrics

I had a great discussion with Roger Dingledine tonight. Among other things, we were talking about my design for an attack-resistant peer-to-peer network infrastructure. My "stamp trading" idea buys you attack resistance in the sense of being able to reject spam email, but unless there's a good algorithm for setting exchange rates, it doesn't prevent denial of service attacks. Roger has been doing a lot of thinking about reputation, and there are quite a few interesting parallels between my design and his ideas for evaluating reputation in remailer networks (see the FreeHaven papers page for links to his work).

I have some fuzzy ideas about how to set exchange rates for stamps, but so far no hard analysis. Perhaps the most exciting thing about my recent breakthrough in analyzing PageRank is that I now have two powerful tools for analyzing attack-resistant systems: network flows (which I have had for some years), and random walks (which I have only had for a couple of weeks). I am very excited that random walks may be just the tool I need to crack the stamp exchange rate nut.


Alan turned six yesterday (actually two days ago, as I'm posting this after midnight). As we expected, now that he's got the motivation to read on his own, he's making incredible progress - he can read sentences containing words like "librarian" fluently now. My guess is that he will transition from learning to read to just plain reading within another couple of months.

He's very interested in the concept of infinity, and the idea that infinity plus one is still infinity. I made the mistake of bringing up the fact that there are in fact different infinities, aleph-null being countable, and the uncountable ones being bigger. He insisted that I explain this to him (including a very rough outline of the Cantor diagonalization argument). When I was done, he told me, "I didn't realize numbers could be so boring until now". But I don't think I've soured him for life :)

Max is also developing very rapidly. He's talking up a storm now, and is gaining more and more grammatical concepts. We just noticed that the singular/plural distinction is now very reliable. For the most part, he's still not doing full subject-verb-object sentences, but he still makes himself pretty well understood. A couple of weeks ago, when I got home, he greeted me with "help -- puter -- stuck". :)

5 Mar 2002 (updated 5 Mar 2002 at 07:16 UTC)

Thanks to everyone who wrote - my last entry got a very satisfying number of responses!

While I do find Scientology fascinating, the main reason I'm interested in it now is that they have a very strong track record of using media in innovative ways to further their goals. Thus, if a metadata system claims to be "attack-resistant", then its ability to deliver both pro- and anti-Scientology links is a very good test of that claim.

Of course, it's hard to evaluate a search engine based on "scientology" results alone. For one, it's likely that the operators of the search engine will either bias the results the way they feel about Scientology, or try to counteract a bias they perceive. Different search engines will approach this differently. Altavista, which is basically a pure keyword engine, reports about 48 pro links before you get to the first anti. MSN's search does quite well: 4 of the top ten links (5-8) are high-quality anti sites. Based on various innuendo I had heard, I expected Earthlink to do fairly badly. However, they just rebrand Google, so the results are identical.

My attack resistance result on PageRank doesn't say anything about the rank (more familiarly known as "googlejuice") of a particular page. Rather, it bounds the total googlejuice that can be captured by a determined attacker. I haven't figured out the implications of this yet.
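As a toy illustration of how a bound on total captured rank can arise (this is not the proof from the thesis, just a sketch under loud assumptions): suppose the restart mass goes only to one trusted seed page, and the attacker's pages link only to each other. Writing r(B) for the farm's total rank, d for the damping factor, and f for the rank crossing from good pages into the farm, the fixed point satisfies r(B) = d*r(B) + d*f, so r(B) = d/(1-d) * f no matter how many pages the farm contains. The graph, the node names, and both helper functions below are hypothetical.

```python
def pagerank(links, teleport, damping=0.85, iterations=200):
    """Power iteration; `teleport` maps every node to its share of the restart mass."""
    rank = dict(teleport)                       # start from the restart distribution
    for _ in range(iterations):
        nxt = {v: (1.0 - damping) * teleport[v] for v in teleport}
        for v, targets in links.items():        # assumes every node has an outlink
            for t in targets:
                nxt[t] += damping * rank[v] / len(targets)
        rank = nxt
    return rank


def attacker_total(farm_size, damping=0.85):
    """Return (rank captured by the link farm, bound d/(1-d) * inflow) for a toy graph."""
    # Three good pages; g2 has been fooled into linking to the attacker's first page.
    links = {"seed": ["g1", "g2"], "g1": ["seed"], "g2": ["seed", "bad0"]}
    farm = [f"bad{i}" for i in range(farm_size)]
    for i, page in enumerate(farm):
        links[page] = [farm[(i + 1) % farm_size]]   # farm pages link only to each other
    teleport = {v: 0.0 for v in links}
    teleport["seed"] = 1.0                      # assumption: restart only at the trusted seed
    rank = pagerank(links, teleport, damping)
    captured = sum(rank[page] for page in farm)
    inflow = rank["g2"] / len(links["g2"])      # rank crossing the single good-to-bad link
    return captured, damping / (1.0 - damping) * inflow


small_farm, small_bound = attacker_total(1)
big_farm, big_bound = attacker_total(50)
```

In this toy setup the farm's total is fixed by the one confused link into it, so growing the farm from 1 page to 50 only spreads the same amount of rank more thinly.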

I didn't get as much time as I would have liked today to write - there are always other things that come up. I'm trying my best to ignore my email and tell everyone else to bugger off, but it's not easy.

One mystery I haven't been able to resolve: why is it that anti-Scientology websites have such atrocious HTML layout?

speaking of search engines...

...I notice that they're becoming quite a bit richer in the document languages they search. It used to be just HTML, now all good search engines seem to be able to handle a dozen or two of the most popular formats. I am, of course, professionally interested in their PostScript interpretation capabilities. That link itself isn't all that interesting. What will be more so is to see how various search engines handle it.

<mischievous grin/>

Nearly Headless Nick

Nick is the recycled old laptop now functioning as an 802.11b access point in my studio. By now, I feel I would have been far better off buying an Airport, or one of the Linksys or D-Link jobbies. However, it was kind of fun to get 802.11b running. In the past, it's been kind of flaky, especially in Master/Managed mode, so I just ran it in Ad-hoc. But after an apt-get upgrade toasted the PCMCIA on the system, I upgraded the kernel and prism2 driver (http://people.ssh.com/jkm/Prism2/) to their latest stable versions, and I find that it generally works quite well.

One trick was to change the cardmgr options to "-f", so that the init scripts would wait for the cards to initialize before starting named and dhcpd. Otherwise, they'd start up without the IP address being configured, which obviously wasn't happy.

I played around with the power-saving modes, but couldn't detect an actual effect on battery life. The card (a DWL-650) seems to run pretty cool, and I expect that even at full power, it sucks down a lot less juice than a 900MHz P3, big nice LCD, and hard drive. Selecting power modes did cause interesting log messages on the AP, including what looked like a reference count mismatch.

Nick is also a handy backup nameserver (after having gotten burned a few times, I now pride myself on running one of the best amateur DNS services around). At some point, if I can find a worthwhile P2P network that can run with minimal resources, I'll host that on there as well. New hardware is amazing, but old can be fun too.


I had a major breakthrough over the weekend. Inspired in part by the recent article about Scientology taking over all the top Google spots, I went back to my metadata chapter, which includes an analysis of the attack-resistance of Google's PageRank algorithm.

I now feel that I understand PageRank much more deeply. For one, I have a proof outline of its attack-resistance, which, as it turns out, is rather similar to Advogato's. There are substantial differences, though - one of the very nice features of PageRank is that it's deterministic and stable (ie, small changes to the graph cause small changes to the resulting rankings).

So now we have two known attack resistant trust metrics. One is based on network flow, another on principal eigenvalues - both highly classical algorithms, both reasonably tractable and scalable. This is intellectually a deeply satisfying result.

Based on my analysis, I'm able to provide reasonable answers for the following questions:

  • How did Scientology succeed in subverting the PageRank algorithm?

  • Why did registering a large number of domain names help them so much?

  • What exactly does "attack-resistance" mean in the context of Google?

  • How can PageRank be manually fine-tuned (with fairly minimal effort) to be even more attack-resistant?

  • What is the justification for the ||E(u)||_1 = 0.15 "voodoo constant" in the PageRank paper?

For the answers to these questions, you'll have to read the metadata chapter of my thesis. It's not quite written yet. A large part of the reason I'm posting this is to fish for requests to get that chapter written. So even if you find the above questions deathly boring, go ahead and send me an email feigning interest.

Python and Lisp

Someone else posted a link to Norvig's page comparing Python and Lisp. This is an excellent page, and really highlights how similar the two languages are, aside from syntax. This isn't all that surprising, as I've written some very Lisp-flavored programs in Perl in my day (here's one example from my thesis).

As I posted before, one of the turnoffs for Python for me is the really poor speed showing of the current implementation. What makes this even more galling is the fact that we've had Lisp compilers for some time now that are within striking distance of C, speed-wise. Hell, I even wrote one myself, about 18 years ago. So why is it that we still don't have an implementation of Python anywhere nearly as good, in this respect, as ancient implementations of an ancient language?

Another person who has a lot to say is Paul Graham. I linked his taste article a couple of diaries ago. I don't agree with everything he says, but it's all interesting.

One of the things I do agree with is his assertion that libraries are critically important (see section 6 of popular.html). If you believe this, then one of the best tests for the vitality of a programming language is the availability of libraries for that language. Python has one of the more interesting stories around today, in large part because it's relatively easy and clean to hook in C code. I think the fact that distutils can be used to cleanly package mixed Python and C is more important than most people give it credit for - in most other languages, mixing creates nice headaches in the build/package/distribute department.

If arc gets this in a deep way, and also gets a good implementation early on, then it will be an interesting language. I look forward to seeing how that goes.

1 Mar 2002 (updated 1 Mar 2002 at 06:29 UTC)

Thanks to tk for the response to my inlining query. "static inline" does indeed look like it might be the right answer. I'll dig into portability issues more and let dear diary know...


I am obviously very heartened to hear that the Gimp-Print project is having good success with Even Toned Screening. As it happens, I spent some time today tuning the algorithms for a paying customer. The result is here, in case anyone is brave enough to dig through the code.

The biggest advantage of EBS is the new "tandem screening" mode, in which all planes are screened at the same time. If you're making light blue, this means more pixels covered by cyan or magenta dots, and less either white or both. The result is even better smoothness, and also a slightly expanded gamut.

I've looked at the Adaptive Hybrid screening in Gimp-Print, and find it to be quite excellent. In addition, there's been a lot of attention paid to doing the 6-color separation to minimize the patterning from darker inks in lighter regions - it's a delicate balance between those patterns and making the page sopping wet with light inks. The result is that Gimp-Print's 1440x720 mode is overall a bit smoother than my rinkj prototype in solids and gradients. I bet that a large part of the reason why they're seeing an improvement in going to ETS is the inherent advantage of error diffusion over blue noise masks for highly detailed areas such as line work.
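For anyone who hasn't met error diffusion, here is the idea boiled down to a single row of pixels in Python. This is neither ETS nor anything from Gimp-Print, just the textbook one-dimensional version, with a made-up function name: each pixel is thresholded, and whatever was rounded away is carried into the next pixel so the local dot density tracks the input.

```python
def error_diffuse_row(row, threshold=0.5):
    """Halftone one row of gray values in [0, 1], carrying the rounding error forward."""
    dots, carried = [], 0.0
    for value in row:
        wanted = value + carried            # desired ink including leftover error
        dot = 1 if wanted >= threshold else 0
        dots.append(dot)
        carried = wanted - dot              # pass the quantization error to the next pixel
    return dots


flat_quarter_gray = [0.25] * 16
pattern = error_diffuse_row(flat_quarter_gray)
```

Because the decision at each pixel depends on the actual image values just before it, the dots can follow an edge or a thin line, whereas a fixed threshold mask places its dots without regard to local detail.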

Color is a lot trickier. I've found Gimp-Print to be about as well tuned as one can expect from hand-tuning and visual inspection. However, I think moving to "real" color management using a spectrophotometer will result in significant improvements. One area, in particular, that I think will improve dramatically is the consistency between the various modes and resolutions. If you look at Gimp-Print's 1440x720 and 720x720 modes on an Epson 870, they don't match very well. The latter also has some color management anomalies: the transition between light and dark inks seems to have notably less color saturation than it should.

My experiments with Argyll have been very positive, but more work is needed, especially to make the process easier for mortal users (my experimentation involves lots of arcane command lines and hand-editing of raw spectro data files). As is common for me, I feel like I have a pretty good idea how to do it, but it's hard to find the time. Hopefully, we will be able at some point to justify more inkjet work as part of the Ghostscript project, but until then it will go at the more usual free software pace :)

One solution to this problem would be to recruit an apprentice. I'm open to this idea, but haven't done much in the way of active recruiting. It would be a fairly good deal - the job of "color apprentice" would bring with it lots of goodies, including toys (printers and color measuring equipment), a damn good education in color science, help with getting academic recognition for the work (published papers and credit towards a degree), and a modest stipend. If you know of anybody, put us in touch :)

In any case, I'm very impressed with gimp-print's accomplishments, and am pleased that we're able to share code and other results. This is the true spirit of free software.


I'm starting to really like Python. In fact, I've basically decided that all my "fun" programming is now going to be in Python, or C with Python wrappers.

Python, I think, is one of the few languages that lives up to the promise of huge productivity gains. One of the major wins is having lists, maps (hash tables), and strings right there under your fingertips. Another major win is a reasonably clean object framework. Yet another is a sane module and namespace organization.

At the same time, Python still has some serious practical problems. Among them is the fact that there is no one killer GUI toolkit. If you want stability and portability, there's Tkinter, which aside from those two virtues sucks pretty hard. wxPython seems to be picking up momentum, but it's still fairly immature, and it's also not clear that you get as good access to the underlying toolkit when there are three layers involved. PyGtk seems quite cool, but is not exactly portable. There are also the usual fit and polish problems of trying to run Gtk+ apps on other desktops, but those are by no means Python's fault.

Even so, there is no other language out there with a better GUI story than Python's, and some quite a bit worse (cough, Java, cough). With luck, at least one of the Python GUI toolkits will mature to the point where it becomes a truly compelling application platform.

Dare I mention it, but there's also the speed issue. By my quick benchmark, Python is about 200 times slower than optimized C for vanilla integer-and-array work. There's something more than a bit disconcerting about realizing that your 900MHz laptop has just become the equivalent of a 4.5MHz machine. Obviously, if you really care, you code the speed-critical bits in C, but I find this less than fully satisfying.

What I think has gotten me newly excited about Python is the realization that a lot of these problems simply reflect the relative immaturity of the platform, and will almost certainly be fixed over time. The core language is simple enough that it is realistic to expect that it will be implemented at least modestly well. If Python continues to mature as I expect it will, then it will become a powerhouse of free software.

If you're a Lispish type, you'll find this essay to be quite enlightening. In it, author Erann Gat describes how he "lost his faith" in Lisp, and is now happily hacking Python at Google.

One final thought: the primary reason that Tcl became popular is the fact that it was packaged with Tk. The fact that this was even possible at the time is a glowing testament to how far we've come since then.

extern inline

I just found out that "extern inline" is a complete disaster. It's present in both gcc and C99, but with opposite meanings. See the gcc status page, this bug-hurd ml post, this lkml post, and this bug-glibc post for more info.

I really want to have a portable way to specify inlining. Currently, the Ghostscript codebase uses macros heavily, to the detriment of readability and debugability. Inline functions would be a great way to unsnarl some of this mess, without sacrificing speed.

We can't possibly be the first project to have run into this problem. I know the Linux kernel uses "extern inline" extensively, but it's fairly nonportable to non-gcc compilers. Has anyone out there solved the problem of making inline functions a viable, portable alternative to macros? Anyone trying to read the GS sources will thank you!


I went to a talk by John Mitchell on his trust management work on Monday. It's somewhat interesting stuff, but very different from my own work on trust metrics. Basically all of the "trust management" literature is vulnerable to a failure of any single "trusted" node. To me, this means that the security of any trust management deployment scales inversely with the number of users. Since a lot of the motivation behind trust management over simpler systems such as access control lists is to make things more manageable as they scale, I have a feeling that "trust management" will remain primarily interesting to academics for some time yet.

In any case, the problems they're struggling with now are pretty much the same as the ones I struggled with during my internship under Matt Blaze at AT&T in the summer of 1996 - discovering credentials, being able to analyze the algorithms, managing hierarchical namespaces. It's important to publish your work, dammit! Thus, I have some new motivation to finish my thesis.


I've put the sources up on casper. Run "./rebar testproj/" to test. It will build an executable stored in /tmp. The code should be interesting to read, but it isn't functional enough to use yet. (no pun intended, of course)

Thanks to the people who responded to my post, and sorry if I haven't replied. Yes, one of the key ideas is memoization across invocations. This is indeed similar to what compilercache does, but I believe that the idea goes back farther, at least to Vesta and probably before.
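For the curious, memoization across invocations can be sketched in a few lines of Python. This is not rebar's code (and not how compilercache or Vesta actually store things). The `BuildCache` class, the JSON file, and the fake compile step are all assumptions made for the example. The key is a hash of the command plus the contents of its inputs, and the table is written to disk so a later run can find it.

```python
import hashlib
import json
import os
import tempfile


class BuildCache:
    """Persist build results keyed by a hash of the command and its input contents."""

    def __init__(self, path):
        self.path = path
        self.table = {}
        if os.path.exists(path):
            with open(path) as f:
                self.table = json.load(f)

    def key(self, command, inputs):
        h = hashlib.sha256(command.encode())
        for name in sorted(inputs):
            h.update(b"\0" + name.encode() + b"\0")
            h.update(hashlib.sha256(inputs[name].encode()).digest())
        return h.hexdigest()

    def run(self, command, inputs, action):
        """Return (result, was_cached); only call `action` on a cache miss."""
        k = self.key(command, inputs)
        if k in self.table:
            return self.table[k], True
        result = action(inputs)
        self.table[k] = result
        with open(self.path, "w") as f:
            json.dump(self.table, f)
        return result, False


# Each BuildCache(...) below stands in for a separate invocation of the build tool.
cache_file = os.path.join(tempfile.mkdtemp(), "cache.json")
fake_compile = lambda inputs: "object code for " + inputs["main.c"]

first = BuildCache(cache_file).run("cc -c main.c", {"main.c": "int main(){}"}, fake_compile)
second = BuildCache(cache_file).run("cc -c main.c", {"main.c": "int main(){}"}, fake_compile)
third = BuildCache(cache_file).run("cc -c main.c", {"main.c": "int main(){return 1;}"}, fake_compile)
```

The second "invocation" finds the result on disk and skips the work, while the third sees different source contents, gets a different key, and compiles again.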

Anyway, it continues to be interesting, and I wish I had more time to work on it.


I have nice happy feelings about Ghostscript these days. One of the things I realized lately is that a lot of our recent improvements, while not rocket science, are things that will make things much nicer for users. On this list I'd include autoconf, IJS, the recent fix for the bug which caused almost all Type1 hints to be trashed in pdfwrite, replacing the completely broken x11alpha device with a mode that does real antialiasing (which has existed for a while in Ghostscript, but was apparently almost never used), accepting more broken PDFs, etc.

Of course, I love the rocket science, and am really looking forward to doing some hardcore 2D graphics work over the next few months, but this other stuff is important too. Ghostscript has long had a reputation for being difficult and unfriendly. I think that's about to change.


I did some fun work on jbig2dec over the past few weeks, as well. It now passes an important milestone: it decodes some of the freely available test files.

There's a weird bug somewhere, though. The code exists to decode symbol dictionaries, but on all the test files, it seems to be able to decode a few hundred symbols, then goes haywire. It's very, very difficult to track down: it is impossible for a human to look at an arithmetic encoded bitstream and make any sense of it. I sent an email to the authors of the test files asking for help (traces from their encoder or decoder would help enormously) but haven't heard back from them yet.

Anyway, it's great that the project is live.


Another really fun project is rebar. Here, the goal is to make the build process for software a thing of beauty. The auto* tool hodgepodge gets the job done most of the time, for most Unix users, but nobody could ever accuse the thing of being beautiful.

So instead of thinking about features and also backwards compatibility with existing things such as make, I'm trying to distill the build process to its essence. One very powerful inspiration is the Vesta project, which takes the position that the build process is a functional program - the arguments are the source files and configuration parameters, and the result is the built package. This is a very different way of thinking about the world than make, which treats the build process very procedurally, basically scribbling in the directory where the source, object, and binary files are kept.

There are a ton of build system designs out there. One of the most important axes is: which language are the build scripts based on? In many, many cases, this language is one that already exists: scons and distutils use python, the original cons uses perl, ant uses xml+java, the original make uses Unix shell, auto* layers m4 and some other stuff on top of the shell, and so on. Some systems actually invent their own little language, such as jam.

What language is best? Obviously, using a real language such as python gives you a lot of power, but I also feel that it tends to bind the specification of the build process too tightly to the implementation. An important rebar goal is to decouple these as much as possible.

So the rebar design seems to be converging on a really simple functional programming language. Lists, maps ("hash tables"), strings, and files are first class data types. Your typical tasks of compiling and linking are built-in functions that map files to files. The language is dynamically typed. The fun part is that the implementation is lazy and implicitly parallel. This means that files don't get compiled unless they need to, and that you can get the equivalent of -j without having to do anything fancy in the build script.
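Since rebar's own syntax isn't shown here, the following is only a Python sketch of the two properties just described, with invented names (`Rule`, `build`) and string-valued "files". Laziness is modeled by building only the rules reachable from the requested target, and the implicit parallelism by running every rule whose inputs are ready in a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor


class Rule:
    """A named build step: a function from its dependencies' results to a result."""

    def __init__(self, name, action, deps=()):
        self.name, self.action, self.deps = name, action, tuple(deps)


def build(target, jobs=4):
    """Build `target`, returning (its result, sorted names of everything built)."""
    # Lazy part: collect only the rules the target actually depends on.
    needed, stack = {}, [target]
    while stack:
        rule = stack.pop()
        if rule.name not in needed:
            needed[rule.name] = rule
            stack.extend(rule.deps)
    # Parallel part: run each wave of ready rules concurrently.
    done = {}
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        while len(done) < len(needed):
            ready = [r for r in needed.values()
                     if r.name not in done and all(d.name in done for d in r.deps)]
            if not ready:
                raise ValueError("dependency cycle")
            futures = {r.name: pool.submit(r.action, *[done[d.name] for d in r.deps])
                       for r in ready}
            for name, future in futures.items():
                done[name] = future.result()
    return done[target.name], sorted(done)


src_a = Rule("a.c", lambda: "int a;")
src_b = Rule("b.c", lambda: "int b;")
obj_a = Rule("a.o", lambda src: f"obj({src})", [src_a])
obj_b = Rule("b.o", lambda src: f"obj({src})", [src_b])
docs = Rule("docs", lambda: "never requested, so never built")
prog = Rule("prog", lambda x, y: f"link({x}, {y})", [obj_a, obj_b])

result, built = build(prog)
```

The build script never mentions threads or ordering. The two compiles are independent, so they land in the same wave, and `docs` is skipped because nothing asked for it.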

The current prototype implementation is written in Python, and is now capable of interpreting the core language and compiling simple projects, with autogen'ed code and autoconf-style tests, as well. I've implemented -j, but there are two big pieces that haven't been done yet: first, assigning reasonable names to the generated files, and second, skipping the compilation if it's already been done (in a previous invocation). But neither of these is fundamentally difficult.

Python amazes me for its concision. The current prototype is all of 900 lines of code, yet it contains a lexer, parser (recursive descent), core language interpreter, and parallelizing process spawner.
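To give a flavor of how little code those pieces need, here is a toy lexer, recursive descent parser, and evaluator in Python. The grammar (strings, lists, and function calls) is made up for this example and is not rebar's actual language, and `compile` and `link` are stand-in functions that just shuffle strings around.

```python
import re

TOKEN = re.compile(r'\s*(?:(?P<name>[A-Za-z_]\w*)|"(?P<string>[^"]*)"|(?P<punct>[()\[\],]))')


def lex(text):
    """Split text into (kind, value) tokens: names, quoted strings, punctuation."""
    text, pos, tokens = text.rstrip(), 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


def parse(tokens):
    """expr := STRING | '[' items ']' | NAME '(' items ')'"""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def take(kind, value=None):
        nonlocal pos
        k, v = peek()
        if k != kind or (value is not None and v != value):
            raise SyntaxError(f"expected {value or kind}, got {v!r}")
        pos += 1
        return v

    def items(closer):
        out = []
        if peek() != ("punct", closer):
            out.append(expr())
            while peek() == ("punct", ","):
                take("punct", ",")
                out.append(expr())
        take("punct", closer)
        return out

    def expr():
        kind, value = peek()
        if kind == "string":
            return ("str", take("string"))
        if (kind, value) == ("punct", "["):
            take("punct", "[")
            return ("list", items("]"))
        name = take("name")
        take("punct", "(")
        return ("call", name, items(")"))

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError("trailing tokens")
    return tree


def evaluate(tree, functions):
    """Walk the parse tree, applying the supplied built-in functions."""
    if tree[0] == "str":
        return tree[1]
    if tree[0] == "list":
        return [evaluate(t, functions) for t in tree[1]]
    _, name, args = tree
    return functions[name](*[evaluate(a, functions) for a in args])


builtins = {
    "compile": lambda src: src.replace(".c", ".o"),
    "link": lambda objs: "prog <- " + " ".join(objs),
}
tree = parse(lex('link([compile("a.c"), compile("b.c")])'))
result = evaluate(tree, builtins)
```

Each grammar rule becomes one small function that calls the others, which is why recursive descent stays compact in a language with closures and tuples.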

At some point, I'll want to include more people. The free software mantra is, after all, "release early and often". But at this point, I'm much more interested in attracting people with similar goals regarding the beauty of the end product, and the knowledge and experience to understand what I'm trying to do. My choice of a lazy functional language as the basis for rebar is extremely unlikely to be appreciated by the pimply-faced Slashdot hordes, and I don't want to be deluged with "how do I get program X to compile with library Y on Z Hat Linux?" questions.

Friends and family

I spent a few hours on Monday with zooko, his wife Amber, and their child Irby. It was great fun - we talked about lots of interesting things. Heather and the kids got a chance to meet them as well, which was great.

Alan continues to thrive on the Singapore Math workbooks. Reading them over, I can see why; they're really well done. I can recommend trying them to any parent of a gifted child.

Max's language development is proceeding at an incredible pace. A few weeks ago, I noticed that he was conjugating verbs: "jump, jumping, jumped". Over the past two weeks, he's mastered possessives ("Maxie's, Daddy's"), compound nouns ("barn owl", as opposed to simply "fire" for fire engine last month), and complete sentences with subject and object ("put it here"). And he takes such delight in mastering every new thing!

Max is also really into Blue's Clues. I noticed today that one of the videotapes is starting to wear out. It's an interesting show for a "grow-up" to watch, because most of it is animation that's been composited with footage of just Steve, alone, in front of a bluescreen. (who's creen? blue's creen!) You imagine Steve staring and talking into space, pretending that there's actually a blue dog, a talking salt and pepper shaker, and similar other stuff there. He does a good job of it, though :)

Found at the always interesting Hack the Planet: an essay by Paul Graham (of "On Lisp" fame) on taste.

12 Feb 2002 (updated 12 Feb 2002 at 08:29 UTC)

Not to keep people in too much suspense, the missing glyphs are J, U, W, and w. nomis came closest, and I tend to agree - the U could stand a little improvement.

In any case, it's deeply satisfying to hack on this font.

Singapore Math

Heather got the Singapore Math workbooks for Alan, on the recommendation of several other parents of gifted children. Well, for whatever reason, he loves them. This evening, he used wanting to do more Singapore Math as an excuse for pushing back his bedtime. I felt I was in one of those cheesy infomercials.

GPL viols

Now I've experienced firsthand the unpleasant, but all too common, practice of repackaging free software as Win32 shareware. This might even be legal if it were done with some care, but apparently the culture of scam artistry is too deeply ingrained for people to even try.

Yes, it pisses me off.


The recent thread on "what makes a powerful programming language" was one of the dumbest ever. Characteristically, the dumber the article, the more response. This phenomenon isn't unique to Slashdot, of course, but they do seem to choose this path fairly often, or else be awfully sloppy. (and what's the difference, really?)

Of course, everybody still reads Slashdot. One of these days, I'm going to get around to tweaking Advogato to try to encourage many more people to post news articles. Maybe there is hope for a reasonable alternative.


It looks fun, but I'm not sure I'll have time to attend. It is disappointing that the Chord guys don't seem to be represented. That's without question one of the most interesting things happening in that space.


I've always wanted to design a font. That's actually why I started gfonted. That project is now dormant, but it spawned gdkrgb and libart, so it was certainly interesting.

Anyway, when pfaedit crossed my radar screen, one of the first things I tried was tracing the font from a specimen sheet I had fallen in love with. The program is crude, let me tell you, but strangely usable anyway. Before too long, I had all of the letters in the original sample, uppercase and lowercase, drawn.

There are some letters which are standard in the alphabet now, but weren't standard when the specimen sheet was printed (only a bit more than a hundred years after Gutenberg). I would have to draw these myself.

So over the past few days, I did. I'm pleased with the results. None of the people I've shown it to have been able to guess which four letters I drew from scratch. Can you?


I had a dream last night that I was asking miguel about assemblies. Inspired by the dream, I stopped by #gnome and had a very nice chat with miguel, bjf, and others about the topic. I still haven't found a good document that explains them - bjf's advice was to play with Microsoft's implementation. In any case, I'm convinced that there are some very good ideas there. Most of the time, when we build larger systems out of components, we just throw the pieces together and cross our fingers. Sometimes it works, sometimes Murphy's Law reigns. The idea of assemblies is apparently much more systematic.

As an old-timer, the name tickles me a bit. The name "assembly" used to refer to assembly language, which is basically obsolete these days, so I guess Microsoft decided that the slot in the namespace was free. In any case, it evokes a much older time.

Miguel is certainly right about one thing: Microsoft has a lot of very smart people who are paid to think about difficult problems. To ignore their work, for any reason, is stupid. Embrace and extend!


Jbig2 implementations are starting to dribble out into the real world. We had a customer file come in recently that was created by Cvision's Cvista product. I hacked around a bit yesterday on jbig2dec, which is our (GPL) decoder project. It's fun stuff, and I hope we decide to ramp up the development on it. rillian is the main developer on it, but he hasn't had too much time for it either.

