Older blog entries for raph (starting at number 298)

Ghostscript

8.00 is out!

Other things

Obviously, I haven't felt like blogging much the past few days. I've had a bit of a sore throat, which ordinarily wouldn't be so bad, except that I'm pretty sure it makes my breathing worse while I sleep.

I have been hacking a little on the new Fitz code. That's fun.

Congrats

Late but heartfelt congratulations to David McCusker for his new job at OSAF. Sounds like he'll have a lot of fun there and put his skills to good use.

Thanks to everybody who expressed concern over my last entry. Things aren't actually all that bad (and, in fact, show signs of getting better), but I wanted to write something rather than just keeping quiet.

Ghostscript

The 7.33 beta is out, and will be at the usual download sites soon. If there are no real problems, 8.0 should be out within a couple of days.

Another thing that's soaking up my time is writing a paper on my Even Toned Screening for the Electronic Imaging '03 conference. I'm pleased with the way it's turning out, and will post a copy tomorrow.

Googlejuice and images

Heather and I have both seen an uptick in image traffic, mostly from our family website. We're also getting a trickle of requests to use the images. Looking at the server logs, it's obvious that all this new traffic is coming from Google Image search.

I think this is because of a combination of high Googlejuice and the fact that I tend to name pictures descriptively. We're on the first page for these search terms: pinata blue-whale max strawberry happy easter-egg lily, and pretty high up for nursing birth grampa penguin yawn. In fact, the picture of Max yawning can be found on numerous bulletin boards as the symbol of a boring topic :)

IBM battery

IBM has had over three weeks to respond to the feedback I sent them on the battery problem, and the "powerhouse of a reputation" language is still there on their e-commerce site. Meanwhile, I get a steady stream of emails from other people who have experienced battery problems. I take some satisfaction that my page is #1 on Google for "thinkpad 600 battery".

Based on their lack of response, I'm no longer recommending IBM laptops. It's a shame, because they're otherwise pretty good. I'm also not sure there are any laptop manufacturers that are better.

Some good experiences

But not all companies hang their customers out to dry. I've had absolutely excellent experiences with Crucial memory and Sears/Kenmore's replacement parts service. In both cases, the website led me directly to the part I wanted to buy, the price was reasonable (if not rock-bottom), and the item was shipped quickly. In fact, I ordered the RAM upgrade from Crucial on Friday night, and it arrived Monday morning.

Berkeley and archival URL's

Berkeley's official policy is to revoke Web accounts by default when EECS students leave. The short lifetime of URLs bothers me; I feel it contributes to the "digital dark age" (a nice turn of phrase from the Internet Archive). I expect slightly better from our universities, which I feel should act as guardians of our cultural artifacts rather than merely as providers of services.

I'm inclined to raise a stink about this (including bringing it to the department). In any case, please use long-term stable Web space if it's available, rather than the transient hosting provided by Berkeley and other institutions with similarly short-sighted policies.

Life

Life is difficult just at the moment. The details don't belong in a public diary, but I'm willing to talk about it in person.

A consequence is that I've been even less responsive than usual to people. Apologies, as there are lots of interesting things going on. I'd love to look into Quixotic, respond to sej's interview questions, get out a proposal for a new IJS spec, and so on, but all these things will have to wait until the time is ripe.

So, in the meantime, I'm mostly working on things that are urgent for work (we're slamming down bugs for Ghostscript 8.0 and hope to release soon) and things which are soothing for my soul. Norm Megill quotes Rudy Rucker as saying that "Set theory can be viewed as a form of exact theology." I think that's one reason it's so appealing right now: life seems very inexact at the moment. I've been having a very nice discussion with Norm, and have been exploring some ideas regarding definitional frameworks. I thought I had it all figured out, but when I started looking closely at free variables in the definitional expansions, my first idea broke down. At least I now know what the problem is!

Fitz

One of the other things I've been spending a little time on recently is the Fitz prototype. The most recent bit of code is a compact representation for Bezier paths, because the Libart one is pretty damn bulky.
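
To give a flavor of what "compact" means here (this is just an illustrative sketch in Python, not the actual Fitz code, which is C): instead of a fat struct per path element, pack the path into a flat string of opcode bytes plus a packed coordinate array.

    import array

    MOVETO, LINETO, CURVETO, CLOSEPATH = b'M', b'L', b'C', b'Z'

    class CompactPath:
        def __init__(self):
            self.ops = bytearray()           # one opcode byte per path element
            self.coords = array.array('f')   # packed coordinates, no per-node overhead

        def moveto(self, x, y):
            self.ops += MOVETO
            self.coords.extend((x, y))

        def lineto(self, x, y):
            self.ops += LINETO
            self.coords.extend((x, y))

        def curveto(self, x1, y1, x2, y2, x3, y3):
            self.ops += CURVETO
            self.coords.extend((x1, y1, x2, y2, x3, y3))

        def closepath(self):
            self.ops += CLOSEPATH

One byte string and one float array for the whole path, versus a heap node per element; that's roughly the kind of savings I'm after.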

I think what I'll do is get the new prototype to the point where it can render basic paths, text (using FreeType), and images, then release it. It's also important to document the design, but I haven't had strong motivation for that recently.

Single signon

There was a fairly contentious discussion on #p2p-hackers about the single signon idea I blogged recently. Aaron Swartz said that he couldn't decide if it was evil or misguided.

In any case, my initial enthusiasm has faded somewhat. I'm feeling overloaded enough as it is, and this wouldn't help. The coding would be fun, but I imagine that dealing with clueless blog engine authors would not.

Does the blog world really want a spam-resistant authentication mechanism for blog comments? I certainly want to get a good reading on this before I commit any serious time to implementing it.

Proofs

I've been tinkering with a Python verifier for Metamath for a few weeks, and finally got around to implementing the distinct variable constraints. It's under 300 lines of code, which I find very satisfying.
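
The heart of the distinct variable check is tiny. Roughly (a simplified sketch, not the code I actually wrote; the names are mine): when a theorem is applied via a substitution, every pair of variables the theorem declares distinct must map to expressions sharing no variables, and every resulting cross pair must itself be declared distinct in the current frame.

    from itertools import product

    def check_distinct(dv_pairs, subst, variables, frame_dv):
        # dv_pairs: pairs of variables the applied theorem requires to be distinct
        # subst: maps the theorem's variables to the symbol lists substituted for them
        # variables: the set of symbols that are variables (everything else is a constant)
        # frame_dv: the $d pairs in force in the proof being verified
        for x, y in dv_pairs:
            xs = [s for s in subst[x] if s in variables]
            ys = [s for s in subst[y] if s in variables]
            for a, b in product(xs, ys):
                if a == b:
                    return False  # the same variable ended up on both sides
                if frozenset((a, b)) not in frame_dv:
                    return False  # distinctness isn't guaranteed in this frame
        return True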

I find the whole thing a bit magical. Those 300 lines of code, plus a couple dozen axioms, effectively give you the building blocks for all of mathematics. Quoting from the Metamath book:

There you have it, the axioms for all of mathematics! Wonder at them and stare at them in awe. Put a copy in your wallet, and you will carry in your pocket the encoding for all theorems ever proved and that ever will be proved, from the most mundane to the most profound.

I personally feel that it's not quite complete, largely because you still have to know what you're doing to use the definitional framework. I have some ideas on this (surprise, surprise!)

Be warned, the code is a bit rough. I feel like I should add some comments and polish it some more. I've sent it to Norm Megill, and am eager to hear what he thinks of it.

Ghostscript

The 7.32 beta release is now official, with Windows and Macintosh builds as well. Hopefully, this means that 8.0 is imminent.

Thanks to everyone who's been helping with this, especially my colleagues at Artifex Software. It feels really good to be part of a team.

Vote

If you're in the US, please vote today. Especially if you were inclined not to vote, please consider voting for your Green candidates. The California governor's race is particularly ugly. From the major parties, we have an incumbent who's proved excellent at finding the line between merely aggressive fundraising and outright bribery, and a challenger whose inept campaign is hopefully not a predictor of his performance in office should he win, not to mention his closeness to fraudulent practices in his family firm.

By contrast, Pete Camejo is somebody you can vote for without having to hold your nose. A strong showing for him will send an important message that many people are not satisfied with the status quo and want something better. In addition, he has a lot of good experience with finance and ethical issues in particular, which suggests he would make a good governor in the extraordinarily unlikely event he wins.

Davis is sufficiently ahead in the polls that you should feel comfortable voting for Camejo even if the thought of a Republican in the governor's office disturbs you. Even so, I'm more inclined to vote for the best candidate than the lesser of two evils in this election.

Please consider your Green candidates in local races as well. Green candidates can and do win offices at the local level, and these are often positions where they can do some good.

But no matter what your political beliefs, vote. It's how our democracy, however flawed, works.

Proofs and Types

chalst: Hrm. Unfortunately, Girard's book is just a little over my current threshold for accessibility. At this point, I also really prefer Web-available resources, largely because I am doing this as a weblog and want to encourage other people to follow along.

I know I've just labeled myself a lazy student, but I'm wondering if you could give me some hints nonetheless. You mention Z_2-style and Coq-style consequence relations. What is the big difference between them? I'm not used to thinking about logical systems in terms of the consequence relation, but do know what the term means. Of course, I know one difference is that Coq is intuitionistic, but I think that's not the main thing you're talking about.

I'm also having a little trouble grasping the idea of a lambda calculus with the same strength as Z_2. Is the trick to write functions as finite strings, analogous to a computer program to compute the function? This way, you quantify over computable functions rather than the imaginable universe.

PS: chalst: Michael Norrish has some notes he'd like to send you, but doesn't have a working email address.

I don't feel like I have much extra time to write, but I've been thinking of doing a longer essay on some of the controversies in mathematics, especially as they're relevant to proof systems. The most important ones are logical strength (from primitive recursion to ZFC, with at least Z_2 and HOL as way-points, and specialized need for systems past both ends), types vs. no types, and partial functions vs. total functions. In the "moderately important" class I'd put classical vs. intuitionistic logic. The philosophical issues don't do much for me, and it seems like it's almost always possible to work around the intuitionism when you need to. Finally, in the least important category, I'd put choice of primitive operators and axioms for the propositional calculus. They're all quite equivalent, but that doesn't stop people from arguing their favorite.

Hmm, I wonder whether this is a reasonably complete list, or if there's another big one I'm overlooking.

Trust metrics for spam-free blog comments

Dave Winer talks about a large-scale single sign-on system for posting comments and replies in blogs. In this entry, he mostly talks about the convenience factor of not having to type in passwords and other information all the time.

At the same time, there's a lot more awareness of the vulnerability of comment and reply mechanisms to spam, trolls, and other forms of abuse. So far, the blog world has been quite civilized, but as spammers become tuned into the prestige and popularity of blog placement (not to mention their role in boosting Googlejuice), this will undoubtedly begin to change. Heather likens it to the Internet pre- and post-AOL.

I just realized today that both problems can be solved quite easily using the trust metric technology that powers the diary rating system on Advogato. Blog servers, acting as clients to this new service, would take a cookie from the user's browser (probably using Passport-style techniques), pass it to the service, and get an auth token back, along with a one-to-ten ranking if desired. This way, anyone who's hooked into the blog world can post comments and replies, but abuse would be limited.
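
To make the shape of this concrete, here's roughly what the blog server's side might look like (only a sketch; the service URL, field names, and reply format are all invented):

    import json
    import urllib.parse
    import urllib.request

    TRUST_SERVICE = "https://trust.example.org/check"   # hypothetical endpoint

    def check_commenter(cookie):
        # Forward the visitor's cookie to the trust service; get back an auth
        # token plus a one-to-ten ranking computed by the trust metric.
        data = urllib.parse.urlencode({"cookie": cookie}).encode()
        with urllib.request.urlopen(TRUST_SERVICE, data) as resp:
            reply = json.loads(resp.read())
        # reply might look like {"ok": true, "token": "...", "rating": 7}
        return reply.get("ok", False), reply.get("token"), reply.get("rating", 0)

The blog engine would accept the comment only if the service vouches for the poster, and could collapse or hide comments below some rating threshold.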

In a dream world, we would have a decentralized peer-to-peer identity service based on a cryptographically sound protocol, but in the meantime there are a lot of people implementing blogs with very crude authentication. I think it will be possible to hack something together quickly.

Once the infrastructure is in place, I can see it being useful for a bunch of other things, including authenticating backlinks, which are also spam-vulnerable. And, once the social net of bloggers starts becoming reified into a net-accessible graph, the stage is set for evolution to a more decentralized system. It seems like a good way to get rapid adoption of the trust metric ideas.

I discussed the idea with Bram and others on #p2p-hackers, and he had some good ideas about how to make a simple challenge-response protocol that would be both easy to implement over the existing infrastructure and reasonably secure against the most common attacks such as password sniffing. He'll probably want to post about that himself.

Kids

A couple of days ago, Alan was trying to gross us out at the dinner table, which is of course a normal six-year-old thing. But, being Alan, he tried to convince us that he had "bee larvae" on his plate.

Max really loves his "puter games" and is quite good at running the computer. In particular, he can usually quit out of a game by himself, which is even more impressive considering the atrocious consistency of keybindings in the downloadable Mac OS X games. He seems especially good at reading icons and other visual elements. For example, just today, he got some stickers trying to convey "no sweat" by the international "no" circle overlaid on some water drops. He didn't get it exactly, but we were pretty impressed by his interpretation as "stop raining".

We were talking about blogging yesterday, and together singing fragments of the Kipper theme song, with the words "Kipper the blog".

Proofs

chalst: Thanks for your highly relevant comments. Let me see if I can answer.

"You don't need to encode first-order terms as singleton second-order variables..." Right. Actually, thanks for reminding me about the need to quantify over things like total functions. I was pretty much only thinking about building them up as terms, for which the proof that the function is total follows directly from the structure. For quantifiers, probably the easiest technique is to put this predicate into an assumption.

"...you may not be respecting the distinction between the set of total functions and the set of function that SOA can prove total." This is entirely possible. Presumably, the important difference is that you you quantify over the total functions, but when you build up from terms, you're restricted to the provably total. I'm not sure yet how this might bite me.

"Finally, you need to check your translation actually captures the properties you are after." Actually, I may take the heretical position that this is not necessary. You may want to allow the translation to have essentially arbitrary properties. Of course, in many cases there will be a meaningful translation back from target language to source language, and it may be worth capturing that formally.

Your point about a rich set of datatypes is well taken. Basically, what I'm trying to figure out is whether such a set of datatypes can be mapped to a simple Z_2 axiomatization using a straightforward definitional scheme. I'm not even sure that growth in the size of propositions matters much: as I said, in most cases, you won't be doing the translation explicitly.

What's the best source to read about these things, especially the polymorphic lambda calculus systems with the same provability strength as Z_2?

Google

I poked around at Google Answers a bit. It seems like a useful service, but I don't see any reason to get excited about it. There are two features I might have hoped for.

First, I don't see that the Answers provided help, or even particularly influence, the results of ordinary searches. This is disappointing, because it seems to me that Google's trust metric has the best chance of working when there's high quality manual input, especially regarding the trusted "seed". It's entirely possible that Google has two completely disjoint sets of people manually reviewing Internet sites (one set for Answers, another for deciding the seed), but that they don't talk to each other. I'm not sure.

The second missing feature is a bit harder to define, but it has to do with whether Answers is a true community. There are glimmerings of this. For example, having all the questions and Answers posted, along with comment posting open to all users, is cool. Many of the questions are interesting, and I can see people hanging out there. Another pro-community sign is that ordinary users and Researchers share the same namespace in the account/login system.

Most importantly, a lot of the Researchers seem to know each other, so there is something of a community there. But users don't really get to take part in it, even when (or perhaps because) they buy in monetarily. Part of the problem may be the way identities are concealed.

If Google (or somebody else) figured out a way to build a real knowledge-seeking community, with money lubricating the system, I think they'd have something big. As it is, they have a reasonable service with a solid-gold brand behind it.

I feel like about 150% of my usable time is spoken for these days. But I am getting through the work. Hopefully, the load will lighten a bit soon.

Ghostscript

I just sent a patch to code-review adding 0 and 45 degree angle cases to Well Tempered Screening. This is the last of the 8.0 release-critical features, so hopefully we'll be able to do a release soon. I expect to do a 7.32 beta tomorrow to let people experiment with WTS.

A nice bug image resulted from tonight's debugging. I don't really have an explanation for it. It's supposed to look like this.

Proofs

I think I can answer the question I posed last time about how to transform second order arithmetic with function applications into a "pure" axiomatization. I'll try to sketch it here.

Every production in the source language grammar has a translation into the target language. In particular, every (numeric) term in the source becomes a singleton set in the target. Constants and variables just become {x} or {c}. The interesting case is function application, which looks something like this ([term] means the translation of source-language term into the target language):

[f(a)] = {y | \exists x. x \in [a] ^ pair(x, y) \in [f]}

We can define pair(x, y) as (x + y) * (x + y) + x. There are other choices as well, but that's probably the simplest.
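
A quick sanity check (not a proof, obviously) that this pairing term is injective over a small range:

    def pair(x, y):
        return (x + y) * (x + y) + x

    seen = {}
    for x in range(200):
        for y in range(200):
            p = pair(x, y)
            assert p not in seen, (seen[p], (x, y))   # would fire on any collision
            seen[p] = (x, y)
    print("no collisions for 0 <= x, y < 200")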

In order to translate proofs, you need to establish that every (translated) term really is a singleton, and that every function is total. I think these proofs can be generated straightforwardly, because their structure follows the parse of the source language expressions.

The above assumes we have a definitional framework for set comprehensions. I see two approaches. First, instead of having a comprehension axiom, you could just allow comprehensions to be written, as above, for set-valued terms. The fact that all terms are assumed to exist implies the usual comprehension axiom.

Another approach, more consistent with the usual theory, is to transform set-valued terms into "predicates with a hole" (here, [x := v] foo means the substitution of v for x in foo; I'll gloss over the technicalities of substitution, which can be quite tricky).

[{x | phi}] = [x := hole]phi
[a \in S] = \exists x. x \in [a] ^ [hole := x] [S]
[S0 = S1] = \forall x. [hole := x] [S0] <-> [hole := x] [S1]

I expect you rarely actually do the transformation into the target language; you'll usually do all the steps in the source language. But having the relationship formalized, I feel, is important. Among other things, it should let you freely intermix theorems over the same fundamental axioms, but with different definitional frameworks. This does not seem to be a feature of most proof systems, which in general treat definitions as new axioms, usually with some meta-level justification that they're conservative.

Why different definitional frameworks? Partial functions may be one good reason. In this world, terms are translated into sets which may be either a singleton or empty. Some things change; for example, \exists x. term is no longer a theorem, as it is in the total world. Even so, when translated into the target language, exactly the same things are provable. So, you should be able to do stuff like build a function in the partial world, prove that it's total, and then use it in the total world with little hassle.

Kids

We went trick-or-treating tonight, and it was fun - this year had a more social flavor than previous years. We knew a bunch of the people we met, and really visited at two houses. Plus, none of the decorations or costumes scared either of the kids, which I think is a first.

Alan went as a lion-headed dragon, and Max as a "dino-store" (his "s" is still soft). Their cuteness was appreciated by all, and the candy haul was substantial.

Googlism

Raph Levien is generally quite pleased with his googlism results.

Thinkpad 600 battery

I sent a fairly strongly worded letter to IBM through their feedback mechanism. Here's my favorite excerpt:

Finally, I find this marketing statement galling: "Thanks to IBM's advanced power management systems, the ThinkPad's battery life has earned a powerhouse of a reputation." Perhaps this is true in the sense that Firestone tires on the Ford Pinto have earned a reputation with respect to safety, but I believe it is misleading and should be withdrawn.

blog, Advogato, mod_virgule

I'm really getting a lot out of writing this diary. It's becoming an absolutely integral part of my life. Here's an amusing anecdote: Heather learned that I had fixed the battery in her computer from reading this. My mom also follows my doings by reading this diary, which is important because I do a bad job keeping in touch through phone, mail, etc. (Hi Mom!)

I like the idea of slowly, gradually improving the mod_virgule code. There are a few things I consider important and want to add soon. First on the list is RSS export for diaries, because it seems like this is an increasingly important tool.

A lot of people keep both an Advogato diary and another, presumably for topics other than free software. This is fine (in fact, many bloggers seem to maintain multiple blogs), but I also want to make Advogato more useful for people who want to move in entirely.

So next is a <topic> tag for marking individual topics within an entry. This more structural markup will be rendered in several ways: in RSS (and maybe OPML) exports of diaries, and as outlines when diary entries are rendered as summaries. Finally, I'll put in a scoring mechanism for custom views, so individual topics can be expanded, condensed, or ignored entirely. I'm also toying with ideas for the ontology of such topics: maybe you'd type, say, Books/Science Fiction/Stanislaw Lem/Solaris, and it would render as just Solaris, but still feed into all the topic scoring filters. Not sure yet.
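
Roughly how I imagine the scoring working (a sketch only; none of this is implemented, and the names are invented): the most specific topic prefix a reader has scored wins.

    def topic_score(topic_path, prefs, default=0):
        # topic_path like "Books/Science Fiction/Stanislaw Lem/Solaris";
        # prefs maps topic prefixes to scores, e.g. {"Proofs": -2, "Books": 1}.
        parts = topic_path.split("/")
        score = default
        for i in range(1, len(parts) + 1):
            prefix = "/".join(parts[:i])
            if prefix in prefs:
                score = prefs[prefix]   # deeper (more specific) prefixes override
        return score

So a reader who scores "Proofs" at -2 would see those topics condensed or dropped, while "Books/Science Fiction" at +2 would get them expanded.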

To people discussing an "ignore" patch: why not just set the diary ranking below 3? This feature really was intended to subsume filtering at the grain of individual accounts.

The "friends" view a la LiveJournal is quite a bit more interesting, especially as that's another aspect of the social network. Two-way links for references to other diary entries would, I think, be another important step in the right direction.

djm: you're quite welcome.

fxn: I haven't looked into it in detail, but it sounds like it's probably a bug in mod_virgule's xmlrpc code. See also this post by Rogers Cadenhead.

Quakers

One of the topics that I haven't been writing about much is my Quaker faith. When the <topic> mechanism gets implemented, I will feel more free to write about it without worrying that it will throw off the main focus of my diary.

At quarterly meeting, I participated in an interest group on electronic communications. Most of the discussion was about Meeting-sponsored web pages, which are of course an important tool for reaching out to people. The College Park one (that I just linked) is nicely done.

Are blogs also a useful tool for expressing Quaker values and faith? Probably so. A quick Google search doesn't turn up any from the unprogrammed, liberal tradition of my monthly meeting.

I stood up today and gave a message about what I'm finding when reading in the wider blog world about the Iraq conflict. I've found some insightful commentary, but very little of the writing from the liberal viewpoint shows deep critical thinking, and I find some of the more conservative writing deeply repellent. My quoting of Eric Raymond to the effect that our enemies are "to be dealt with as rabid dogs are" elicited an audible gasp, as well it should.

Proofs

I'm sure Heather and my mom will want to assign a negative score to <topic>Proofs</> :)

chalst: You say "moving between the two representations for Z_2 is pretty trivial (with the right definitional framework)." I take it this is the usual mathematician's meaning for "pretty trivial", which is roughly equivalent to "a small matter of programming".

In any case, arriving at the "right definitional framework" is exactly what I'm after. As you point out, pairing and representing functions as relations are easy. One of the trickier bits is showing that the axioms of reflexivity and transitivity for = work for terms including function application in addition to the primitive terms consisting of variables, 0, 1, +, and *. I'm not sure, but I think this needs to be proved as a metatheorem, which means effectively that it's hardwired into the proof system encoding Z_2.

Is there any place where the transformation between these two representations is written down explicitly? I get the impression that most mathematicians are content to show that you could do the transformation.

I'll reiterate my biggest open question. We agree that a Z_2 with functions as part of the definitional framework is a lot more usable than one expressed purely in the "predicates over naturals" axiomatization. Does there exist an even more usable definitional framework? How would you characterize this? Usability is a pretty slippery concept.

I don't have answers, but I'm starting to get a little bit of a feel for the terrain. I think a very important quantitative test for the expressive power of a proof system is the number of proof steps required to crank through an instruction in a simple CPU-like virtual machine. Over a pure axiomatization such as Z_2 or ZFC, my sense is that three orders of magnitude is about right.

If this expansion factor is suitably small, then a reasonable approach for some things is to run a program. Let's say we've already proved a theorem of the form, "for all runs of program P over input x, the output y satisfies such-and-such predicate(x, y)". Then, for a specific input x0, mechanically prove that the output of program P is y0, apply modus ponens, and you've proved that such-and-such predicate(x0, y0) holds.
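
Schematically (with invented names; runs and pred stand in for whatever the real predicates turn out to be):

    1. \forall x, y. runs(P, x, y) -> pred(x, y)    (the general theorem, proved once)
    2. runs(P, x0, y0)                              (proved mechanically, one VM step at a time)
    3. pred(x0, y0)                                 (instantiate 1 at x0, y0; modus ponens with 2)

Step 2 is where the expansion factor matters: it's pure cranking, so it only has to be cheap, not clever.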

As I've mentioned before, I expect this kind of "proof strategy" to be useful for things like numeric calculation, parsing, etc.

I have more thoughts on this, but they're still a little unformed. A teaser: theorems of the form calc("2+2") = atoi("4").

Random thought

The raw capacity and power of personal computers has been growing at an amazing pace, over an equally impressive span of time. Even so, they're probably still considerably smaller than human brains. Mainstream estimates for the latter are 100 billion neurons, and an average of somewhere around 10,000 inedges for each neuron.

But you hook these personal computers up through the Internet, and all of a sudden the gap seems a lot smaller. If the Internet really wanted to become sentient, it probably could.

bytesplit

bytesplit: I mean you no ill will. However, by now it has become painfully clear that you and Advogato are not a good match for each other. I ask of you, please find another place for your writings.

Thinkpad 600 battery

Heather's got the 600 now. Yet another battery was tanking - the useful life was down to about 10 minutes. So I tried a technique suggested by Javier Valero, and so far it seems to work.

I've put up a page summarizing the ThinkPad 600 battery problem and possible solutions.

Proofs

I have lots more feedback than I can digest right now. But I will point out Robert Solovay's interesting reference to two flavors of second-order arithmetic. One is the "pure" Z_2 axiomatization with naturals, predicates over naturals (or functions from naturals to bool, if you prefer), and quantifiers over both. The other, with equivalent strength, has functions from naturals to naturals. The latter is probably a lot more directly useful for what I'm trying to do, because new functions can be defined as plain old lambda terms, and application is primitive, i.e. requires no hoops to be jumped through.

I'm more than a bit intrigued, though. Since these systems are indeed equivalent in logical strength, but differ widely in the convenience of their definitional systems, it's clear that choice of definitional framework is very important. How carefully has this space been mapped out, though? Is it possible that there are other definitional tricks that could make the systems even more usable, or is macro expansion plus function definition and application somehow complete in that all other definitional frameworks can be reduced to it tightly?

I'm not sure what such things might look like, but I see glimmerings. For one, I'd imagine that statements of the form "abstract tree T is a parse of string S by grammar G" might be easier to express and prove than in the usual logics. In any case, these statements are exactly the kind of thing I'd expect to be primitive in a human-friendly "source language", and then compiled automatically to a proof "object code", so maybe in the end it doesn't much matter.

Paul Snively has been blogging about proofs as well. I've skimmed his references, but so far am not sure how relevant they are.

I haven't felt much like interacting with anybody the past few days. It's probably the days getting shorter, and it will certainly pass.
