30 Nov 2002 raph   » (Master)

Proofs, or: Set theory for dummies

I think I've worked out a simple, yet powerful, definitional framework for Metamath. I had the basic idea a few weeks ago, but ran into a technical problem with capture of unbound variables (not surprising that this is where the pitfalls lay!). I think I've solved it now.

I'm trying to do these definitions by textual expansion. In Metamath proper, the usual way to create a defintion is to introduce a new axiom. For example, here's the definition of ordered pairs, following Kuratowski:

df-op $a |- <. A , B >. = { { A } , { A , B } } $.

There's no syntactic difference between a new definition and a fundamental axiom such as ax-inf (the Axiom of Infinity). However, definitions are generally of a form which makes it clear they're not adding any axiomatic power.

Even so, I think it's important to formally distinguish axioms from definitions, especially for Web-based proof checking. If you download a random proof off the Internet, you really want to know that the definitions are "safe" in some formal sense.

I'm going to gloss over syntactic issues a bit. In Metamath, definitions are intertwingled with the grammar for parsing terms. In general, a new definition also extends the grammar. The main problem with this is that two grammar extensions, while themselves unambiguous, may lead to ambiguity when combined. Also, it's very important that the textual expansion matches the parse. In C, it's possible to violate this with amusing results:

#define square(x) (x * x)
int main() {
  printf ("The square of 1 + 1 is %d.\n", square(1 + 1));

So I'm basically going to assume that we're working with terms in a pre-parsed format (like Lisp S-expressions).

Ok, so here's how it works. A definition looks like this:

double ( A ) == ( A + A )
The first token is the thing being defined. There's only one definition for each token. If you try to define the same token twice, it's an error.

Next comes a list of variables. Following the Metamath convention, there's a previous declaration of the "type" of each variable. In set.mm, A is of type "class", which is the right one for numbers.

The == symbol means it's a definition. I chose it to avoid conflict with =, which is plain ol' equality.

Finally, there's the expansion of the definition, which is pretty much as expected. One caveat: it can only refer to definitions that have come previously; circular definitions are not allowed.

Ok, now the rule for expanding. For this particular definition, you can safely expand it anywhere it occurs, but that won't be true in general. So we restrict expansion to a single, especially safe place: between an assertion and its proof. If the final step of the proof exactly matches the expanded version of the assertion, we accept (the unexpanded version of) the assertion as proved.

For example, let's say we wanted to prove the assertion: double ( 2 ) = 4. Its expansion is ( 2 + 2 ) = 4. As it happens, this is already proved in set.mm, so the body of the proof can be just 2p2e4. Since the result matches the expansion, we accept the orignal assertion as proved.

Now for the tricky bit. The right-hand-side of a definition may contain dummy variables. An example is: V == { x | x = x }. (This is the "universal class" containing all sets). In this particular case, introducing the dummy variable in an expansion is safe, but that's only because the variable x is bound. Metamath doesn't make a formal distinction between free and bound variables, largely because it's pretty tricky to get right. Instead, it uses "distinct variables".

We add two rules for dummy variables: the new dummy variables must not appear in the original, unexpanded assertion; and, if a definition is expanded more than once within a single assertion, all the dummy variables must be the same.

Let's say you had the degenerate definition: bad == x. If you let the dummy variable x appear in the original assertion, you'd be able to prove junk such as both "A. x x = bad" and its negation. The flip side is that it's no longer trivial to prove things such as: V = { x | x = x}. The body of the proof has to establish { x | x = x } = { y | y = y}. Fortunately, cbvabv takes the main burden of such a proof.

The second rule is more subtle. If you let them sometimes be the same, sometimes different, you could arrive at a contradiction, so it clearly has to be one or the other. I think it's safe to make them all distinct (and it seems more intuitive), but making them all the same preserves truth better. Take "bad = bad" for example. This is trivially provable (using cleqid) without doing the expansion, but if all dummies needed to be distinct, the expansion would be "x = y", which is not provable. I prefer the provability of terms containing definitions to be exactly the same as their fully expanded counterparts.

Apologies for the somewhat detailed technical nature of this post. I do believe, though, that this kind of definitional framework is much better suited to proof interchange on the Web than Metamath proper.


We had a nice Thanksgiving at Heather's dad and with her sister.

The day before, Heather pointed out the spectacular colors of the sunset to Alan, and he remarked, "that looks just like a mandrill's butt," completely deadpan.


The fine folks at Azoogle have figured out how to "publish" email past my Spamassassin setup. In addition, to quote from their web page, "By securing contracts with each of these major email companies [America Online, Yahoo, Hotmail, MSN] it guarantees your emails will get to your customers without being blocked, rerouted or suppressed."

They claim that their service is opt-in only. The major mail providers probably take them at their word. A test account at Yahoo doesn't currently show any from Azoogle, but plenty from, for example, email-publisher.com (which is actually Topica). This is the stuff that gets past their laughable Spamguard(TM), mind you.

This all suggests quite strongly that digital identity alone probably won't make that big a dent into spam, contrary to Jon Udell's assertion. Some spammers might have trouble getting a digital id from whoever's issuing them, but my guess is that the likes of Azoogle will have no trouble at all.

I also got an interesting 419 mail from one "Ade Blessing". This one was full of Jesus-speak, and mentions early on that the author was a Moslem before he converted to Christianity. I hate spam, but have mixed feelings about the 419 stuff. The victims, after all, have to be pretty dishonest to fall for it, and it's a net transfer of money from a rich country to a poorer one, much of which actually subsidizes Internet access for ordinary citizens. In this particular case, the victims are most likely anti-Islam religious bigots as well.

Spam only exists because it works. Sigh.

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!