11 Aug 2002 graydon   » (Master)

formalities and formalisms

one way of viewing programs is as "active" objects, about which certian assertions may hold from time to time. in this view, people say things like "what does this program do?" the program text is viewed as suspicious, vulgar; to do "real" reasoning, you need to relate the program text to something clean like mathematics or formal logic where there are many good theorems (never mind the ambiguity and duplication in their notations).

another way of viewing programs is to say that programs do nothing; computers do things and programs are simply formal suggestions to the computer. a program in a high level language is a formal suggestion to another program (a compiler, expressed in a low-level language such as machine code) about a sequence of i/o operations which may result in the production of another program in a low level language. in all cases, however, a program is a completely static entity; a single formula with rules which may apply to describe legal transformations of it.

in this latter view, the most important thing a program suggests is a relationship between pre- and post- states of a computer. this "large" relationship is stated as a sequence of instructions, but each instruction suggests at a fundamental level nothing more than a relationship of the pre- and post- states of a computer (including i/o of course). the post-state of one instruction is the pre-state of the next. this is a simplification of modern instruction pipelines but it is often a sufficient level of detail for reasoning.

so suppose you write down an instruction using the "static" view of a program. say:

1:  x <- k * v

if instead you take the first view, you may then feel the need to "annotate" this with the "formal" property that the post-state of the computer contains a variable whose arithmetic value is the product of two others:

// annotation: { post[1][x] = pre[1][k] * pre[1][v] }
1:  x <- k * v

it may strike you as suspicious at this point that you've just written down two nearly identical expressions, but recall that in this view the program is an "active" object (subject to wild fits of unpredictability) and the annotation spec, in some more "mathematical" semantics, is the only "formal" object.

what this means, when you say that an expression is "formal", is that the formal object is the only one you have confidence about your reasoning about, and transformation and of. for example, say you later have an instruction:

// annotation: { post[5][y] = pre[5][x] * pre[5][v] }
5:  y <- x / v

you can then perform the annotation reasoning:

post[5][y] = pre[5][x] * pre[5][v]
           = post[1][x] * pre[1][v]
           = pre[1][k]

and then you can (somehow) relate this back to the formal text of the program to see that it's legal to rewrite as:

5:  y <- k

but notice that to perform this reasoning, you had to perform a "round trip" in and out of the "formal" system. after all, you cannot reason about filthy active objects like programs; they might do anything! and of course in order to faithfully do the "translation" back and forth between the "active" and "formal" parts of the program, you need a formal relationship between your "formal" notation and the program text. oops! you've just accidentally raised the program text to the level of a "formal" system anyways!

this is the convoluted thinking (unfortunately common, in "formal methods" circles) that Hehner sets out to change. he thinks that the program is a formal expression of a relationship between pre- and post- states of a computer, and if you tidy up common programming languages you will find buried inside them a simple boolean expression logic which uses the programming language as its fundamental terms. otherwise you spend all your time translating back and forth between the program text and your "real" formalism, and you get so much clutter that any reasonable programmer gives up in disgust.

lest you think this is completely implausible, have a look at ACL2. it is a fully functioning, quite usable system which performs exactly this sort of "reasoning about the program text" for a fully executable lisp language.

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!