formalities and formalisms
one way of viewing programs is as "active" objects, about which
certian assertions may hold from time to time. in this view, people
say things like "what does this program do?" the program text
is viewed as suspicious, vulgar; to do "real" reasoning, you need to
relate the program text to something clean like mathematics or
formal logic where there are many good theorems (never mind the
ambiguity and duplication in their notations).
another way of viewing programs is to say that programs do
nothing; computers do things and programs are simply formal
suggestions to the computer. a program in a high level language is a
formal suggestion to another program (a compiler, expressed in a low-level language such as machine
code) about a sequence of i/o operations which may result in the
production of another program in a low level language. in all cases,
however, a program is a completely static entity; a single
formula with rules which may apply to describe legal transformations
of it.
in this latter view, the most important thing a program suggests is a
relationship between pre- and post- states of a computer. this "large"
relationship is stated as a sequence of instructions, but each
instruction suggests at a fundamental level nothing more than a
relationship of the pre- and post- states of a computer (including i/o
of course). the post-state of one instruction is the pre-state of the
next. this is a simplification of modern instruction pipelines but it
is often a sufficient level of detail for reasoning.
so suppose you write down an instruction using the "static" view of a
program. say:
1: x <- k * v
if instead you take the first view, you may then feel the need to
"annotate" this with the "formal" property that the post-state of the
computer contains a variable whose arithmetic value is the product of
two others:
// annotation: { post[1][x] = pre[1][k] * pre[1][v] }
1: x <- k * v
it may strike you as suspicious at this point that you've just written
down two nearly identical expressions, but recall that in this view
the program is an "active" object (subject to wild fits of
unpredictability) and the annotation spec, in some more "mathematical"
semantics, is the only "formal" object.
what this means, when you say that an expression is "formal", is that
the formal object is the only one you have confidence about
your reasoning about, and transformation and of. for example, say you
later have an instruction:
// annotation: { post[5][y] = pre[5][x] * pre[5][v] }
5: y <- x / v
you can then perform the annotation reasoning:
post[5][y] = pre[5][x] * pre[5][v]
= post[1][x] * pre[1][v]
= pre[1][k]
and then you can (somehow) relate this back to the formal text of the
program to see that it's legal to rewrite as:
5: y <- k
but notice that to perform this reasoning, you had to perform a "round
trip" in and out of the "formal" system. after all, you cannot reason
about filthy active objects like programs; they might do anything! and
of course in order to faithfully do the "translation" back and forth
between the "active" and "formal" parts of the program, you need a
formal relationship between your "formal" notation and the
program text. oops! you've just accidentally raised the program text
to the level of a "formal" system anyways!
this is the convoluted thinking (unfortunately common, in "formal
methods" circles) that Hehner sets out to change. he thinks that the
program is a formal expression of a relationship between pre-
and post- states of a computer, and if you tidy up common programming
languages you will find buried inside them a simple boolean expression
logic which uses the programming language as its fundamental
terms. otherwise you spend all your time translating back and forth
between the program text and your "real" formalism, and you get so
much clutter that any reasonable programmer gives up in disgust.
lest you think this is completely implausible, have a look at ACL2.
it is a fully functioning, quite usable system which performs exactly
this sort of "reasoning about the program text" for a fully executable
lisp language.