Thanks to raph, Michael Norrish was able to get in touch with me. Here is his message (it took me rather a long time to post this, but better late than never) which clarifies the status of HOL:

Dear Charles,I got to your Advogato diary entries through those of Raph Levien's. Just a few things about HOL:

Regards,

- It's not based on System F, but a polymorphic version of Church's simple type theory, making its type system even simpler than that of ML. This also means that it's easy to give it a set theoretic semantics.
- HOL is a direct descendent of LCF, the project that invented ML. HOL's first developer was Mike Gordon was one of the people that worked on LCF with Robin Milner in Edinburgh and Stanford. LF is later, mainly unrelated, project.
Michael.

raph: I wish it was normal for most university professors to apply the same intellectual energy and curiosity that you do to fields outside their main area of enquiry... To answer the questions you ask:

- A theory of functions doesn't have a consequence relation, but
Goedel showed, in his Dialectica paper, how it is possible for a
calculus of functions to have the same mathematical strength as a proof
system. This technique is called functional interpretation, and it is
a very important technique in proof theory. Under this system, the
pure predicate calculus (ie. FOL) can be captured by the simply-typed
lambda calculus, Peano Arithmetic by the extension of the simply-typed
lambda calculus by an iterator, and Z_2 by the polymorphic lambda
calculus.
Your question was: how is it that the polymorphic lambda calculus (ie. System F) can have the same strength as Z_2? It's hard to explain without reference to Girard's proof of strong-normalisation for system F, but the essential idea is that (i) the impredicative quantification over types possessed by system F is powerful because a lambda-term possessing this form of type abstraction is abstracted over its *own* type; (ii) this quantification doesn't introduce vicious circles because we can stratify the terms of the calculus by their types, assigning a rank to them based on how deeply their most deeply nested type quantifier occurs, and the type system forbids application of terms of given rank to terms of equal or higher rank. In this scheme applications of Pi^1_k comprehension in Z_2 correspond to terms of rank k. This is a difficult idea: if you really want to understand it you should read Girard's book.

- In describing the above I avoided the details about classical vs. intuitionistic systems. With classical proof systems, the functional interpretation does not proceed as directly as for intuitionistic systems, and we need to do a bit of meta-mathematical massaging to get our correspondence. The intuitionistic logics, such as the calculus of constructions that lies behind Coq, that are in direct correspondence with lambda calculi of system F-like strength obviously do not have the principle of the excluded middle; rather less obviously it is not straightforward to add the principle without making the calculus inconsistent. So in fact classical vs. intuitionistic turns out to be an important distinction. Total vs. partial functions I think is not so important, since a good definitional scheme will allow you to move between the two representations.