I've recently been working on improving SBCL's build behaviour.
Well, perhaps that's not exactly a novel or surprising piece of information for this diary, but maybe this is the last major effort that has to be made. One of SBCL's raisons d'être is to be a Common Lisp compiler, written in Common Lisp, that does not have any dependencies on the host Lisp compiler used to build it (assuming that the host implementation is sufficiently conforming, but not making any unwarranted assumptions about any implementation-defined or undefined behaviour). This method for building a Lisp compiler is sufficiently tricky to wrap one's head around that I wrote and delivered a paper describing it, including diagrams, for the Workshop on Self-Sustaining Systems last year.
So what's new now? Well, given this aim, it would be reasonable to expect the SBCL cross-compiler, running as an application in the host Lisp, to produce bitwise-identical output files independent of which host Lisp was being used. Reasonable, perhaps, but in practice the state of SBCL until a month or so ago was quite a long way away from this ideal: every output file had a header declaring when and from which compiler it had been generated, which was not going to help bitwise comparisons; and even after such straightforward issues had been dealt with, every single output file from the build process (of which there are of the order of 350) exhibited non-trivial differences when compiled with CLISP from the corresponding output files compiled with SBCL.
The differences between output files that I've observed and fixed in the last month or so – by lavish use of cmp(1), staring at emacs buffers consisting largely of control characters, and the time-honoured debugging method of Thinking Very Hard – can be broadly split into three categories:
- outright leaks: information being taken from the host compiler and erroneously treated as though it was applicable to the target. Sometimes this was simple carelessness (the cross-compiler's constant-folding for symbols used symbol-value, even for things like most-positive-fixnum); sometimes it was nastier than that (CLISP and SBCL disagree on the value of (log 2d0 10d0) in the last binary digit). It's not nice to find these things still showing up, but at least this set of changes should make these bugs stick out like a sore thumb if they're ever reintroduced (or still present on other platforms...)
- traversal order: some operations have a net effect that's deterministic, but the order in which the sub-operations were performed was different; collecting a bunch of definitions by iterating over a hash table with maphash, or a set of storage classes with union, is going to be correct no matter the order, but that order differs between different implementations of those functions. Also in this class, though not strictly speaking an issue of traversal order, was the dependence on the host's interpretation strategy: whether forms were always compiled or not, which influences the number of times gensym is called and hence the value of *gensym-counter*.
- idiosyncrasies: working towards eliminating the effect of differences of interpretation of the standard. Most of these differences of opinion are legitimate: CLISP macroexpands more than SBCL does because it has an interpreter on by default, for instance; similarly, CLISP's compiler doesn't aggressively coalesce constants, whereas SBCL's does. Some of them are less so; CLISP, for instance, prints 'foo as "'FOO" whereas SBCL prints "(QUOTE FOO)" when *print-pretty* is nil. I think the standard mandates SBCL's behaviour, but for these purposes we have to work around it anyway.
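To make the traversal-order category concrete, here is a minimal sketch of one way to remove the dependence on the host's maphash ordering: collect the keys first, then sort them by something the host cannot influence. The function name sorted-hash-keys is hypothetical and this is not the actual change made to SBCL; it also assumes the keys are symbols with distinct names.

```lisp
;; Hypothetical helper, not taken from the SBCL sources.
;; Assumption: every key in TABLE is a symbol, and no two keys share a name.
(defun sorted-hash-keys (table)
  "Return the keys of TABLE in an order independent of the host's MAPHASH."
  (let ((keys '()))
    ;; MAPHASH may visit entries in any order, so just accumulate here...
    (maphash (lambda (key value)
               (declare (ignore value))
               (push key keys))
             table)
    ;; ...and impose a host-independent order afterwards.
    ;; KEYS is a fresh list, so the destructive SORT is safe.
    (sort keys #'string< :key #'symbol-name)))

;; Usage: the result is the same whatever order the entries were added in.
(let ((table (make-hash-table)))
  (setf (gethash 'cdr table) 2
        (gethash 'car table) 1
        (gethash 'cons table) 3)
  (sorted-hash-keys table)) ; => (CAR CDR CONS)
```

The same idea applies to the result of union: the set is right, so sorting it by a stable property before emitting anything makes the output order right too.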
In the process, I also fixed a terribly embarrassing bug in genesis, the Lisp application responsible for taking these output files from the cross-compiler and constructing a Lisp memory image file ready to start up. As has been discussed here in the past, the standard-mandated requirement on arrays in Common Lisp is not sufficient for us to be able to construct the image file in memory as an array of bytes: conforming implementations are permitted to impose a maximum array size as low as 1024 elements. Fortunately it is straightforward to implement a suitable data structure without this constraint; unfortunately, the implementation did not take sufficient care to zero-fill newly-allocated memory, and while most of the time Lisp implementations perform that zeroing, there are circumstances in which they don't.
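As an illustration of the kind of data structure involved, the following is a minimal sketch of a byte vector built from small chunks, each kept well under the smallest array size limit an implementation may impose. All the names (bigvec, +chunk-size+, make-chunk and so on) are hypothetical, and this is not genesis's actual code. The point it tries to show is the explicit :initial-element 0: without it, the contents of a freshly made array are undefined, and a host is free to hand back whatever happened to be in memory.

```lisp
;; Hypothetical sketch, not the implementation used by genesis.

;; Chunk size chosen to stay below the minimum ARRAY-TOTAL-SIZE-LIMIT of 1024.
(defconstant +chunk-size+ 512)

;; Chunks live in a hash table keyed by chunk index, so no single array
;; ever has to hold more than +CHUNK-SIZE+ elements.
(defstruct bigvec
  (chunks (make-hash-table)))

(defun make-chunk ()
  ;; The explicit zero fill is the important part: reading an element of an
  ;; array made without :INITIAL-ELEMENT or :INITIAL-CONTENTS is undefined.
  (make-array +chunk-size+
              :element-type '(unsigned-byte 8)
              :initial-element 0))

(defun ensure-chunk (bigvec chunk-index)
  "Return the chunk at CHUNK-INDEX, allocating a zero-filled one if needed."
  (let ((chunks (bigvec-chunks bigvec)))
    (or (gethash chunk-index chunks)
        (setf (gethash chunk-index chunks) (make-chunk)))))

(defun bigvec-ref (bigvec index)
  "Read the byte at INDEX; bytes never written read back as zero."
  (multiple-value-bind (chunk-index offset) (floor index +chunk-size+)
    (aref (ensure-chunk bigvec chunk-index) offset)))

(defun (setf bigvec-ref) (new-value bigvec index)
  "Store the byte NEW-VALUE at INDEX."
  (multiple-value-bind (chunk-index offset) (floor index +chunk-size+)
    (setf (aref (ensure-chunk bigvec chunk-index) offset) new-value)))

;; Usage: indices far beyond 1024 work, and untouched bytes are zero.
(let ((image (make-bigvec)))
  (setf (bigvec-ref image 100000) 255)
  (list (bigvec-ref image 100000)    ; the byte just written
        (bigvec-ref image 100001)))  ; a byte never written
```

Keying the chunks by index in a hash table is a choice made for brevity here; a list or tree of chunks would do equally well, as long as no single array is asked to be large.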
All of this has now been merged to SBCL's CVS, and is awaiting the 1.0.28 release due on Thursday 30th April. The practical upshot? Well, apart from a certain amount of increased confidence in the implementation strategy, and perhaps soothing the nerves of extremely paranoid distribution package maintainers, these changes should make compilation of SBCL on new platforms more straightforward, as there are now two implementations capable of building SBCL which are themselves buildable starting just from gcc and system libraries: CLISP and Peter Graves' XCL.