Older blog entries for pfdietz (starting at number 4)

Hmm. Where does the time go?

The ANSI CL test suite is now at over 18,000 tests. Right now I'm testing interactions between various kinds of unusual vectors and the built-in string/sequence functions. This is mostly to support crhodes in his effort to add Unicode support to SBCL, but it's already found at least one five-year-old bug in CLISP (REVERSE didn't work on displaced vectors).
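
To illustrate the sort of case involved, here's a minimal sketch (names and data mine) of REVERSE applied to a displaced vector; a conforming implementation should reverse just the displaced portion:

    ;; BASE holds "abcdefghij"; DISPLACED is a length-5 window into it
    ;; starting at index 2, so it holds "cdefg".
    (let* ((base (make-array 10 :element-type 'character
                                :initial-contents "abcdefghij"))
           (displaced (make-array 5 :element-type 'character
                                    :displaced-to base
                                    :displaced-index-offset 2)))
      (reverse displaced))   ; should return a fresh copy, "gfedc"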

After that I'm going back to finishing the FORMAT tests. I'm still aiming for about 20K tests for the 'finished' test suite.

Common Lisp testing is coming along nicely. The GCL ANSI test suite is now over 14,000 tests. Most recently, I've nearly finished testing Chapter 12 of the Common Lisp specification, 'Numbers'.

To go beyond simple 'does function FOO do X when called on arguments Y' kinds of tests, I've also started experimenting with a random tester. I've taken the approach of William McKeeman (favorite quote: "Excellent testing can make you unpopular with almost everyone.") I generate random lisp forms that are guaranteed to be well-defined when a set of free variables have integer values. The forms are wrapped in two lambda expressions, one with type declarations for the free variables and one in which optimizations are inhibited. The functions are compiled and their outputs compared on a set of random inputs.
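
As a rough sketch of the comparison step (the random form generator is omitted, and the exact declarations here are my assumption, not the tester's actual code):

    ;; FORM is a random form known to be well-defined when the variables
    ;; in VARS are bound to integers; INPUTS is a list of argument lists.
    (defun compare-optimized-and-unoptimized (form vars inputs)
      ;; Optimized version: integer declarations plus aggressive optimization.
      (let ((fast (compile nil `(lambda ,vars
                                  (declare (type integer ,@vars)
                                           (optimize (speed 3) (safety 0)))
                                  ,form)))
            ;; Unoptimized version: no type information, optimization inhibited.
            (slow (compile nil `(lambda ,vars
                                  (declare (optimize (speed 0) (safety 3) (debug 3)))
                                  ,form))))
        ;; Any input on which the two compiled functions disagree indicates a
        ;; bug, since the form is well-defined by construction.
        (loop for args in inputs
              unless (equal (apply fast args) (apply slow args))
                collect (list form args))))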

This approach is very simpleminded (and a lot easier to do in Lisp than in C, as McKeeman did), but it immediately found new bugs in every Lisp implementation on which I tried it. The beauty of this approach is that it exploits increasingly cheap CPU cycles to save programmer/tester cycles, which are becoming relatively more expensive.

BTW, DEC (now HP) has a patent on this method of compiler testing. However, the patent assumes the compiler is a separate program, so Lisp is apparently off the hook (but IANAL).

5 Jun 2003 (updated 6 Jun 2003 at 00:45 UTC)

I've concluded the Dylan superclass linearization algorithm is preferable to the Common Lisp (CLOS) algorithm.

A refresher: in a hierarchy with multiple inheritance, some means must be found to determine which superclass a class should inherit entities from. One approach is to have an algorithm for totally ordering (linearizing) the superclasses ('computing the class precedence list (CPL)', as one might say in CL), and inheriting each entity from the first superclass in the list in which the entity occurs. Algorithms are allowed to reject certain hierarchies if they cannot be linearized in some acceptable way.
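
As a concrete (and uncontroversial) illustration, consider a small diamond hierarchy, on which both algorithms agree:

    ;; D inherits from B and C, which both inherit from A.
    (defclass a () ())
    (defclass b (a) ())
    (defclass c (a) ())
    (defclass d (b c) ())
    ;; Both CLOS and Dylan linearize D's superclasses as (D B C A ...),
    ;; so anything defined on both B and C is inherited from B.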

The CL and Dylan algorithms are similar, except that the Dylan algorithm is slightly more strict in the hierarchies it accepts. In return, Dylan's orderings have the property of monotonicity: if C1 is a subclass of C2, then the linearized superclasses of C2 occur in the same relative order in the CPL of C1.

Why do I care about this? I was writing CL compliance tests for the class precedence orderings specified in the ANSI CL spec. To do this, I wrote a generic function with the LIST method combination and added, for each standardized class, a method that returns that class's name. This should yield the names of the standardized superclasses of the class of a value, in the order they appear in the CPL.
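
Something like the following (a minimal sketch covering only a few standardized classes, with a name I've made up):

    (defgeneric standardized-class-names (x)
      (:method-combination list))

    ;; One method per standardized class, each returning that class's name.
    (defmethod standardized-class-names list ((x cons)) 'cons)
    (defmethod standardized-class-names list ((x list)) 'list)
    (defmethod standardized-class-names list ((x sequence)) 'sequence)
    (defmethod standardized-class-names list ((x t)) 't)

    ;; The LIST method combination calls the applicable methods in
    ;; most-specific-first order, so
    ;; (standardized-class-names '(1 2)) should be (CONS LIST SEQUENCE T).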

BUT... in general there's no guarantee that an object of type C (where C is a class) is a direct instance of C. It could be an instance of a subclass of C, and that subclass could have a different CPL. Without monotonicity, I cannot assume the superclasses of C still occur in the specified order.

This reconfirms my belief that writing a test suite tests not only an implementation, but also a specification. Untestable requirements are bugs.

cmm: yes, that's the algorithm.

30 May 2003 (updated 30 May 2003 at 11:47 UTC)

In ANSI CL, the defgeneric macro takes a :generic-function-class option that should be 'the name of a class that can be the class of a generic function.' Similarly, the :method-class option should be 'the name of a class that is capable of being the class of a method.'

However, nowhere in the standard (AFAICT) is any standard way to create such classes specified. I get the feeling the metaobject protocol was amputated here, and these are some of the scars.
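
For illustration (FROB is just a made-up name), the only values one can portably supply for these options are the defaults, since the standard gives no way to define alternatives:

    (defgeneric frob (x)
      (:generic-function-class standard-generic-function)
      (:method-class standard-method))
    ;; Naming anything other than the default classes here requires
    ;; implementation-specific (e.g. MOP-based) machinery.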

I've worked through about 2/3 of the CLOS section of the spec, having just finished writing tests for the built-in method combinations. After this section I think I'll write some tests of exactly when various def* forms take effect during compilation and loading, since this seems to me to be a fertile area for bugs in CL implementations and programs.

I've been wondering if Lisp without cons cells, or at least with cons cells deemphasized, would work better than Common Lisp. Vectors work better than linked lists on architectures like IA-64, have better cache locality, and admit more parallelism in loads (increasingly important as memory bandwidth grows faster than memory latency shrinks). Algorithms on vectors are often asymptotically faster than algorithms on lists; as computers become faster and the problems addressed become larger, this becomes more, not less, important.

Vectors could be allocated with some extra unused space after them, so they could be expanded in place (by a factor of 2, say) without having to be moved. That would let vectors grow incrementally at the tail in constant amortized time per operation. The distinctive features of Lisp, such as macros, could survive mostly unchanged.
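
Common Lisp already exposes something close to this through adjustable vectors with fill pointers; a small sketch:

    ;; An empty adjustable vector with a fill pointer, grown one element
    ;; at a time.  How much it grows by is implementation-dependent, but
    ;; with geometric (e.g. doubling) expansion each push is amortized O(1).
    (let ((v (make-array 0 :adjustable t :fill-pointer 0)))
      (dotimes (i 10)
        (vector-push-extend i v))
      v)   ; => #(0 1 2 3 4 5 6 7 8 9)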
