Older blog entries for crhodes (starting at number 88)

OK, so what were the weasel words in my last entry about Norvig's Prolog implementation? Well, Prolog has a number of control constructs, which look like functors except that their semantics, in one important respect, are not as regular.

(Aside: in many ways, these control constructs of Prolog are similar to Lisp's concept of a special operator; neither Prolog nor Lisp provides for a way to extend the initial set of predefined magical constructs, but Lisp's macros provide sufficient control over evaluation semantics that things which look like special operators can be defined, while I'm pretty sure that unification in Prolog allows for similar implementation of new control constructs.)

Now, the issue with these special operators, and in particular the ->/2 and ->/2 ;/2 ‘if-then’ and ‘if-then-else’[*] control constructs have particular semantics in conjunction with the cut, !/0: the then and then-else portions of those operators are transparent to the cut, meaning that backtracking is inhibited from the predicate which is defined, rather than any notional -> predicate.

This means that any definition of -> as a predicate is doomed to fail. Norvig effectively defines if/2 (corresponding to ->/2 as

(<- (if ?test ?then) (call ?test) (call ?then))
which has the right semantics except when one of ?test or ?then contains a cut.

I have not yet fixed this is my local extension of Norvig's Prolog; I'm pretty sure I know how to do it (my experience of Lisp compiler technology is enough to tell me how to implement special forms; essentially, what needs to be done is to teach the compiler about if-then and if-then-else, and then implement call/1 by requiring it to compile a dummy uninterned predicate), but I've been occupied with parsing ISO prolog instead. One thing which is demotivating me from making this fix is the somewhat lamentable state of the test suites available for ISO Prolog: one from the committee and one from INRIA, neither of which tests any of the difficult cases from the standard. Now I realise how spoilt we in the Common Lisp community are by pfdietz and his test suite.

In case anyone is wondering what the cut is: passing a cut during the execution of a predicate prevents backtracking past that point: any attempt to backtrack will simply cause failure. For more details, see your nearest Prolog tutorial, because my limited experience prevents a good didactic presentation.

[*] these are in fact very bad names.

I've been having adventures in Prolog recently. Fear not, I haven't abandoned Lisp for all that! However, Real Work has its own requirements, and one such was that I get up to speed in Prolog rather rapidly.

So, learn by implementing. Well, Peter Norvig, in his book Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp (buy it, borrow it: run, don't walk) implements a fair proportion of a Prolog engine, including a full (or almost: more on this another day) compiler. Reading the book, and the accompanying code, was both entertaining and educational, and to familiarize myself with bits of Prolog that weren't covered by the book, I found a draft copy of the ISO Prolog standard, and started implementing missing functors.

However, some of the Prolog code I'm working with (ab)uses operator notation: it depends on being able to talk to Matlab and PostgreSQL, and uses operators to indicate with minimal character syntax how to interpret Prolog terms in these contexts (is it a string, a cell array, an mxArray?)

So I also needed to understand how Prolog is parsed. As it happens, I've been looking at the Climacs text editor project for a while; it's been growing a parser framework based on a slight modification of Earley's efficient algorithm (ACM tax required). Armed with this, and liberal amounts of debugging code and beer, yesterday I got it to the point when it could demonstrably parse Prolog text at least semi-reliably. It's not blindingly fast at the moment; removing the debugging code should help, as should hooking into the Climacs framework which allows reparsing only regions which both have changed and are displayed on the screen.

Once the editor support is acceptable, it should be relatively straightforward to finish off the underlying Prolog implementation (I need a parser for read_term/2 and friends); whether anyone else will ever find this useful I don't know, but having a lisp/1 functor as the Prolog's foreign function interface has a certain coolness factor...

What's new? Not much, I suppose. I have been thinking and working on data structures for indexing multidimensional data; mostly implementing them for now (release pending a refactor and one more tree implementation), but with a view to doing actual research when my work computer finally arrives. On which subject, I've also been following Juho Snellman's astonishingly rapid progress through making SBCL work natively on the AMD64 – I look forward to him fixing all remaining problems in the next two days.

I could mention my amusing* experiences with a letting agency, but I suppose there's an outside chance there could be people with nothing to do all day and internet access in there, so maybe I'll wait until after I have the keys in my grubby little hand.

* not really.

Happy Democracy Day.

My contribution to the American understanding of foreign cultures on this most important of days is the final merge of the SBCL development that I've been doing on character_branch onto CVS HEAD. The technical details in the development aren't particularly interesting, I think, except that I am pleased that I managed to arrange it such that the system can be configured with full Unicode support or not as a build-time option. There are still things that are broken, but, well, I'm too tired to fix them right now. Here's hoping that some of the other 13 SBCL developers can pick up some of the slack...

What might be of interest to more than the minuscule (or, more charitably, ‘embryonic’) Common Lisp community, though, was the management of the development branch and the merge. The branch underwent some 46 distinct CVS revisions, and one factor that I believe helped the management of a spare-time project like this was the discipline of not checking in manifestly broken code. A temptation when working on invasive changes such as these (diffstat says 148 files changed, 2965 insertions(+), 1786 deletions(-)) is to check in ‘obviously’ right code that happens to be untestable. Don't do that. The reason this discipline helped was that the merge to HEAD happened in a completely different order from the development on the branch: development is assisted by working on one feature at a time and testing the new feature for being broken, whereas the merge – which will preserve both codepaths to an extent – needs to absorb things which are common to both first, exposing the true differences between the systems. To put some more numbers on that, only approximately 4% (40kB from a patch of over 1MB) of the patch ended up being conditionally compiled: the rest was common to both build options.

The moral? Well, it won't be surprising to too many people reading this, I suppose, but: “Keep the system working as much as possible through development as well as in public.” Enjoy the new expressivity.

chalst says
For what it's worth, an outsiders view of what Common LISP needs[...]

The problem with this is that one thing that's worse than an insider's view of what Common Lisp needs is an outsider's view.

Why should that be? Well, the goal of evolving Common Lisp isn't, or shouldn't be, to turn it into something which looks and quacks just like the fashionable duck of the day. So a Scheme adept might bemoan the lack of full multi-shot re-entrant continuations, while ignoring (or being ignorant of) the fact that requiring such effectively kills off any hope for compilation to efficient code. An ML wizard might want greater support for typechecking; a C++ expert might hope for greater templating. All of these are solutions looking for problems in the context of Common Lisp: the lack of multi-shot re-entrant continuations, templates or statically-typed Lisps is not stopping people from getting work done, and there is no demand from people currently using Common Lisp for these features.

Would adding these features instead encourage people from other communities to leap on the Common Lisp bandwagon? I think that's unlikely: why accept Common Lisp with Schemey bits tacked on if you already have Scheme? Only if CL provides you with things that Scheme does not and cannot will that work: and let us not forget that implementing these abstractly-requested features would take time, and lots of it — probably enough for R6RS to appear.

In one respect, in any case, comparison with SRFIs is misleading: Scheme needed the SRFIs to allow people to decide on a shared vocabulary for completely ordinary programs, for the utility functions that previously everyone was having to implement for themselves: it's not hard to write remove-if-not, but if everyone has to do it then that's so much less time to spend on solving the actual problem. In other respects, of course, the SRFIs are doing useful work, but a brief perusal seems to indicate that most of the implemented SRFIs are essentially replicating the standard Common Lisp “library” (for want of a better word).

So what should CLRFIs be doing, then? Well, if I had infinite free time I might write an external-format CLRFI, to harmonize treatment of different character sets when talking to the outside world. A networking CLRFI might be nice, but I think it's probably still not clear how to make it sufficiently platform-independent as to allow natural implementation everywhere CL already exists; it might be better to attempt to standardize higher-level networking (at the protocol level) and allow implementations to use whichever socket layer is to hand. I've already discussed chalst's desire for continuations: there are portable libraries for achieving single-shot continuations, or macros enabling a natural expression of continuation-passing style, but requiring implementations to provide native full first-class continuations is a non-starter. Hygenic macros in Common Lisp would be a solution to a non-problem: given the years of experience in programming in the language (not my years alone: other people's years too) it is clear that the lack of hygiene in the macro system is simply not an issue to the Lisp community. As for #! support, it's really not clear to me why that should need to be available in a standard fashion: given that it's clearly a deployment rather than a development issue, by that stage an implementation choice will have been made, and one can simply use that implementation's facility.

That leaves, from chalst's wishlist, an FFI (where the "F" there actually means "C") standard. And here there is definitely a case for wanting firmer ground to stand on: people using the language at this very moment would prefer the various Lisp vendors to get over their issues with each other and just coöperate. The good news is that it's not all bad news: UFFI is a “user-space” solution to some of those needs (though not all, because by its nature it supports the intersection of available features). A good first step might be a survey of the different implementation's provisions with a view to establishing a reasonable union of features to request from the vendors, and then applying pressure or submitting patches. (Which reminds me: when I have enough free time, unless someone does it before me, I must look at the user-contributed suppport for callbacks on the PPC... time is the key).

Here's a less silly screenshot reflecting the progress of SBCL's upcoming Unicode support. A limited number of external formats have been implemented: enough to act as proof of concept, as demonstrated there. The TODO file is beginning to read more like notes of the process we went through rather than the mountain that has yet to be climbed, which is definitely cheering.

In other news, I went to Amsterdam last weekend for a local lispmeet (well, it was flagged as benelux, but when you're getting people from the UK and Sweden, it's looking more than just that.) There was much of interest, from Peter van Eynde's stealth introduction of Lisp resulting in him being “forced” to do Unix systems engineering in his favourite language, through Edi Weitz' rdnzl .net/Lisp bridge, to the ever-present question of evolution of Lisp: this time Nick Levine represented the clrfi-editors (those who know Scheme will spot the similarities with the SRFIs) with a plea for a high-quality submission to see if their process could be unblocked.

On this last, my view is that the Lisp community, whatever that is, certainly needs a process to evolve the language (preferably backwards-compatibly, but some things might need to be removed root-and-branch), it also needs people to do the work to produce high-quality standardization material: and it's more the lack of that that's lacking, and no amount of process will generate it. I'd love to be proved wrong.

SBCL Unicode work is progressing well. It's nearly at the stage where it becomes useful: it's already at the stage where we can do silly demos such as in this screenshot. Clearly, this isn't going to make it into sbcl 0.8.15; however, I have hopes that it will into 0.8.16 (or should that be 0.9.0 or even 0.1point0.pre1? We still have a little work to do before everybody's happy with supporting a 1.0 release: the issue isn't so much the stability of the code (which was engineered to last — parts of it are decades old!) as the desirability of supporting a suboptimal interface as stable; the main benefit from a developer's point of view of a pre-1.0 release is the ability to laugh away complaints about lack of backwards compatibility. Such days may soon be behind us.

We live in interesting times. SBCL has acquired a new web page layout (with some extra content, too!); Nathan has almost completed the 64-bit work which I dropped after submitting my thesis; Nikodemus, when he returns from studying hard, is likely to give us linkage tables (you can solve everything in Computer Science with an extra layer of indirection) and a single stepper; and I, along with some friendly helpers (and armed with four years of discussions on #lisp IRC), have started the slow process of making SBCL Unicode-aware.

With luck, people will use these exciting new features. It's sometimes hard to tell whether one is programming into a vacuum: I count myself lucky that I can work on SBCL projects for intellectual satisfaction, though I confess that I do prefer it when there is visible positive feedback from actual users...

Edinburgh was fun. Hard work, but fun. Singing with a nice bunch of people is always a good thing, and when we're doing such challenging masterworks as Tallis' Spem in alium (where everyone has to pull their weight: it's for eight choirs of five parts each...) it's good for the brain cells too: certainly better for them than the post-concert social occasions, aided and abetted by Scotland's sane licensing laws — unlike in England, alcohol may be served after 11:00pm.

Juho Snellman is a smart cookie.

SBCL is a large, complex beast: I've talked enough about how hard it is for a system to give the right answer, but when people start wanting it to give the right answer quickly, well... Juho Snellman has single-handedly found improvements to string hashing routines, to poor scaling behaviour with multiple threads, and now bignum division. It appears not to solve the performance problems that we're having with GSharp (which are probably algorithmic in nature: why are we doing arbitrary-precision calculations, anyway), but I should hope that some of the graphs in Andreas Fuchs' boinkmarks (once the nightly run has completed) will look interesting.

79 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!