Older blog entries for crhodes (starting at number 87)

I've been having adventures in Prolog recently. Fear not, I haven't abandoned Lisp for all that! However, Real Work has its own requirements, and one such was that I get up to speed in Prolog rather rapidly.

So, learn by implementing. Well, Peter Norvig, in his book Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp (buy it, borrow it: run, don't walk) implements a fair proportion of a Prolog engine, including a full (or almost: more on this another day) compiler. Reading the book, and the accompanying code, was both entertaining and educational, and to familiarize myself with bits of Prolog that weren't covered by the book, I found a draft copy of the ISO Prolog standard, and started implementing missing functors.

However, some of the Prolog code I'm working with (ab)uses operator notation: it depends on being able to talk to Matlab and PostgreSQL, and uses operators to indicate with minimal character syntax how to interpret Prolog terms in these contexts (is it a string, a cell array, an mxArray?)

So I also needed to understand how Prolog is parsed. As it happens, I've been looking at the Climacs text editor project for a while; it's been growing a parser framework based on a slight modification of Earley's efficient algorithm (ACM tax required). Armed with this, and liberal amounts of debugging code and beer, yesterday I got it to the point when it could demonstrably parse Prolog text at least semi-reliably. It's not blindingly fast at the moment; removing the debugging code should help, as should hooking into the Climacs framework which allows reparsing only regions which both have changed and are displayed on the screen.

Once the editor support is acceptable, it should be relatively straightforward to finish off the underlying Prolog implementation (I need a parser for read_term/2 and friends); whether anyone else will ever find this useful I don't know, but having a lisp/1 functor as the Prolog's foreign function interface has a certain coolness factor...

What's new? Not much, I suppose. I have been thinking and working on data structures for indexing multidimensional data; mostly implementing them for now (release pending a refactor and one more tree implementation), but with a view to doing actual research when my work computer finally arrives. On which subject, I've also been following Juho Snellman's astonishingly rapid progress through making SBCL work natively on the AMD64 – I look forward to him fixing all remaining problems in the next two days.

I could mention my amusing* experiences with a letting agency, but I suppose there's an outside chance there could be people with nothing to do all day and internet access in there, so maybe I'll wait until after I have the keys in my grubby little hand.

* not really.

Happy Democracy Day.

My contribution to the American understanding of foreign cultures on this most important of days is the final merge of the SBCL development that I've been doing on character_branch onto CVS HEAD. The technical details in the development aren't particularly interesting, I think, except that I am pleased that I managed to arrange it such that the system can be configured with full Unicode support or not as a build-time option. There are still things that are broken, but, well, I'm too tired to fix them right now. Here's hoping that some of the other 13 SBCL developers can pick up some of the slack...

What might be of interest to more than the minuscule (or, more charitably, ‘embryonic’) Common Lisp community, though, was the management of the development branch and the merge. The branch underwent some 46 distinct CVS revisions, and one factor that I believe helped the management of a spare-time project like this was the discipline of not checking in manifestly broken code. A temptation when working on invasive changes such as these (diffstat says 148 files changed, 2965 insertions(+), 1786 deletions(-)) is to check in ‘obviously’ right code that happens to be untestable. Don't do that. The reason this discipline helped was that the merge to HEAD happened in a completely different order from the development on the branch: development is assisted by working on one feature at a time and testing the new feature for being broken, whereas the merge – which will preserve both codepaths to an extent – needs to absorb things which are common to both first, exposing the true differences between the systems. To put some more numbers on that, only approximately 4% (40kB from a patch of over 1MB) of the patch ended up being conditionally compiled: the rest was common to both build options.

The moral? Well, it won't be surprising to too many people reading this, I suppose, but: “Keep the system working as much as possible through development as well as in public.” Enjoy the new expressivity.

chalst says
For what it's worth, an outsiders view of what Common LISP needs[...]

The problem with this is that one thing that's worse than an insider's view of what Common Lisp needs is an outsider's view.

Why should that be? Well, the goal of evolving Common Lisp isn't, or shouldn't be, to turn it into something which looks and quacks just like the fashionable duck of the day. So a Scheme adept might bemoan the lack of full multi-shot re-entrant continuations, while ignoring (or being ignorant of) the fact that requiring such effectively kills off any hope for compilation to efficient code. An ML wizard might want greater support for typechecking; a C++ expert might hope for greater templating. All of these are solutions looking for problems in the context of Common Lisp: the lack of multi-shot re-entrant continuations, templates or statically-typed Lisps is not stopping people from getting work done, and there is no demand from people currently using Common Lisp for these features.

Would adding these features instead encourage people from other communities to leap on the Common Lisp bandwagon? I think that's unlikely: why accept Common Lisp with Schemey bits tacked on if you already have Scheme? Only if CL provides you with things that Scheme does not and cannot will that work: and let us not forget that implementing these abstractly-requested features would take time, and lots of it — probably enough for R6RS to appear.

In one respect, in any case, comparison with SRFIs is misleading: Scheme needed the SRFIs to allow people to decide on a shared vocabulary for completely ordinary programs, for the utility functions that previously everyone was having to implement for themselves: it's not hard to write remove-if-not, but if everyone has to do it then that's so much less time to spend on solving the actual problem. In other respects, of course, the SRFIs are doing useful work, but a brief perusal seems to indicate that most of the implemented SRFIs are essentially replicating the standard Common Lisp “library” (for want of a better word).

So what should CLRFIs be doing, then? Well, if I had infinite free time I might write an external-format CLRFI, to harmonize treatment of different character sets when talking to the outside world. A networking CLRFI might be nice, but I think it's probably still not clear how to make it sufficiently platform-independent as to allow natural implementation everywhere CL already exists; it might be better to attempt to standardize higher-level networking (at the protocol level) and allow implementations to use whichever socket layer is to hand. I've already discussed chalst's desire for continuations: there are portable libraries for achieving single-shot continuations, or macros enabling a natural expression of continuation-passing style, but requiring implementations to provide native full first-class continuations is a non-starter. Hygenic macros in Common Lisp would be a solution to a non-problem: given the years of experience in programming in the language (not my years alone: other people's years too) it is clear that the lack of hygiene in the macro system is simply not an issue to the Lisp community. As for #! support, it's really not clear to me why that should need to be available in a standard fashion: given that it's clearly a deployment rather than a development issue, by that stage an implementation choice will have been made, and one can simply use that implementation's facility.

That leaves, from chalst's wishlist, an FFI (where the "F" there actually means "C") standard. And here there is definitely a case for wanting firmer ground to stand on: people using the language at this very moment would prefer the various Lisp vendors to get over their issues with each other and just coöperate. The good news is that it's not all bad news: UFFI is a “user-space” solution to some of those needs (though not all, because by its nature it supports the intersection of available features). A good first step might be a survey of the different implementation's provisions with a view to establishing a reasonable union of features to request from the vendors, and then applying pressure or submitting patches. (Which reminds me: when I have enough free time, unless someone does it before me, I must look at the user-contributed suppport for callbacks on the PPC... time is the key).

Here's a less silly screenshot reflecting the progress of SBCL's upcoming Unicode support. A limited number of external formats have been implemented: enough to act as proof of concept, as demonstrated there. The TODO file is beginning to read more like notes of the process we went through rather than the mountain that has yet to be climbed, which is definitely cheering.

In other news, I went to Amsterdam last weekend for a local lispmeet (well, it was flagged as benelux, but when you're getting people from the UK and Sweden, it's looking more than just that.) There was much of interest, from Peter van Eynde's stealth introduction of Lisp resulting in him being “forced” to do Unix systems engineering in his favourite language, through Edi Weitz' rdnzl .net/Lisp bridge, to the ever-present question of evolution of Lisp: this time Nick Levine represented the clrfi-editors (those who know Scheme will spot the similarities with the SRFIs) with a plea for a high-quality submission to see if their process could be unblocked.

On this last, my view is that the Lisp community, whatever that is, certainly needs a process to evolve the language (preferably backwards-compatibly, but some things might need to be removed root-and-branch), it also needs people to do the work to produce high-quality standardization material: and it's more the lack of that that's lacking, and no amount of process will generate it. I'd love to be proved wrong.

SBCL Unicode work is progressing well. It's nearly at the stage where it becomes useful: it's already at the stage where we can do silly demos such as in this screenshot. Clearly, this isn't going to make it into sbcl 0.8.15; however, I have hopes that it will into 0.8.16 (or should that be 0.9.0 or even 0.1point0.pre1? We still have a little work to do before everybody's happy with supporting a 1.0 release: the issue isn't so much the stability of the code (which was engineered to last — parts of it are decades old!) as the desirability of supporting a suboptimal interface as stable; the main benefit from a developer's point of view of a pre-1.0 release is the ability to laugh away complaints about lack of backwards compatibility. Such days may soon be behind us.

We live in interesting times. SBCL has acquired a new web page layout (with some extra content, too!); Nathan has almost completed the 64-bit work which I dropped after submitting my thesis; Nikodemus, when he returns from studying hard, is likely to give us linkage tables (you can solve everything in Computer Science with an extra layer of indirection) and a single stepper; and I, along with some friendly helpers (and armed with four years of discussions on #lisp IRC), have started the slow process of making SBCL Unicode-aware.

With luck, people will use these exciting new features. It's sometimes hard to tell whether one is programming into a vacuum: I count myself lucky that I can work on SBCL projects for intellectual satisfaction, though I confess that I do prefer it when there is visible positive feedback from actual users...

Edinburgh was fun. Hard work, but fun. Singing with a nice bunch of people is always a good thing, and when we're doing such challenging masterworks as Tallis' Spem in alium (where everyone has to pull their weight: it's for eight choirs of five parts each...) it's good for the brain cells too: certainly better for them than the post-concert social occasions, aided and abetted by Scotland's sane licensing laws — unlike in England, alcohol may be served after 11:00pm.

Juho Snellman is a smart cookie.

SBCL is a large, complex beast: I've talked enough about how hard it is for a system to give the right answer, but when people start wanting it to give the right answer quickly, well... Juho Snellman has single-handedly found improvements to string hashing routines, to poor scaling behaviour with multiple threads, and now bignum division. It appears not to solve the performance problems that we're having with GSharp (which are probably algorithmic in nature: why are we doing arbitrary-precision calculations, anyway), but I should hope that some of the graphs in Andreas Fuchs' boinkmarks (once the nightly run has completed) will look interesting.

I've put up version one of proceedings of the lightning (5-minute in theory, closer to 10-minute in practice) talks from the Bordeaux Libre Software Meeting Lisp track. Not everyone made my arbitrary deadline, but most did: and for those who didn't, there is the possible lifeline of a version two at some point in the future. (I hope we'll get Gilbert Baumann's thoughts on reader/editor integration, and Tim Moore's on CLIM interoperability: it was interesting stuff).

Comments and corrections are welcome: either to the e-mail addresses listed therein, or on freenode's #lisp channel, where quite a lot of the authors hang out.

As for my “keynote” talk, which I'm conscious that I haven't discussed yet: essentially, I attempted to make the case for protocol-oriented programming.

“What is protocol-oriented programming?” the cry rang out. Well, it probably already has a technical meaning (I confess: I didn't check), but the way I mean it is as at least cursory thinking about whether it makes sense to document the means by which a program (or function, or basic computational unit) achieves something, as well as what it is intended to achieve.

Why would one want to do that? Well, in languages less impoverished than C and its cousins, it is possible to hook into provided functionality: Lisp in particular is very trusting, allowing the user (or, perhaps the author of library Y) to override behaviour in library X. A protocol is then the contract between these two libraries (and any overriding that has occurred) so that the resulting mixture continues to function as expected.

An obvious example of a protocol in this sense is the CLOS Metaobject Protocol: libraries or users are allowed to selectively override slot-value-using-class, but if it's overridden inconsistently with (setf slot-value-using-class), expect Bad Things to happen (so Don't Do That Then). However, I believe that there's much greater scope for protocol-oriented programming than has hitherto been realized: for instance, I bet the McCLIM people would love a protocol for fontconfig, rather than an opaque library. And so on, and so on.

Part of the point I was making was that the Lisp community is too small to build isolationist libraries, while undisciplined development (first, take a copy of the library; then, modify the source) leads to a scaling problem and a combinatorial explosion, making it difficult for any given user to use the enhancements by two other people. Of course, the downside is that protocol-oriented programming — or exposing your internals (see, it even sounds disgusting) — is difficult. Film at 11.

78 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!