Older blog entries for crhodes (starting at number 140)

I feel I don't get to do very much hacking any more.

I shouldn’t complain, really: I have a decent and stable job, which is mostly fun; I have a certain amount of freedom in what I do, as long as everything that has to get done gets done; I work with all sorts of interesting people, both formally and informally. But things that I want to do have to live a long way down the priority queue; preparing lecture materials, paper drafts, committee agendas, bursary agreements, course proposals, courseworks, exams, student feedback, paper redrafts, reports, meeting notes, grant proposal drafts, paper reviews, examiners’ reports, reading lists, grant proposal redrafts, and the like all seem to take priority over even the research on a funded project that I am part of, let alone the discretionary research that I might actually want to do.

So sometimes I have to be sneaky, and combine my hacking with teaching-related work instead. One of the more fun things I’ve learnt over the last couple of years is enough colour theory to be dangerous; it started off because I was casting around for ideas on what to teach students on our Creative Computing programme – and I do teach them about colour, among other things – but it’s sufficiently interesting as a technical area in itself that I can see writing code to illustrate aspects of it. So, here’s a (not very good) colour picker “application” for McCLIM, whose only redeeming feature is that it uses knowledge of the colour attributes of consumer-grade display hardware to present colours of the same intensity together. That’s a bit hard to visualize, so here’s a screenshot, where all the colours in the triangle should seem to have about the same brightness (viewers might need to adjust their viewing angle):

Source code is here; I’m not particularly proud of it, and it needs work in all sorts of directions (optimizing, generalizing, cleaning up). One of the reasons I had put off blogging about this is that I was hoping for a lovely literate-programming system optimized for single-file Lisp programs to appear, generating HTML and PDF output from minimally-marked-up Lisp code. Sadly, that hasn't happened, and my best attempt can only be described as, well, deranged... so no impeccably formatted and indexed code snippets in this blog, not this time anyway.

I hosted SBCL10 this week; I'll be putting links to materials from the workshop as they come in. Things mostly seemed to work; minor failures along the way (for my reference if, heaven forfend, I organize another similar event) included the approximeeting arrangements with Martin Cracauer and James Y. Knight on Sunday at the Imperial War Museum; the hilarious failure to remember the coffee maker on Monday morning until halfway to the station; and of course relying on college catering to provide a light lunch at the time booked (rather than failing to do so and needing to be chased). One thing that I think did work well was the format: motivational talks to kick off, then hacking sessions interspersed with lightning talks – there was a good variety of stuff going on and stuff being talked about; even the open session at the end was focussed and productive. Particular thanks go to my local support team: Jamie Forth, Karen Hodgson, Richard Lewis and Wendy McDonald (and thanks to James Knight for the use of his AirPort Express); thanks also to my department for giving me the green light to organize this, along with an initial push.

Did I say “heaven forfend”? Now my attention must properly turn to the 2010 European Lisp Symposium; there is now a website, and the Call for Papers was sent to a wide variety of Lisp-related venues, so hopefully everyone knows about it now. Cunningly, the Call for Papers failed to include any guidance on a page count for submitted papers; 15 pages in the J.UCS style is the limit – but please submit through EasyChair, not to J.UCS!

Some more emacs lisp for interacting with launchpad by email (specifically, with Gnus). Previously, I wrote some code which allowed for easy transfer of a bug report by e-mail to launchpad; I've since adapted that to add a Cc to the original reporter, so that they know the bug has been filed (sadly too late for any of the reports that I have actually filed; maybe this blog can serve as a heads-up...)

However, this doesn't solve the entire issue, which is painless and seamless interaction. Comments to bugs are delivered by mail, and filtered using the X-Launchpad-Bug header to an appropriate mail directory, but replies to those comments need to be cryptographically signed for those replies to be accepted by launchpad. How to do that? Initially, I hoped that there would be some group parameter or posting style which would automatically insert the mml code for signing; suspicion alighted on `gnus-message-replysign', but unfortunately the messages that launchpad sends aren't signed, even if the ones that are sent to it must be.

It would also seem that there isn't an appropriate hook for this; all the hooks I could find seem to be run too early, and attempts to call `mml-secure-message-sign-pgpmime' gave me errors about a corrupt mail buffer (because the body separator hadn't been set up yet). So, instead, I ended up piggy-backing on the code handling gnus-message-replysign anyway, by advising the relevant function as follows:


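;; Launchpad's own mails are unsigned, so `gnus-message-replysign' on
;; its own never fires; request a signature for launchpad groups here.
(defadvice gnus-summary-handle-replysign
  (after handle-launchpad-replysign activate)
  (when (string-match "list.*-launchpad" gnus-newsgroup-name)
    ;; Drop any existing security tag, then ask for a PGP/MIME signature.
    (mml-unsecure-message)
    (mml-secure-message-sign-pgpmime)))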

The alert will note that this automatically signs replies not only to messages from my sbcl-launchpad group, but to messages from any of my list groups matching -launchpad. Is this just speculative generality, I hear you ask? No, because Alastair Bridgewater has kindly volunteered to participate in CLX development and release engineering, and his first and second acts were to set up a mailing list (hopefully permanent, this time, after the clozure and metacircles abandonments) and a launchpad bugtracker. So if you've been building up scads of patches and annoyances with my clx branch (or even worse, the ancient 0.7.3 release), now might be a good time to attempt to report the annoyances and integrate the patches; particularly from those still-active projects with heavy CLX use (e.g. StumpWM, Eclipse (no not that one) and McCLIM).
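On Emacs 24.4 or later, where `advice-add' is available, the same piggy-backing on gnus-summary-handle-replysign could be sketched roughly as below. This is only an illustration: the my/ names are hypothetical, and the group-name test is split out into its own function so that it can be tried without Gnus loaded.

```elisp
;; Sketch only: the my/ names are hypothetical, and `advice-add'
;; assumes Emacs 24.4 or later.
(defvar gnus-newsgroup-name)            ; defined by Gnus

(defun my/launchpad-group-p (group)
  "Return non-nil if GROUP matches the regexp list.*-launchpad."
  (and (stringp group)
       (string-match-p "list.*-launchpad" group)))

(defun my/launchpad-replysign (&rest _args)
  "Request a PGP/MIME signature when replying in a launchpad group."
  (when (my/launchpad-group-p gnus-newsgroup-name)
    ;; Same two calls as in the `defadvice' version.
    (mml-unsecure-message)
    (mml-secure-message-sign-pgpmime)))

(advice-add 'gnus-summary-handle-replysign
            :after #'my/launchpad-replysign)
```

The predicate can be checked by hand with a group name such as "nnml+private:list.sbcl-launchpad", which matches, and "nnml+private:list.sbcl-bugs", which does not.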

I released sbcl-1.0.33 last week; there's a good amount of new stuff in there, including support for NetBSD on the x86-64, some new introspection and tunable functionality, and also a whole chunk of Unicode and external-format work that I meant to do some months ago.

Today I got round to giving SBCL's website a little bit of an update; not only does it reflect the most recent release – something we've got a little bit lax on recently – but also I have updated my gpg key information. Since Launchpad seems to have taken for our bugtracking needs (at least I'm reasonably content with it), I've also taken the opportunity to link the bug numbers in the news page to the relevant bug entries in launchpad. This is still very Web 1.0, I recognize; I was at a networking event today where the highly eminent keynote speaker spent about 20 minutes essentially saying (and I paraphrase mercilessly) "social web, linked data, this is the FUTURE, it has reached a TIPPING POINT, we are on the road to DATA MASHUP WEB 3.0" and I feel relatively content to stick with my curmudgeonly Web 1.0 view of the world for now.

A while ago now, Nikodemus Siivola moved bug information for SBCL from a flat text file to Launchpad. Historically I have had almost nothing but displeasure working with the "standard" or "industrial" bug trackers; I have found Bugzilla horrible to work with, both as a bug reporter and as an administrator; lighter-weight solutions such as Trac are just about tolerable, but basically anything that requires me to have a Web Browser seems to end up confusing and distracting me. An honourable mention at this point goes to debbugs: being able to report, manipulate, update and close bugs by e-mail is close to my idea of Nirvana.

So, Launchpad. Initially, I was dismayed, because there doesn't seem to be a way of getting notifications of bug updates over RSS, which would be a second-best to getting updates by e-mail. I managed to ignore all SBCL bug reports for a while, but eventually I bit the bullet and signed up (having refused to do so a good long while ago, when shortly after I used Ubuntu's bugzilla to report a bug they closed the bugzilla in favour of launchpad without managing to transfer accounts across.)

A large motivation for signing up was the discovery that launchpad does, in fact, have an email interface to the bug tracker; as long as you can emit GPG-signed mail (which I can), it seems to have all the required functionality for doing things without needing to go near a web browser; I can now receive bug reports and reply to them, and in at least some cases the References: headers in the mail I receive allow my client to thread the discussion properly (I haven't really stress-tested this yet, but it works at least well enough for now.)

If that were all, this would not be news (and not even worthy of a blog post). But now I get to demonstrate my Emacs lisp “scripting” ability, in much the same way as Dan Barlow did for me many years ago: SBCL has a mailing list for reporting bugs, for people who are unsure as to whether their problem is a bug or not, or for people who don't want to go to the trouble to get a launchpad account just to report a bug. When such a report does describe a new bug that we should be tracking, that report needs to make its way to launchpad.

Without too much further ado, I present sbcl-bugs-mail-forward, which constructs a message (almost) ready to be sent:


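(defun sbcl-bugs-mail-forward ()
  "Forward the current sbcl-bugs article to launchpad as a new bug."
  (interactive)
  (let ((message-forward-ignored-headers "")
        from subject)
    ;; Argument 4: forward the message directly inside the buffer.
    (gnus-summary-mail-forward 4)
    (message-goto-to)
    (insert "new@bugs.launchpad.net")
    ;; The forwarded subject is expected to look like
    ;; "[sender] [list tag] subject"; pull out sender and subject.
    (message-goto-subject)
    (message-beginning-of-line)
    (re-search-forward
     "\\[\\(.*\\)\\].*\\[\\(.*\\)\\] \\(.*\\)$")
    (setq from (match-string 1) subject (match-string 3))
    ;; Replace the header text with the bare subject.
    (message-beginning-of-line)
    (let ((kill-whole-line nil))
      (kill-line))
    (insert subject)
    ;; Launchpad commands go in the body, each indented by a space.
    (message-goto-body)
    (insert "Report from " from "\n\n")
    (insert " affects sbcl\n status confirmed\n importance ")
    ;; `save-excursion' leaves point after "importance ", ready for
    ;; the importance to be typed in by hand.
    (save-excursion
      (insert "\n tag \n done\n\n")
      ;; Remove the "Start of forwarded message" separator line.
      (message-goto-body)
      (re-search-forward
       "^\\(-\\)+ Start of forwarded message \\(-\\)+$")
      (beginning-of-line)
      (let ((kill-whole-line t))
        (kill-line))
      ;; Delete from the first all-dashes line (the list footer)
      ;; to the end of the buffer.
      (re-search-forward "^\\(-\\)+$")
      (beginning-of-line)
      (delete-region (point) (point-max)))
    (mml-secure-message-sign-pgpmime)))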

You can tell it's scripting, really: it's an odd mixture of plausible and dubious ways of getting things done: regular expressions to extract the original sender of the report, and to remove the forwarded message information (and the dull advert inserted in the footer by SourceForge's mailing list system). On the other hand, that function, coupled with something along the lines of


(setq gnus-parameters
      '(("nnml\\+private:list.sbcl-bugs"
         (gnus-summary-prepared-hook
          '(lambda ()
             (local-set-key (kbd "C-c C-f")
                            'sbcl-bugs-mail-forward)
             (local-set-key (kbd "S o m")
                            'sbcl-bugs-mail-forward))))))

gives me exactly what I think I want: a simple way of creating, tagging and classifying an entry in the bug tracker from a mail report.
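To see what the subject-line regular expression in sbcl-bugs-mail-forward picks out, it can be applied to a plain string. The sample subject below is invented for illustration, and assumes that the forwarded subject has the shape "[sender] [list tag] subject"; my/split-forwarded-subject is a hypothetical helper, not part of the function above.

```elisp
(defun my/split-forwarded-subject (line)
  "Return (SENDER . SUBJECT) parsed from LINE, or nil if it does not match."
  ;; Same regular expression as in `sbcl-bugs-mail-forward'.
  (when (string-match "\\[\\(.*\\)\\].*\\[\\(.*\\)\\] \\(.*\\)$" line)
    (cons (match-string 1 line) (match-string 3 line))))

(my/split-forwarded-subject
 "[A. Reporter] [Sbcl-bugs] compiler note is wrong")
;; => ("A. Reporter" . "compiler note is wrong")
```

Because .* is greedy, a subject that itself contains a bracketed tag (say "[x86] crash") would confuse the split, which is one of the more dubious aspects of doing this with regular expressions.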

I couldn't find any convenient emacs/launchpad interfaces (or any at all, in fact); I'm not sure this counts as one either, but by all means use, adapt and improve on the above for your purposes – I'll happily take criticism of and improvements to this hack.

As mentioned earlier: the 2010 European Lisp Symposium invites your contributions. Unfortunately, the website for the 2010 event is not set up yet; you can get an impression of what the event is like by looking at last year's website, which in the fullness of time (soon, I hope) will be updated with ELS2010 information. In the meantime, here's the Call for Contributions: we would welcome both papers describing original work, not published elsewhere, and submissions for tutorial sessions. Submission will be through EasyChair's conference management system.

3rd European Lisp Symposium

May 6-7, 2010, Fundação Calouste Gulbenkian, Lisbon, Portugal

Important Dates

  • Submission Deadline: January 29, 2010
  • Author Notification: March 1, 2010
  • Final Paper Due: March 26, 2010
  • Symposium: May 6-7, 2010

Authors of accepted research contributions will be invited to submit an extended version of their papers to a special issue of the Journal of Universal Computer Science (J.UCS).

Scope

The purpose of the European Lisp Symposium is to provide a forum for the discussion and dissemination of all aspects of design, implementation and application of any of the Lisp dialects. We encourage everyone interested in Lisp to participate.

The European Lisp Symposium 2010 invites high quality papers about novel research results, insights and lessons learned from practical applications, and educational perspectives, all involving Lisp dialects, including Common Lisp, Scheme, Emacs Lisp, AutoLisp, ISLISP, Dylan, Clojure, and so on.

Topics include, but are not limited to:

  • Language design and implementation
  • Language integration, interoperation and deployment
  • Development methodologies, support and environments
  • Reflection, protocols and meta-level architectures
  • Lisp in Education
  • Parallel, distributed and scientific computing
  • Large and ultra-large-scale systems
  • Hardware, virtual machine and embedded applications
  • Domain-oriented programming
  • Lisp pearls
  • Experience reports and case studies

We invite submissions in two categories: original contributions and tutorials.

  • Original contributions should neither have been published previously nor be under review in any other refereed events or publication. Research papers should describe work that advances the current state of the art, or presents old results from a new perspective. Experience papers should be of broad interest and should describe insights gained from substantive practical applications. The programme committee will evaluate each contributed paper based on its relevance, significance, clarity, and originality.

  • Tutorial submissions should be extended abstracts of up to four pages for in-depth presentations about topics of special interest for at least 90 minutes and up to 180 minutes. The programme committee will evaluate tutorial proposals based on the likely interest in the topic matter, the clarity of the presentation in the extended abstract, and the scope for interactive participation.

The tutorials will run during the symposium on May 6, 2010.

Programme Chair

Christophe Rhodes, Goldsmiths, University of London, UK

Local Chair

António Leitão, Technical University of Lisbon, Portugal

Programme Committee

  • Marco Antoniotti, Università Milano Bicocca, Italy
  • Giuseppe Attardi, Università di Pisa, Italy
  • Pascal Costanza, Vrije Universiteit Brussel, Belgium
  • Irène Anne Durand, Université Bordeaux I, France
  • Marc Feeley, Université de Montréal, Canada
  • Ron Garret, Amalgamated Widgets Unlimited, USA
  • Gregor Kiczales, University of British Columbia, Canada
  • Nick Levine, Ravenbrook Ltd, UK
  • Scott McKay, ITA Software, Inc., USA
  • Peter Norvig, Google Inc., USA
  • Kent Pitman, PTC, USA
  • Christian Queinnec, Université Pierre et Marie Curie, France
  • Robert Strandh, Université Bordeaux I, France
  • Didier Verna, EPITA Research and Development Laboratory, France
  • Barry Wilkes, Citi, UK
  • Taiichi Yuasa, Kyoto University, Japan

A while ago, I attended the 2009 European Lisp Symposium. António Leitão, the Programme Chair for that event, is now preparing a special issue of the Journal of Universal Computer Science on Lisp: Research and Experience, and while authors of ELS papers are invited to submit substantially extended versions of their conference papers, so too are new contributions being sought: the Call for Papers explicitly welcomes original contributions not submitted elsewhere.

The deadline for submissions is not far away: 19th October 2009, so get scribbling! (And should the worst happen, and inspiration not strike in time, don’t despair: the call for contributions to the 2010 European Lisp Symposium will be published shortly.)

In my previous diary entry, I was all gung-ho and optimistic about actually improving SBCL's support for Unicode; “not all of this is implemented yet”, I said. Shortly after beginning to implement the UTF-16 external format, I discovered that the compiler, on x86 only, was miscompiling memory accesses where the offset was of the form (+ variable constant). Because of the release cycle, I then felt that I had to address that (and certain other miscompilation issues that people noticed), with the result that the major achievement in SBCL's Unicode support in the hot-off-the-press 1.0.31 release is that

the EBCDIC-US external-format is now supported for octet operations (as well as for stream operations).
Lucky EBCDIC users.

I was going to blog about the fact that using org-mode, referred to in my previous diary entry, made the important but non-urgent tasks more visible. I was going to use sorting out backups for interesting data (say, my e-mail archives) as an example, and discuss the solution I came up with, but I have just realised that the trust model is exactly backwards (my server trusts root on the backup machine). This is annoying, because I thought I'd got it right, and because getting it right would have been equally easy.

Oh well. So, instead, I'll return sorting out backups to the TODO (or maybe STARTED) state, and (prompted by some recent discussion on #lisp IRC) I'll blog about SBCL's interpretation of Unicode characters, with the up-front caveat that I'm Not An Expert in all this: languages, glyphs, graphemes, characters and all that jazz.

Common Lisp's string type is defined to be a vector specialized to hold only characters or a subtype thereof. This definition is already hard to wrap your head around, and has amusing consequences documented here in the past, but I don't want to get into it too much; merely to say that already this definition restricts to a fairly large extent the possible implementation strategies for supporting Unicode.

Why so? Because in Unicode there are several notions of ‘character’, and we have to decide which of them we're going to use as our Lisp character type (and use as string constituents). The simple answer from the implementation point of view (and the route that SBCL currently takes) is to define a Lisp character as an entity corresponding directly to a Unicode code point. This is simple and straightforward to implement, but unfortunately has the side effect of making various Common Lisp string functions less useful to the user.

How so? Well, consider the string comparison functions, such as string=. As specified, string= compares two strings, character by character. In SBCL, then, this compares two sequences of Unicode code points, character by character, for equality. The problem is that this operation doesn't in general have the semantics of ‘string equality’, because in Unicode there is more than one way to encode the same abstract character: for example, the ‘abstract character’, or possibly ‘grapheme’, e-acute (which is usually displayed ‘é’) can be represented either as the single code point U+00E9, or as the combining character sequence U+0065 U+0301.
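The effect is easy to reproduce in any Lisp whose characters are code points. The sketch below uses Emacs Lisp rather than Common Lisp, purely to match the other snippets in this blog; its string= also compares code point by code point. The normalization step assumes the ucs-normalize library that ships with GNU Emacs 23 and later, and says nothing about what SBCL provides.

```elisp
(require 'ucs-normalize)

(let ((precomposed (string #xE9))        ; U+00E9
      (decomposed  (string ?e #x301)))   ; U+0065 U+0301
  (list (length precomposed)
        (length decomposed)
        ;; Different code point sequences, so not string=.
        (string= precomposed decomposed)
        ;; After NFC normalization the two spellings agree.
        (string= precomposed
                 (ucs-normalize-NFC-string decomposed))))
;; => (1 2 nil t)
```

Both strings display as ‘é’, but only an explicit normalization pass makes them compare equal.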

So, that's OK; the Unicode FAQ on Combining Marks says that characters and combining character sequences are different, and even implies that programmers should be dealing with Unicode code points (SBCL characters). Unfortunately, Lisp has been around for longer than Unicode, and code has been written essentially assuming that string= performs a language-string equality comparison rather than a codepoint-by-codepoint equality comparison, simply because (pre-Unicode) these two concepts were conflated.

What about the alternative? We could try defining Lisp characters to be abstract characters, represented as combining character sequences. One problem with this idea is that there's the char-code function to implement: for every Lisp character there must be a corresponding unique integer. That's not so much a problem – Lisp has bignums after all – but it will make char-code-limit surprisingly large (in principle, I think every combining mark could be applied to a given base character). This means that we'd lose the ability to represent an arbitrary character as an immediate object, meaning that accessing characters from strings would in general cause heap allocation, and lead to surprises elsewhere in the system.

So, given that we stay with a Lisp character corresponding to a Unicode code point, what other pitfalls and details are there to consider? The memory representation of strings of type (simple-array character (*)) is worth mentioning; because there's a fairly strong cultural expectation of O(1) access time in vectors, we don't do any compression, but simply store each Unicode codepoint in a 32-bit cell. SBCL has a separate base-string representation, where each ASCII codepoint is stored in an 8-bit cell; a long time ago I gave a talk about this.

Also, interpretation of the contents of strings has caused confusion recently. Granted that a string is a vector of (effectively) code points, what does that mean for strings containing surrogate characters (code points in the range U+D800–U+DFFF)? These code points do not correspond to any abstract characters directly; instead, pairs of surrogates are (in certain Unicode encodings, such as UTF-16) interpreted as characters beyond the Basic Multilingual Plane. Some Lisp implementations (such as Clozure Common Lisp, formerly OpenMCL) go so far to resolve this ambiguity as to forbid the creation of a Lisp character with a surrogate codepoint. In SBCL, however, we take the view that those characters exist, but should not be interpreted in any way; a string containing surrogate pairs should be considered to have individual surrogate characters in it, and no attempt should be made to combine them. If there is data in an encoding which uses surrogate pairs (such as UTF-16), then that data should be read in using the :utf-16 external format, so that no surrogates are present at the Lisp level; an attempt to write out a surrogate Lisp character in a Unicode encoding should generate an error. (NB: not all of this is implemented yet).
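For reference, the arithmetic that a UTF-16 decoder performs when it folds a surrogate pair into a single code point beyond the Basic Multilingual Plane can be sketched as follows. This is plain integer arithmetic taken from the definition of UTF-16, written in Emacs Lisp to match the other snippets here; it is not SBCL's decoder, and the function name is hypothetical.

```elisp
(defun my/surrogates-to-code-point (high low)
  "Combine the surrogate code points HIGH and LOW into one code point.
HIGH must be in U+D800..U+DBFF and LOW in U+DC00..U+DFFF."
  (unless (and (<= #xD800 high) (<= high #xDBFF)
               (<= #xDC00 low) (<= low #xDFFF))
    (error "Not a surrogate pair: #x%X #x%X" high low))
  (+ #x10000
     (ash (- high #xD800) 10)   ; top ten bits of the offset
     (- low #xDC00)))           ; bottom ten bits of the offset

(format "U+%X" (my/surrogates-to-code-point #xD83D #xDE00))
;; => "U+1F600"
```

Doing this combination once, at the external-format boundary, is what keeps surrogates out of the strings seen at the Lisp level.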

All of this merely scratches the surface of Unicode support; I'm hoping to find time to implement better support for finding properties of the Unicode Character Database, and to implement Unicode algorithms for normalization, collation and so on; I'm also planning to tighten up support for the Unicode encodings (to address the potential security issues that exist from nonconforming decoders) and generally to improve support for doing useful things with non-ASCII. As usual, there's likely to be a significant lag between planning and doing...

Although I've used Emacs for well over 10 years, and I've been a pretty serious Common Lisp programmer, I don't consider myself an Emacs power user by a long shot: I don't think I'm at all skilled in its use; I don't customize it very much; and I know almost none of the interesting bits of Emacs Lisp. The only applications I really use within Emacs are Gnus, the News and Mail reader (which I adopted after one too many "please find attached" errors) and, more recently, SLIME (but since much of my Common Lisp work is at the implementation level, the luxury of using SLIME for development is often unavailable).

Recently, though, I'm beginning to suspect that I might get more practice. Firstly, the additional responsibilities of a permanent job in UK academia (essentially boiling down to administration, lots of it) meant that my old strategies for managing my TODO list (scraps of paper, followed by writing in different colours on my office wall, followed when I ran out of wall space by a plain-text TODO.txt file) were no longer sufficient to manage the complexity, and I discovered org-mode.

My use of org-mode has, roughly, removed the feeling that I'm drowning in a morass of multiple different things to do with no idea of what's important or urgent along with a feeling that I've forgotten the really important things, and replaced it by a vague feeling of guilt that I'm not doing all the TODO, STARTED, WAITING items that show up in bold red in my *Org Agenda* buffer (and probably won't, ever). I haven't yet found a good way of integrating notes that I take on my often-disconnected laptop with the master org files kept on my server, but vague guilt is noticeably preferable to drowning, so this is a net win.

Also a net win for me is the new daemon functionality in GNU Emacs 23. My setup for mail is perhaps a little eccentric; I read all my mail, including work mail (over imap), from a machine at home which I might call a "server"; I used to log in over ssh, start emacs -f gnus, and go on from there. With the new daemonized emacs, I only need to start emacs once, and can connect to it later, whether that's from my office workstation, or my laptop at some arbitrary location (assuming that I have Internet access, anyway), or even someone else's Mobile Internet Device.

I found one problem with this new modus operandi, though; once I'd logged out of the shell session that started the emacs daemon, accessing remote files with tramp (over ssh) became more painful, as the ssh credential forwarding provided by the ssh-agent responsible for that session was no longer available. The solution I found was to run an ssh-agent instance within the emacs daemon at startup; I have lightly adapted some code by Will Glozer; with my version, a simple


(when (daemonp)
  (ssh-agent "ssh-agent -d"))
in my ~/.emacs does the job very nicely.
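The general shape of such a helper can be sketched as below. This is neither Will Glozer's code nor the adapted version mentioned above, just an illustration under stated assumptions: the my/ names are hypothetical, and the sample text mimics the "NAME=value; export NAME;" lines that ssh-agent prints on startup. A process filter attached to the agent process could hand its output to the second function.

```elisp
(defun my/ssh-agent-parse-env (output)
  "Return an alist of (NAME . VALUE) for the SSH_* settings in OUTPUT."
  (let ((case-fold-search nil)          ; variable names are upper case
        (start 0)
        result)
    (while (string-match "\\(SSH_[A-Z_]+\\)=\\([^;\n]+\\);" output start)
      (push (cons (match-string 1 output) (match-string 2 output))
            result)
      (setq start (match-end 0)))
    (nreverse result)))

(defun my/ssh-agent-export-env (output)
  "Copy the SSH_* settings found in OUTPUT into Emacs's environment."
  (dolist (pair (my/ssh-agent-parse-env output))
    (setenv (car pair) (cdr pair))))

(my/ssh-agent-parse-env
 (concat "SSH_AUTH_SOCK=/tmp/ssh-example/agent.1234; "
         "export SSH_AUTH_SOCK;\necho Agent pid 1234;"))
;; => (("SSH_AUTH_SOCK" . "/tmp/ssh-example/agent.1234"))
```

Once SSH_AUTH_SOCK is set in the daemon's environment, ssh processes started by tramp inherit it and can reach the agent.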
