24 Jan 2010 (updated 25 Jan 2010 at 08:46 UTC)
I hope that this lineup provides additional motivation for people to complete their submissions!
I shouldn’t complain, really: I have a decent and stable job, which is mostly fun; I have a certain amount of freedom in what I do, as long as everything that has to get done gets done; I work with all sorts of interesting people, both formally and informally. But things that I want to do have to live a long way down the priority queue; preparing lecture materials, paper drafts, committee agendas, bursary agreements, course proposals, courseworks, exams, student feedback, paper redrafts, reports, meeting notes, grant proposal drafts, paper reviews, examiners’ reports, reading lists, grant proposal redrafts, and the like all seem to take priority over even the research on a funded project that I am part of, let alone the discretionary research that I might actually want to do.
So sometimes I have to be sneaky, and combine my hacking with teaching-related work instead. One of the more fun things I’ve learnt over the last couple of years is enough colour theory to be dangerous; it started off because I was casting around for ideas on what to teach students on our Creative Computing programme – and I do teach them about colour, among other things – but it’s sufficiently interesting as a technical area in itself that I can see writing code to illustrate aspects of it. So, here’s a (not very good) colour picker “application” for McCLIM, whose only redeeming feature is that it uses knowledge of the colour attributes of consumer-grade display hardware to present colours of the same intensity together. That’s a bit hard to visualize, so here’s a screenshot, where all the colours in the triangle should seem to have about the same brightness (viewers might need to adjust their viewing angle):
Source code is here; I’m not particularly proud of it, and it needs work in all sorts of directions (optimizing, generalizing, cleaning up). One of the reasons I had put off blogging about this is that I was hoping for a lovely literate-programming system optimized for single-file Lisp programs to appear, generating HTML and PDF output from minimally-marked-up Lisp code. Sadly, that hasn't happened, and my best attempt can only be described as, well, deranged... so no impeccably formatted and indexed code snippets in this blog, not this time anyway.
Did I say “heaven forfend”? Now my attention must properly turn to the 2010 European Lisp Symposium; there is now a website, and the Call for Papers was sent to a wide variety of Lisp-related venues, so hopefully everyone knows about it now. Cunningly, the Call for Papers failed to include any guidance on a page count for submitted papers; 15 pages in the J.UCS style is the limit – but please submit through EasyChair, not to J.UCS!
However, this doesn't solve the entire issue, which is painless and seamless interaction. Comments to bugs are delivered by mail, and filtered using the X-Launchpad-Bug header to an appropriate mail directory, but replies to those comments need to be cryptographically signed for those replies to be accepted by launchpad. How to do that? Initially, I hoped that there would be some group parameter or posting style which would automatically insert the mml code for signing; suspicion alighted on `gnus-message-replysign', but unfortunately the messages that launchpad sends aren't signed, even if the ones that are sent to it must be.
It would also seem that there isn't an appropriate hook for this; all the hooks I could find seem to be run too early, and attempts to call `mml-secure-message-sign-pgpmime' gave me errors about a corrupt mail buffer (because the body separator hadn't been set up yet). So, instead, I ended up piggy-backing on the code handling gnus-message-replysign anyway, by advising the relevant function as follows:
(defadvice gnus-summary-handle-replysign
    (after handle-launchpad-replysign activate)
  (when (string-match "list.*-launchpad" gnus-newsgroup-name)
    (mml-unsecure-message)
    (mml-secure-message-sign-pgpmime)))
The alert will note that this automatically signs replies not just to messages in my sbcl-launchpad group, but to messages in any of my list groups matching -launchpad. Is this just speculative generality, I hear you ask? No, because Alastair Bridgewater has kindly volunteered to participate in CLX development and release engineering, and his first and second acts were to set up a mailing list (hopefully permanent, this time, after clozure and metacircles abandonment) and a launchpad bugtracker. So if you've been building up scads of patches and annoyances with my clx branch (or even worse, the ancient 0.7.3 release), now might be a good time to attempt to report the annoyances and integrate the patches; particularly from those still-active projects with heavy CLX use (e.g. StumpWM, Eclipse (no not that one) and McCLIM).
Today I got round to giving SBCL's website a little bit of an update; not only does it reflect the most recent release – something we've got a little bit lax on recently – but also I have updated my gpg key information. Since Launchpad seems to have taken over our bugtracking needs (at least I'm reasonably content with it), I've also taken the opportunity to link the bug numbers in the news page to the relevant bug entries in launchpad. This is still very Web 1.0, I recognize; I was at a networking event today where the highly eminent keynote speaker spent about 20 minutes essentially saying (and I paraphrase mercilessly) "social web, linked data, this is the FUTURE, it has reached a TIPPING POINT, we are on the road to DATA MASHUP WEB 3.0" and I feel relatively content to stick with my curmudgeonly Web 1.0 view of the world for now.
So, Launchpad. Initially, I was dismayed, because there doesn't seem to be a way of getting notifications of bug updates over RSS, which would be a second-best to getting updates by e-mail. I managed to ignore all SBCL bug reports for a while, but eventually I bit the bullet and signed up (having refused to do so a good long while ago, when shortly after I used Ubuntu's bugzilla to report a bug they closed the bugzilla in favour of launchpad without managing to transfer accounts across.)
A large motivation for signing up was the discovery that launchpad does, in fact, have an email interface to the bug tracker; as long as you can emit GPG-signed mail (which I can), it seems to have all the required functionality for doing things without needing to go near a web browser; I can now receive bug reports and reply to them, and in at least some cases the References: headers in the mail I receive allow my client to thread the discussion properly (I haven't really stress-tested this yet, but it works at least well enough for now.)
If that were all, this would not be news (and not even worthy of a blog post). But now I get to demonstrate my Emacs lisp “scripting” ability, in much the same way as Dan Barlow did for me many years ago: SBCL has a mailing list for reporting bugs, for people who are unsure as to whether their problem is a bug or not, or for people who don't want to go to the trouble to get a launchpad account just to report a bug. When such a report does describe a new bug that we should be tracking, that report needs to make its way to launchpad.
Without too much further ado, I present sbcl-bugs-mail-forward, which constructs a message (almost) ready to be sent:
(defun sbcl-bugs-mail-forward ()
  (interactive)
  (let ((message-forward-ignored-headers "")
        from subject)
    (gnus-summary-mail-forward 4)
    (message-goto-to)
    (insert "new@bugs.launchpad.net")
    (message-goto-subject)
    (message-beginning-of-line)
    (re-search-forward
     "\\[\\(.*\\)\\].*\\[\\(.*\\)\\] \\(.*\\)$")
    (setq from (match-string 1) subject (match-string 3))
    (message-beginning-of-line)
    (let ((kill-whole-line nil))
      (kill-line))
    (insert subject)
    (message-goto-body)
    (insert "Report from " from "\n\n")
    (insert " affects sbcl\n status confirmed\n importance ")
    (save-excursion
      (insert "\n tag \n done\n\n")
      (message-goto-body)
      (re-search-forward
       "^\\(-\\)+ Start of forwarded message \\(-\\)+$")
      (beginning-of-line)
      (let ((kill-whole-line t))
        (kill-line))
      (re-search-forward "^\\(-\\)+$")
      (beginning-of-line)
      (end-of-buffer)
      (kill-region (mark) (point)))
    (mml-secure-message-sign-pgpmime)))
You can tell it's scripting, really: it's an odd mixture of plausible and dubious ways of getting things done: regular expressions to extract the original sender of the report, and to remove the forwarded message information (and the dull advert inserted in the footer by SourceForge's mailing list system). On the other hand, that function, coupled with something along the lines of
(setq gnus-parameters
      '(("nnml\\+private:list.sbcl-bugs"
         (gnus-summary-prepared-hook
          '(lambda ()
             (local-set-key (kbd "C-c C-f")
                            'sbcl-bugs-mail-forward)
             (local-set-key (kbd "S o m")
                            'sbcl-bugs-mail-forward))))))
gives me exactly what I think I want: a simple way of creating, tagging and classifying an entry in the bug tracker from a mail report.
I couldn't find any convenient emacs/launchpad interfaces (or any at all, in fact); I'm not sure this counts as one either, but by all means use, adapt and improve on the above for your purposes – I'll happily take criticism of and improvements to this hack.
May 6-7, 2010, Fundação Calouste Gulbenkian, Lisbon, Portugal
Authors of accepted research contributions will be invited to submit an extended version of their papers to a special issue of the Journal of Universal Computer Science (J.UCS).
The purpose of the European Lisp Symposium is to provide a forum for the discussion and dissemination of all aspects of design, implementation and application of any of the Lisp dialects. We encourage everyone interested in Lisp to participate.
The European Lisp Symposium 2010 invites high quality papers about novel research results, insights and lessons learned from practical applications, and educational perspectives, all involving Lisp dialects, including Common Lisp, Scheme, Emacs Lisp, AutoLisp, ISLISP, Dylan, Clojure, and so on.
Topics include, but are not limited to:
We invite submissions in two categories: original contributions and tutorials.
The tutorials will run during the symposium on May 6, 2010.
Christophe Rhodes, Goldsmiths, University of London, UK
António Leitão, Technical University of Lisbon, Portugal
The deadline for submissions is not far away: 19th October 2009, so get scribbling! (And should the worst happen, and inspiration not strike in time, don’t despair: the call for contributions to the 2010 European Lisp Symposium will be published shortly.)
the EBCDIC-US external-format is now supported for octet operations (as well as for stream operations). Lucky EBCDIC users.
Oh well. So, instead, I'll return sorting out backups to the TODO (or maybe STARTED) state, and (prompted by some recent discussion on #lisp IRC) I'll blog about SBCL's interpretation of Unicode characters, with the up-front caveat that I'm Not An Expert in languages, glyphs, graphemes, characters and all that jazz.
Common Lisp's string type is defined to be a vector specialized to hold only characters or a subtype thereof. This definition is already hard to wrap your head around, and has amusing consequences documented here in the past, but I don't want to get into it too much; merely to say that already this definition restricts to a fairly large extent the possible implementation strategies for supporting Unicode.
Why so? Because in Unicode there are several notions of ‘character’, and we have to decide which of them we're going to use as our Lisp character type (and use as string constituents). The simple answer from the implementation point of view (and the route that SBCL currently takes) is to define a Lisp character as an entity corresponding directly to a Unicode code point. This is simple and straightforward to implement, but unfortunately has the side effect of making various Common Lisp string functions less useful to the user.
How so? Well, consider the string comparison functions, such as string=. As specified, string= compares two strings, character by character. In SBCL, then, this compares two sequences of Unicode code points, character by character, for equality. The problem is that this operation doesn't in general have the semantics of ‘string equality’, because in Unicode there is more than one way to encode the same abstract character: for example, the ‘abstract character’, or possibly ‘grapheme’, e-acute (which is usually displayed ‘é’) can be represented either as the single code point U+00E9, or as the combining character sequence U+0065 U+0301.
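A minimal sketch of that mismatch, assuming a Lisp whose characters are Unicode code points (such as a Unicode-enabled SBCL); the variable names are made up for illustration:

```lisp
;; Two spellings of e-acute: the precomposed code point U+00E9, and
;; "e" (U+0065) followed by the combining acute accent (U+0301).
;; Assumes CODE-CHAR accepts these codes, as it does when Lisp
;; characters correspond to Unicode code points.
(defparameter *precomposed* (string (code-char #x00E9)))
(defparameter *decomposed*
  (coerce (list (code-char #x0065) (code-char #x0301)) 'string))

(length *precomposed*)                ; one code point
(length *decomposed*)                 ; two code points
(string= *precomposed* *decomposed*)  ; false: compared code point by code point
```

Both strings would normally be displayed identically, yet string= sees sequences of different lengths and reports them as unequal.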
So, that's OK; the Unicode FAQ on Combining Marks says that characters and combining character sequences are different, and even implies that programmers should be dealing with Unicode code points (SBCL characters). Unfortunately, Lisp has been around for longer than Unicode, and code has been written essentially assuming that string= performs a language-string equality comparison rather than a codepoint-by-codepoint equality comparison, simply because (pre-Unicode) these two concepts were conflated.
What about the alternative? We could try defining Lisp characters to be abstract characters, represented as combining character sequences. One problem with this idea is that there's the char-code function to implement: for every Lisp character there must be a corresponding unique integer. That's not so much a problem – Lisp has bignums after all – but it will make char-code-limit surprisingly large (in principle, I think every combining mark could be applied to a given base character). This means that we'd lose the ability to represent an arbitrary character as an immediate object, meaning that accessing characters from strings would in general cause heap allocation, and lead to surprises elsewhere in the system.
So, given that we stay with a Lisp character corresponding to a Unicode code point, what other pitfalls and details are there to consider? The memory representation of strings of type (simple-array character (*)) is worth mentioning; because there's a fairly strong cultural expectation of O(1) access time in vectors, we don't do any compression, but simply store each Unicode codepoint in a 32-bit cell. SBCL has a separate base-string representation, where each ASCII codepoint is stored in an 8-bit cell; a long time ago I gave a talk about this.
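The two representations can be poked at from the REPL. This is only a sketch with illustrative variable names; which element types are actually distinct depends on the implementation and how it was built:

```lisp
;; A full character string and a base string with the same contents.
(defparameter *wide*
  (make-string 3 :element-type 'character :initial-element #\a))
(defparameter *narrow*
  (make-string 3 :element-type 'base-char :initial-element #\a))

(array-element-type *wide*)    ; CHARACTER
(array-element-type *narrow*)  ; BASE-CHAR
(typep *narrow* 'base-string)  ; true
(typep *wide* 'base-string)    ; false where BASE-CHAR is a proper subtype of CHARACTER
(string= *wide* *narrow*)      ; true: same characters, different storage
```

Since string= looks only at the characters, code that does not care about storage can mix the two kinds of string freely.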
Also, interpretation of the contents of strings has caused confusion recently. Granted that a string is a vector of (effectively) code points, what does that mean for strings containing surrogate characters (code points in the range U+D800–U+DFFF)? These code points do not correspond to any abstract characters directly; instead, pairs of surrogates are (in certain Unicode encodings, such as UTF-16) interpreted as characters beyond the Basic Multilingual Plane. Some Lisp implementations (such as Clozure Common Lisp, formerly OpenMCL) go so far to resolve this ambiguity as to forbid the creation of a Lisp character with a surrogate codepoint. In SBCL, however, we take the view that those characters exist, but should not be interpreted in any way; a string containing surrogate pairs should be considered to have individual surrogate characters in it, and no attempt should be made to combine them. If there is data in an encoding which uses surrogate pairs (such as UTF-16), then that data should be read in using the :utf-16 external format, so that no surrogates are present at the Lisp level; an attempt to write out a surrogate Lisp character in a Unicode encoding should generate an error. (NB: not all of this is implemented yet).
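For concreteness, here is a rough sketch of the arithmetic a UTF-16 decoder performs so that surrogates never reach the Lisp level. The function names are hypothetical, invented for this illustration, and are not part of SBCL:

```lisp
(defun surrogate-code-p (code)
  "True if CODE lies in the surrogate range U+D800 to U+DFFF."
  (<= #xD800 code #xDFFF))

(defun combine-surrogates (high low)
  "Code point encoded by the UTF-16 surrogate pair HIGH, LOW."
  (+ #x10000 (ash (- high #xD800) 10) (- low #xDC00)))

(defun contains-surrogate-p (string)
  "True if any character of STRING has a surrogate code."
  (some (lambda (c) (surrogate-code-p (char-code c))) string))

;; The UTF-16 pair D834 DD1E decodes to the single code point U+1D11E.
(combine-surrogates #xD834 #xDD1E)  ; => #x1D11E

;; (code-char #xD800) yields a character under the view described above;
;; an implementation that forbids surrogate characters may return NIL.
```

Under the policy described above, this combination belongs in the external format and not in string functions: a Lisp string holding U+D834 followed by U+DD1E stays a two-character string.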
All of this merely scratches the surface of Unicode support; I'm hoping to find time to implement better support for finding properties of the Unicode Character Database, and to implement Unicode algorithms for normalization, collation and so on; I'm also planning to tighten up support for the Unicode encodings (to address the potential security issues that exist from nonconforming decoders) and generally to improve support for doing useful things with non-ASCII. As usual, there's likely to be a significant lag between planning and doing...