Older blog entries for joolean (starting at number 68)

r6rs-protobuf

In the wake of the glorious Guile 2.0 release, I've been working on a prototype re-implementation of GIMP's Script-Fu plugin system that uses Guile in lieu of their embedded TinyScheme. I actually got reasonably far, but then the onboard video on my beloved ZaReason UltraLap SR straight up died, so I had to sideline that project while I waited for a replacement (in the form of a new ZaReason Terra HD) to arrive.

Although I'd temporarily lost access to my GIMP patches, I did have a comfortably obsolete desktop machine available to satisfy my constant, ravening need to write programs all the time. I decided that doing an R6RS Scheme implementation of Google Protocol Buffers would make a cool, discrete project: I use and enjoy protobufs at my day job, and I had a hunch that their requirements would map nicely onto the features provided by R6RS records and enumerations. It turns out I was right about that, although I've now sunk a bit more time than I was expecting into the project. But I did just manage to get an initial import of some working libraries uploaded to Google Code. Check it out here.
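To give a rough idea of the mapping, here is a minimal sketch of how a protobuf message and enum could be modeled with R6RS records and enumerations. It is not the code from r6rs-protobuf: the `person` message, its fields, and the `phone-type` names are all made up for illustration.

```scheme
(import (rnrs))

;; Hypothetical message, roughly:
;;   message Person { required string name = 1; optional int32 id = 2; }
;; Each protobuf field becomes a record field with its own accessor.
(define-record-type person
  (fields (immutable name person-name)
          (mutable id person-id set-person-id!)))

;; Hypothetical enum, roughly:
;;   enum PhoneType { MOBILE = 0; HOME = 1; WORK = 2; }
;; An R6RS enumeration keeps the symbols in declaration order.
(define phone-type-universe (make-enumeration '(mobile home work)))

;; The indexer maps a symbol to its position, or #f if it isn't a member.
(define phone-type-index (enum-set-indexer phone-type-universe))

(define p (make-person "Alice" 42))
```

With these definitions, `(person-name p)` returns the name, `(set-person-id! p 7)` updates the mutable field, and `(phone-type-index 'home)` gives the symbol's position in the enumeration.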

Guile 2.0

...is finally out! Publicity follows:

And we got a link on the GNU Project home page!

Ludovic Courtès, who actually assembled the release, notes in the release docs that 2.0's been in the works for three years. I could've sworn it was longer than that, maybe because the impetus for the changes in this release has been building for quite some time. I've been working on Guile (and whining about it, and more often than not sitting back and watching other people do brilliant things in it), at various rates of productivity, since I left school eight (!) years ago, and consequently I've been able to watch it as it transformed from a project that was more or less in a holding pattern into one that's rapidly improving and incorporating modern language features without compromising the level of stability it's known for.

Even though I came to it because of its designation as the "official" GNU extension language, I've always liked Guile -- I've found it to be more accessible than other Scheme platforms for both the embedded and interactive use cases -- but I'm aware that some people hold a negative opinion of it. Whatever your past experience with Guile, I think it's changed enough with this release that it's strongly worth another look. The pain points have been addressed head-on, and the good parts have gotten even better. Look at that feature set and tell me you don't start imagining applications.

Many, many thanks to Ludovic and to Andy Wingo, who tackled the hardest problems and accomplished some unbelievable things.

SCSS

I've just released SCSS 0.4.0, which brings with it some pretty major changes. For one, like the recent release of SDOM, it repackages the code as an R6RS library. Not having to chase Scheme platform-compatibility issues is an incredible relief. But I also decided to change the way queries to the cascade work.

In previous versions, you'd specify a node and a property whose value you wanted to obtain for that particular node. The API guaranteed you a non-null response to your query, even if satisfying it meant that additional computation had to be done, like extracting a sub-value from a more general property for use as the value of a more specific property that wasn't set (e.g., figuring out border-top-style from the value of border); or re-using the result of repeating the query for an alternate, related property (e.g., obtaining the value for border-color from color); or just looking up a default value in a reference table. To offset the significant cost of repeat lookups to the cascade, which are not cheap by any means, I'd devised an elaborate system of cascading caches to store values that were discovered but not currently required. This worked reasonably well, but brought with it some frustrating complexity -- for example, those caches had to be flushed based on criteria external to SCSS, like the activation of pseudo-classes and pseudo-elements.

So I've decided to ditch all of that internal complexity at the cost of passing on a fraction of it to the user of the API: I've modified the selection API such that you no longer ask about specific properties, and you get back the complete set of property values that have been assigned to the node. If you need to increase their specificity or look up a default value, there's an auxiliary API for doing those things. It's already had a positive impact in the only client of SCSS that I know of, libRUIN, enabling some more accurate inheritance behavior, and allowing me to scrap a major swath of CSS lookup code that was coupled to assumptions about the way SCSS worked. It would be nice to think that I've reached a point in my development as a software designer that I could afford to make some tentative pronouncements about "best practices." So: APIs that purport to hide, in one shot, all of the complexity that's inherent in a particular source of data are almost certainly not going to be able to make good on that claim. Better to design software that addresses the problem in layers, with each layer solving a very small part.
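The layering can be sketched roughly as follows. This is not SCSS's actual API: the procedure name, the association-list representation, and the two fallback tables are assumptions invented for the example. The first layer is whatever the selection call returns (only the properties actually set on the node); the second is a small auxiliary lookup that falls back to a related property and then to a default.

```scheme
(import (rnrs))

;; Hypothetical result of a selection call for one node:
;; only the properties that were actually assigned.
(define selected '((color . "red") (border . "1px solid")))

;; Hypothetical fallback tables for the auxiliary layer.
(define related  '((border-color . color)))
(define defaults '((border-color . "black") (display . "inline")))

;; Auxiliary lookup: direct value, then related property, then default.
(define (lookup-value props name)
  (let ((direct  (assq name props))
        (alias   (assq name related))
        (default (assq name defaults)))
    (cond (direct (cdr direct))
          ((and alias (lookup-value props (cdr alias))))
          (default (cdr default))
          (else #f))))
```

Here `(lookup-value selected 'border-color)` finds no direct value and falls back to the value of `color`, while `(lookup-value selected 'display)` ends up at the default table.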

Another pronouncement I'd like to make is that there's a lot of benefit to be had by taking advantage of language features for structuring data. I'd been relying on Scheme lists for modeling pretty much everything in the system, from the arguments to the value selection calls to the intermediate result values passed around internally. I've since switched those things over to use R6RS records, which makes reasoning about the input and output of functions a lot easier. There used to be a half dozen inputs to the selection API functions, many of them strongly related; I've grouped them into records such that there are now two arguments, one of which will often be invariant over a single application's use of the API.

Phew!

Guile

2.0 is nearly upon us! The release is scheduled for the second week of February. I'm pretty excited about it, given that it represents several years of sporadic work by me, not to mention years of more consistent efforts by lots of brilliant and talented people. We're trying to figure out how to frame the transformative nature of this release -- Guile's finally become what it set out to be, an actual scripting language platform. Now we have to figure out how to communicate that to people. This is where marketing comes in, I suppose.

SDOM

While waiting for Guile 2.0 to come out, I decided to eat some of my own dog food, so to speak, and take on a serious project that made use of Guile's new support for R6RS. I'd been meaning to revisit SDOM for some time now, in order to address some of the shortcomings I'd noticed through using it over the years -- primarily its adherence to SXML's expression format, which was a performance drain when it came to doing things like looking up attribute values and child and parent nodes; and which required a fair amount of nastiness to manage things like metadata on text nodes. Plus I can't think of any other DOM implementations out there that store data "inline" with an XML representation (unless, I guess, they do parsing as well as DOM manipulation).

So I rewrote the thing as an R6RS library using R6RS records to model the Node interface and its children, and released the result as SDOM 0.5. The process was time-consuming but not hellish, per se, mostly due to the fact that I'd had a fairly extensive test suite in place. And working with records instead of semi-circular lists is just so much neater and easier. I'm still trying to gauge its performance relative to the older version; I'm still doing some complicated things, mostly to optimize the read case for complex properties like "wholeText." And, in theory, the whole thing is now portable to every Scheme that supports R6RS, which is getting to be most of them.
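For a sense of what modeling the Node interface with records looks like, here is a minimal sketch using R6RS record inheritance. The type and field names are illustrative assumptions, not the ones SDOM 0.5 actually uses.

```scheme
(import (rnrs))

;; Base type standing in for the DOM Node interface (names are hypothetical).
(define-record-type node
  (fields (immutable name node-name)
          (mutable parent-node node-parent-node set-node-parent-node!)))

;; An element is a node with an attribute list.
(define-record-type element
  (parent node)
  (fields (mutable attributes element-attributes set-element-attributes!)))

;; A text node is a node carrying character data.
(define-record-type text
  (parent node)
  (fields (mutable data text-data set-text-data!)))

;; The default constructors take the parent type's fields first.
(define div (make-element "div" #f '((id . "main"))))
(define txt (make-text "#text" div "hello"))
```

Looking up a parent or an attribute list is then a direct field access, e.g. `(node-parent-node txt)` or `(element-attributes div)`, and the `node?` predicate accepts instances of both subtypes.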

Except that...

SXML

...doesn't have a "standard" R6RS library packaging. At the moment, there are a couple of distributions that some industrious people have assembled -- I'm thinking of Wak and Xitomatl -- but these both provide SXML as part of (and with dependencies on) a larger framework of code that I'm just not interested in (and I think one or both of them is missing the crucial make-parser macro). My feeling is that an XML parser is such a fundamental library for a language platform that it's gotta be pretty much an atom. I should just be able to say (import (sxml ssax)) and be done with it.

I wonder if guile-lib's version (now incorporated into Guile 2.0) might make a good candidate for a "universal" SXML R6RS library distribution.

Guile

The ever-resourceful wingo just completed a series of patches that enhances the abilities of Guile's syntax expander. Among other wonders, the semantics of the `@@' form have been extended to support arbitrary expressions. To wit:

(@@ (my-module) (my-expr))

...will evaluate -- but, more importantly, expand -- (my-expr) in (my-module)'s environment.

R6RS

I needed Andy's help with the above because, since last I wrote, I'd gotten pretty far with my implementation of the R6RS standard libraries for Guile. In fact, I'd pretty much finished with them -- all twenty-five or so of them (minus (rnrs bytevectors) and (rnrs io ports), which Ludovic Courtès had already taken on), plus unit tests. And I got them (mostly) done in about one grueling month. (Granted, a lot of them are just repackagings of existing Guile functionality, but...)

Maybe this is how things always go. I spent about a year working on R6RS support in Guile: I was on the Acela back to New York City from the 2009 FSF Associate Members meeting when I started playing around with a naive form of the library-to-module transformer that wingo merged to `master' a couple of weeks ago; and spent the next eight months variously getting frustrated, learning `syntax-case', tweaking, and rewriting, before I got the code to the point where I could knock out all those library implementations in one shot.

And now they'll be in Guile 2.0, which is better than I'd hoped, even.

Guile

wingo just patched a pretty interesting issue I discovered in Guile's psyntax implementation. Guile uses a modified version of the expander that's module aware, so that lookups and bindings are done -- and hygiene is maintained -- with respect to module eval closures. Top-level definition forms, for example, such as `define' or `define-syntax', create bindings in whatever Guile thinks the "current" module is.

This works great, but there's a wrinkle added by the fact that the expansion process can change the current module. `define-module', for example, does this, by way of an `eval-when' -- after creating a new module, the module system makes it current, so that if you've got a sequence of expressions like:

(define-module (foo))

(define foo 'foo)
(define bar 'bar)
(define baz 'baz)

...then `foo', `bar', and `baz' will be visible to `(foo)' and not to whatever module was current when you called `define-module'. The issue I found involved a slight modification to this pattern, wrapping everything in a `begin' form:

(begin
  (define-module (foo))
  (define foo 'foo)
  ...)

Semantically, this should be equivalent to the first form -- at the top level, `begin' splices its contents into the surrounding code as if it weren't even there, enabling you to produce multi-expression, uh, expressions, even in contexts where you're only allowed to produce one, like in the syntax transformer I was writing. But what I was seeing was that my definitions weren't creating bindings in the modules I'd placed them "within" -- particularly hard to troubleshoot when it came to syntax definitions, since the expander just allowed unbound custom syntax expressions to pass through to the evaluator, assuming they were procedure applications.

The root cause was that Guile didn't anticipate changes to the current module during the expansion of a single top-level form -- so the first form I described would be fine, since the expander would be re-initialized with the current module for each expression; but the second form would only check the current module once, for the `begin', and not after any of the expressions inside it. After Andy's patch, the part of the expander that handles top-level `begin' forms (`chi-top-sequence') checks the current module after each expression in the sequence to see if it's changed and updates the expansion environment appropriately.

R6RS

All of the above has allowed me to make a fair bit of headway on the R6RS front. A while back I'd gotten started on implementations of the R6RS "standard libraries" and started merging them (along with test suites as appropriate) into Guile's module set (see the wip-r6rs-libraries branch), but I'd had to stop because of the difficulties caused by the issue above (which took me a very long time to isolate and properly articulate). Now I can start again!

Guile

...And another thing! This morning wingo pushed a couple of patches for the 1.9.x series that I'd been trying to get in for a while and which add two features that bring the capabilities of Guile's module system up to the point where a "userland" (i.e., external to Guile's core) implementation of R6RS library support can be written. They are:

  • Support for the version information as part of Guile's "module" form, along with modifications to Guile's library search mechanism to support searching for modules matching an R6RS-compatible version reference
  • The ability to export a binding from a module under a different name

With these features in place, library support is much closer to being ready in time for 2.0.

Guile

I just pushed a patch for Guile that extends the unreleased 2.0 branch's Unicode support to include title case, as described in the Case Mappings section of the Unicode Standard. It's kind of complicated: In the context of characters, it's used with digraph characters (the canonical example being U+01F3 "dz") whose upcased form ("DZ") isn't appropriate for use at the beginning of a word (where "Dz" would be a better fit).

What's interesting is that GNU libunistring, the Unicode library used by Guile, defines the contract for uc_totitle such that the function returns a special title case character if one is defined for the specified character; otherwise, it returns the upcased version of that character. In the context of strings, libunistring's title case mapping puts the first character of each word into title case as above and downcases all the other characters.
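A small sketch of that behavior as seen from Scheme, using the standard `(rnrs unicode)` procedures rather than calling libunistring directly. The variable names are only for illustration.

```scheme
(import (rnrs))

;; U+01F3 is the lowercase "dz" digraph; it has a dedicated title case form.
(define dz (integer->char #x01F3))

(define dz-title (char-titlecase dz))   ; U+01F2, the "Dz" form
(define dz-upper (char-upcase dz))      ; U+01F1, the "DZ" form

;; Most characters have no special title case form, so the result is
;; simply the upcased character.
(define a-title (char-titlecase #\a))   ; #\A

;; String title casing: first character of each word in title case,
;; everything else downcased.
(define words (string-titlecase "hello wORLD"))   ; "Hello World"
```

Note that the fallback result is an ordinary uppercase letter: `(char-general-category a-title)` is `Lu`, whereas `(char-general-category dz-title)` is `Lt`.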

Guile has a set of predicates like char-lower-case?, which, under the hood, check for the presence of a specified character in a particular character set. In the original form of the patch, I had added a char-title-case? predicate which did the same for the title case character set. This led to situations in which

(char-title-case? (char-titlecase x))

would be false. We ended up taking it out.

wingo was in Brooklyn last weekend and we got to have a few pints at Double Windsor. We caught up on a bunch of stuff about the upcoming Guile 2.0 release this December -- the tower of compilers; swapping out Guile's legacy, synchronous garbage collector for libgc (which only just happened a week or so ago); and the possibility of porting Emacs' Lisp innards to Guile (which would really be a coup). I know I've said so already, but it's going to be totally sweet. Our meeting inspired me to get back on the R6RS library horse. So I'm working again on a series of patches to allow the use of version specifiers in Guile modules.

This involves some interesting searching / filtering operations, since the way we've opted to handle the wildcards and range specifications in R6RS's version reference syntax requires that the system choose the "best" match from a number of paths. It's not dissimilar to the search procedure used in CSS selector matching, although, thankfully, it's allowed to be a lot greedier.
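As a rough sketch of the matching side of this, the following implements the version reference rules from the R6RS library syntax (exact sub-versions, `>=`, `<=`, `and`, `or`, `not`) against versions represented as lists of integers. The procedure names are made up, and treating the highest matching version as the "best" one is an assumption for the example, not necessarily the rule Guile applies.

```scheme
(import (rnrs))

;; Does one sub-version reference match the number V?
(define (sub-version-match? ref v)
  (cond ((number? ref) (= ref v))
        ((not (pair? ref)) #f)
        (else
         (case (car ref)
           ((>=) (>= v (cadr ref)))
           ((<=) (<= v (cadr ref)))
           ((and) (for-all (lambda (r) (sub-version-match? r v)) (cdr ref)))
           ((or) (exists (lambda (r) (sub-version-match? r v)) (cdr ref)))
           ((not) (not (sub-version-match? (cadr ref) v)))
           (else #f)))))

;; Does the version reference REF match VERSION, a list of integers?
;; A reference may be shorter than the version it matches.
(define (version-match? ref version)
  (cond ((null? ref) #t)
        ((memq (car ref) '(and or not))
         (case (car ref)
           ((and) (for-all (lambda (r) (version-match? r version)) (cdr ref)))
           ((or) (exists (lambda (r) (version-match? r version)) (cdr ref)))
           (else (not (version-match? (cadr ref) version)))))
        ((null? version) #f)
        (else (and (sub-version-match? (car ref) (car version))
                   (version-match? (cdr ref) (cdr version))))))

;; Lexicographic ordering on versions.
(define (version<? a b)
  (cond ((null? b) #f)
        ((null? a) #t)
        ((< (car a) (car b)) #t)
        ((> (car a) (car b)) #f)
        (else (version<? (cdr a) (cdr b)))))

;; Assumption: "best" means the highest version that matches.
(define (best-match ref candidates)
  (fold-left (lambda (best v)
               (if (and (version-match? ref v)
                        (or (not best) (version<? best v)))
                   v
                   best))
             #f
             candidates))
```

For example, `(best-match '(1 (>= 2)) '((1 1) (1 2) (1 4 3) (2 0)))` skips `(1 1)` and `(2 0)` and settles on `(1 4 3)`.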

Guile

Oh, man -- I think I might be the first Guiler on Advogato to announce this: Guile 1.9.0 is out! The 1.9.x series consists of unstable releases leading up to an eventual 2.0 later this year, and it's packed with enhancements and features that bear evaluation, especially if you've missed them in previous versions of Guile:

  • Guile now sports an actual virtual machine, meaning, among other things:
    • Scheme source can be compiled to bytecode for (much) faster loading and evaluation
    • Guile can finally compile and run code in languages other than Scheme! Initial support for ECMAScript 3.1 is included.
  • Robust multithreading via SRFI-18
  • Syntax-case macros are supported out of the box and maintain hygiene across module boundaries
  • Guile is now Unicode-aware and has i18n support
  • Initial support for R6RS's I/O APIs

Grab the tarball here.
