20 Sep 2010 joolean   » (Journeyer)

SDOM

While waiting for Guile 2.0 to come out, I decided to eat some of my own dog food, so to speak, and take on a serious project that made use of Guile's new support for R6RS. I'd been meaning to revisit SDOM for some time, to address some of the shortcomings I'd noticed through using it over the years. The main one was its adherence to SXML's expression format, which was a performance drain when it came to things like looking up attribute values and child and parent nodes, and which required a fair amount of nastiness to manage things like metadata on text nodes. Plus I can't think of any other DOM implementations out there that store data "inline" with an XML representation (unless, I guess, they do parsing as well as DOM manipulation).
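
A minimal sketch of why the list format gets expensive, not taken from SDOM itself. The helper names `sxml-attr` and `sxml-parent` are hypothetical, and the `@` marker is built with `string->symbol` on the assumption that a strict R6RS reader may reject a bare `@`:

```scheme
(import (rnrs))

;; The SXML attribute-list marker, built at run time (see the note above).
(define at (string->symbol "@"))

;; An SXML element is a plain list: (name (@ (attr value) ...) child ...)
(define doc
  `(book (,at (id "b1") (lang "en"))
     (title "An Example")
     (chapter (,at (n "1")) "Text")))

;; Hypothetical helper: an attribute lookup is a linear scan of the
;; attribute list, if the element has one at all.
(define (sxml-attr elem name)
  (let ((rest (cdr elem)))
    (and (pair? rest)
         (pair? (car rest))
         (eq? (caar rest) at)
         (let ((entry (assq name (cdar rest))))
           (and entry (cadr entry))))))

;; Hypothetical helper: a node holds no pointer to its parent, so
;; finding one means searching down from the root.
(define (sxml-parent root target)
  (let loop ((node root))
    (and (pair? node)
         (if (memq target (cdr node))
             node
             (exists loop (cdr node))))))
```

Here `(sxml-attr doc 'lang)` walks the attribute list to find `"en"`, and `(sxml-parent doc (cadddr doc))` has to traverse the tree from `doc` just to discover that the chapter's parent is `doc`.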

So I rewrote the thing as an R6RS library, using R6RS records to model the Node interface and its children, and released the result as SDOM 0.5. The process was time-consuming but not hellish, mostly because I already had a fairly extensive test suite in place. And working with records instead of semi-circular lists is just so much neater and easier. I'm still trying to gauge its performance relative to the older version; the new code still does some complicated things, mostly to optimize the read case for complex properties like "wholeText." And, in theory, the whole thing is now portable to every Scheme that supports R6RS, which is getting to be most of them.
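
For illustration, a record-based node might look roughly like the sketch below. This is not SDOM 0.5's actual definition: the record types `node`, `element` and `text`, their fields, and `append-child!` are all hypothetical, chosen only to show the general shape.

```scheme
(import (rnrs))

;; Hypothetical base record, loosely modelled on the DOM Node interface.
(define-record-type node
  (fields (immutable name)
          (mutable parent-node)
          (mutable child-nodes)))

;; Hypothetical subtype: an element adds a hashtable of attributes.
(define-record-type element
  (parent node)
  (fields (immutable attributes))
  (protocol
    (lambda (new-node)
      (lambda (name)
        ((new-node name #f '()) (make-eq-hashtable))))))

;; Hypothetical subtype: a text node keeps its data in a field of its
;; own, so further metadata would just be more fields.
(define-record-type text
  (parent node)
  (fields (mutable data))
  (protocol
    (lambda (new-node)
      (lambda (data)
        ((new-node "#text" #f '()) data)))))

;; Link a child into a parent and record the back pointer.
(define (append-child! parent child)
  (node-child-nodes-set! parent
                         (append (node-child-nodes parent) (list child)))
  (node-parent-node-set! child parent)
  child)

(define link (make-element "a"))
(hashtable-set! (element-attributes link) 'href "http://example.org/")
(define label (append-child! link (make-text "home")))
(node-parent-node label)   ; the link record itself, no search required
```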

Except that...

SXML

...doesn't have a "standard" R6RS library packaging. At the moment, there are a couple of distributions that some industrious people have assembled -- I'm thinking of Wak and Xitomatl -- but these both provide SXML as part of (and with dependencies on) a larger framework of code that I'm just not interested in (and I think one or both of them are missing the crucial make-parser macro). My feeling is that an XML parser is such a fundamental library for a language platform that it's gotta be pretty much an atom. I should just be able to say (import (sxml ssax)) and be done with it.

I wonder if guile-lib's version (now incorporated into Guile 2.0) might make a good candidate for a "universal" SXML R6RS library distribution.
