9 Jan 2006
(updated 9 Jan 2006 at 23:09 UTC)
Spent the day writing a parser for LDML (Locale Data Markup Language, the XML format invented by the Unicode people for storing locale data) for one of my clients.
This has been brewing a long time. My collation routines have been using a mixture of ad-hoc file formats and general hackery, and if there's now going to be an authoritative repository of locale data maintained by www.unicode.org I am going to use it.
LDML is certainly not mature yet. It even contains an absurd spelling mistake ('quarternary' for 'quaternary', as the name of one attribute and part of the name of another), and it is going to be hard to parse, but it's a lot better than the alternatives.
I use Expat for XML parsing. I also use Expat in CartoType, and it's one of the nicest little open source components I've come across.
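To give a flavour of what event-driven parsing with Expat looks like, here is a minimal sketch using Python's xml.parsers.expat binding rather than the C library directly. It is not the parser described above: the sample document is a made-up LDML-style collation fragment, and the function name parse_collation_rules and the way rules are collected are assumptions for illustration only.

```python
import xml.parsers.expat

# Made-up LDML-style fragment for illustration; not taken from the real repository.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<ldml>
  <identity><language type="en"/></identity>
  <collations>
    <collation type="standard">
      <rules><reset>a</reset><p>b</p><s>c</s></rules>
    </collation>
  </collations>
</ldml>"""


def parse_collation_rules(data):
    """Return (element name, text) pairs for the direct children of <rules>."""
    rules = []   # collected (name, text) pairs
    stack = []   # names of the currently open elements
    text = []    # character data seen since the last start or end tag

    def start(name, attrs):
        stack.append(name)
        text.clear()

    def chars(chunk):
        # Expat may deliver one text node in several chunks, so accumulate.
        text.append(chunk)

    def end(name):
        # A rule element is one whose parent is <rules>.
        if len(stack) >= 2 and stack[-2] == "rules":
            rules.append((name, "".join(text)))
        stack.pop()
        text.clear()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(data, True)  # True marks this as the final chunk of input
    return rules


if __name__ == "__main__":
    for name, value in parse_collation_rules(SAMPLE):
        print(name, value)
```

The element stack is the usual trick with a callback parser: Expat only reports start tags, end tags and text, so the handlers have to remember where they are in the tree themselves.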
My friend Frank came round this evening, had a look at the latest version of CartoType and said some encouraging things, which put some heart into me. We fiddled around with style sheets for a bit to try to get more street names to appear at 1:15000 scale, which is roughly what you get in a printed street atlas, and compared the result with Nicholson's Greater London Street Atlas to see how we were doing. Not bad, but I think I'll have to start doing more to fit names into impossibly small spaces. CartoType jiggles each name around to find a place where it doesn't overlap anything, then tries abbreviating it and jiggles it some more, and then changes to a condensed font and tries again. That is good, but not good enough: I need to get it to try folding the name onto two lines. That brings in more problems when the name has to be drawn on a curved path, because one of the two lines of text will be on the inside of the curve and will have to turn a sharper corner, which can look ugly.
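The fallback sequence described above (jiggle, abbreviate, condensed font) can be sketched as follows. This is a deliberately simplified one-dimensional model, not CartoType's code: a street is an interval, a label's width is its character count times an assumed per-font width, and every name here (ABBREVIATIONS, CHAR_WIDTH, jiggle, place_label) is hypothetical. Trying the full name again before the abbreviation in the condensed font is also an assumption.

```python
# Hypothetical abbreviation table and per-character widths, for illustration only.
ABBREVIATIONS = {"Street": "St", "Road": "Rd", "Avenue": "Ave"}
CHAR_WIDTH = {"normal": 1.0, "condensed": 0.8}


def abbreviate(name):
    """Replace each word that has a known abbreviation."""
    return " ".join(ABBREVIATIONS.get(word, word) for word in name.split())


def overlaps(start, end, occupied):
    """True if [start, end) intersects any interval already taken by a label."""
    return any(start < o_end and o_start < end for o_start, o_end in occupied)


def jiggle(width, lo, hi, occupied, step=1.0):
    """Try start positions moving outward from the centred one; return one or None."""
    if width > hi - lo:
        return None
    centre = lo + (hi - lo - width) / 2
    offset = 0.0
    while centre - offset >= lo or centre + offset + width <= hi:
        for start in (centre - offset, centre + offset):
            inside = start >= lo and start + width <= hi
            if inside and not overlaps(start, start + width, occupied):
                return start
        offset += step
    return None


def place_label(name, lo, hi, occupied):
    """Return (text, font, start) for the first variant that fits, else None."""
    short = abbreviate(name)
    attempts = [
        (name, "normal"),      # 1. full name, jiggled
        (short, "normal"),     # 2. abbreviated name, jiggled
        (name, "condensed"),   # 3. condensed font, full name
        (short, "condensed"),  # 4. condensed font, abbreviated name
    ]
    for text, font in attempts:
        start = jiggle(len(text) * CHAR_WIDTH[font], lo, hi, occupied)
        if start is not None:
            return text, font, start
    return None


if __name__ == "__main__":
    print(place_label("Acacia Avenue", 0, 12, []))
```

Folding a name onto two lines would be one more entry at the end of the attempts list, but on a curved path the two lines need different radii, which this flat model cannot express.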