Older blog entries for Ankh (starting at number 185)

A couple of years ago (ouch) someone (Rich Salz) had a go at making a version of my text retrieval package which used autoconf. Unfortunately this broke the code, and I started to have another go but, as with pretty much every time I have tried in the past, I didn't have more than a couple of hours to spend on it in any given month, and that's nowhere near enough time to do anything useful with the GNU configure packages.

If you're interested in helping out (and I'll ask Rich this too) or if there's a better alternative these days, please let me know (liam at holoweb dot net; mention the colour of your socks in the Subject for a faster reply :-) )

I've also been working on making the package play safe with UTF-8. It's not far off, although there are problems with the curses-based viewer and front end (I might just abandon the front end but the viewer is maybe useful). I know there are going to be problems with RTL/LTR indicators and snippets, though. Sigh.

Also finding time to work on my pictures from old books, in between a W3C XSL FO Workshop, XQuery and XSLT 2 work and so forth. Plus trying to get the house painted!

Sorry to see Advogato go; it might be over the top to say it saved my life, but, when I first found it, I was working in a very corporate environment where it had taken me six months to get Internet access. So it did save my sanity. Plus I got to meet some really interesting people, both here and in person.

I'm not going to say I have a blog somewhere else. I tried starting an XML blog and there's also an RSS feed for my pictures scanned from old books.

I'd consider hosting Advogato but my Web hosting now for the most part is supported by ad revenue, and I don't think putting ads on Advogato would be appropriate.

House - We need to get the house repainted, but first we have to repair the board and batten siding on the house; I wrote a short note about the process. I hope soon I'll have enough articles about how to do house stuff that I can link them all up together and make something useful.

Open Source - I was at TypeCon in Boston last month, talking with various people about Web font embedding. This week I was on a conference call arranged by Waldo Bastien of the OSDL DTL Technical Board, to talk about fonts for Linux.

Open source operating systems happened largely because programmers used them. In other words, the system had to be one that people wanted to use for programming, so that programmers would naturally gravitate towards it.

Similarly the various Free and open source desktop environments are getting more useable (especially Gnome I think) because of a focus on their (mostly, but not entirely, non-programmer) users.

I'd like to see Linux be a solid platform for the majority of graphic designers. For that to happen, we need quite a few pieces of infrastructure. Some would say we also need a number of applications, such as Photoshop, InDesign and QuarkXPress, Illustrator, Freehand, FlightCheck, imposition software, colour management, and of course font management such as Extensis Suitcase. In the open source world most of those applications are replaced by Free (or at least Open) ones, or become facilities provided by the operating system or shared libraries.

For fonts, we need better user-level management, such as the ability to sort and enable/disable fonts by user-defined groups; we need system-level functionality, such as the ability to manage multiple versions of fonts; we need application and library-level functionality, such as glyph choosers, better support for more scripts and for mixed writing directions, and user interfaces for choosing fonts by script and language, or by glyph coverage.

There's a lot to do, but there are a lot of people interested in making it all work.

Images from old books - I'm up to over 1200 images now, and thinking of spending time on redesigning the site a little, on improving navigation, and especially on improving the metadata to offer more ways to search.

My Web server has, under my home directory, over 76,000 HTML files, so it might take a while to deal with them, even though most (I hope) are generated from XML by scripts, mostly Perl, XSLT or XQuery, or some combination.

I went from Extreme Markup, an XML/markup conference in Montreal, to TypeCon, a type and typography conference in Boston. From people many of whom think that style is just irrelevant rubbish that gets in the way and it's all about the content, to people many of whom think content is just irrelevant rubbish that gets in the way of the design :-)

Both conferences were packed with thought-provoking presentations and cool people. TypeCon had more naked ankles. The conferences overlapped (ironically since a major topic at Extreme is computer representation of overlapping and interwoven hierarchies), so I missed the start of TypeCon, but I was there in time for the trip to the Boston Printing Museum where there are also some really kind and interesting people, and where you can see literally hundreds of thousands of drawings for (mostly Linotype) typefaces, as well as old presses, a working Linotype machine, several non-working Monotype machines, visit the library, and of course sign up to be a Member :-)

Came back to Montreal by 'plane, now on the train to Belleville (via Kingston) and thence 45 minutes' drive home. I hope I'll find time to work on helping pango/gtk+/fontconfig to cope better with Expert sets and opentype features, although it's more likely I'll do things like send fonts and mockups to developers so they can understand the problems: sharing understanding is more useful than sharing code, sometimes.

Just got back from a business trip to Paris, and on the way home stopped off in Toronto; you can see my snapshots from the 2006 Toronto Pride Parade.

In Paris I had a long conversation with C. Michael Sperberg-McQueen about XML-aware text retrieval. If I could free up a few days I could make my lq-text package have some XML awareness. I might try this in July, since I'm well overdue to make a new release. If that works I'll try using it on my site of pictures scanned from antiquarian books, although the existing XML-Query-based search works pretty well as a starting point.

prozac, one problem with checking for "legitimate" email by eye is the large number of spam messages with forged senders. We (W3C) routinely get listed on spamcop's database for sending spam, for example, because of such forgeries. Using SPF, where available, can help eliminate most of these forgeries, although SPF is not without other problems. It's not very friendly to post real email addresses on advogato by the way - the spam crawlers will soon be flooding those addresses even if they were previously not public, so it sets a bad precedent. Using numeric character references to hide the @ might help, although advogato might undo that during processing.
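To make the numeric character reference idea concrete, here is a minimal sketch in C. The function name obfuscate_address and the example address are made up for illustration; the only thing it relies on is that "&#64;" is the decimal character reference for "@". It will not stop a crawler that decodes references, so treat it as a speed bump rather than a fix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `addr` into `out`, replacing each '@' with the
 * numeric character reference "&#64;". Everything else is copied as-is.
 * Returns 0 on success, -1 if `out` (of size `outsize`) is too small. */
int obfuscate_address(const char *addr, char *out, size_t outsize)
{
    size_t used = 0;

    assert(addr != NULL && out != NULL);

    if (outsize == 0) {
        return -1;                  /* not even room for the terminating NUL */
    }

    for (; *addr != '\0'; addr++) {
        char single[2] = { *addr, '\0' };
        const char *piece = (*addr == '@') ? "&#64;" : single;
        size_t len = strlen(piece);

        if (used + len + 1 > outsize) {
            return -1;              /* no room for this piece plus the NUL */
        }
        memcpy(out + used, piece, len);
        used += len;
    }
    out[used] = '\0';
    return 0;
}
```

Calling it on "liam@example.org" with a large enough buffer leaves "liam&#64;example.org" in the buffer, which a browser displays as the original address.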

sktrdie, no, GNOME is not a desktop environment created by hackers for hackers. That would be masturbation. It is a desktop environment created by hackers (programmers) and many many other people, for use by both the people who created it and many many other people. You don't get much Open Source kudos for scratching your itch. You get kudos for scratching other people's itches.

Papers papers papers. Next week a trip to Paris for W3C XML Query face to face meetings, colocated with the XML Schema and XSL Working Groups. These meetings are often highly productive and helpful.

One of the papers I have to write is for IEEE Spectrum magazine, introducing XML to engineers. This interests me because engineers are often more focused on immediate problems than on interoperability (and I say this as someone with at least some engineering background). A solution that appears to be less than optimal for their needs may be rejected, even if in fact using it would give a huge benefit that would outweigh the perceived (or real) inefficiencies.

People, in a way, don't benefit from standardised shoe sizes either: you always end up with shoes that don't quite fit right. But you quickly learn which size is closest, and it's massively cheaper than having shoes specially made, so you see fewer barefoot people wandering about. Actually I wish you saw more barefoot people wandering about, not least since shoes are not necessarily healthy, but that's beside the point.

halcy0n, I'm sorry that you are feeling disillusioned. All projects involving multiple individuals have politics, although I've seen quite a few open source projects in which hostile flame wars are the exception rather than the rule. Both Gnome and Mandriva Linux (cooker) seem to me to fall into the latter category. One thing both projects have in common is outreach, that is, that they aim to provide software not only for themselves but for other sorts of people entirely. Perhaps Eric Raymond might call this scratching someone else's itch, I don't know. Another thing they have in common is having a wide age range of people involved, and again, I think that sometimes helps.

sktrdie, you say, The great thing about advogato is its simpleness and elegancy and then you say, I've been thinking in starting an open-source service such as advogato, in the meanwhile add more features to it. Beware that in adding features you will reduce the simplicity, and your project may lose the thing you most desire, by the very virtue of your work on it.

Working a little on the content mismanagement system I use for my pictures as well as for my images scanned from old books. Managed to wreck the RSS feed briefly, but it's back now. Probably just as well.

The cache for the image search engine now gets deleted more aggressively when it's invalidated; it gets up to two or three hundred megabytes per day, which is fine as long as the cached query results are useful. I don't bother with LRU; the pattern is that they are all invalidated if I upload a new image, and since that generally happens at least once a day, all I needed was for the out-of-date files to be deleted automatically. They were in any case being ignored when out of date, so I had got that part right.
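The rule described above (every cached result older than the newest upload is out of date, so ignore it and delete it) can be sketched in C roughly as follows. This is an illustration under assumptions, not the code the site runs: the function names, the flat cache directory and the use of file modification times are all made up for the example, and it needs a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* A cached result is stale if it was written before the newest upload.
 * The same test can be used to ignore an entry at lookup time. */
int cache_entry_is_stale(time_t entry_mtime, time_t last_upload)
{
    return entry_mtime < last_upload;
}

/* Delete every stale regular file directly inside cache_dir (assumed flat).
 * Returns the number of files removed, or -1 if the directory
 * cannot be opened. */
int purge_stale_cache(const char *cache_dir, time_t last_upload)
{
    DIR *dir;
    struct dirent *entry;
    int removed = 0;

    assert(cache_dir != NULL);

    dir = opendir(cache_dir);
    if (dir == NULL) {
        return -1;
    }

    while ((entry = readdir(dir)) != NULL) {
        char path[4096];
        struct stat st;
        int n = snprintf(path, sizeof path, "%s/%s", cache_dir, entry->d_name);

        if (n < 0 || (size_t)n >= sizeof path) {
            continue;               /* path too long: leave it alone */
        }
        if (stat(path, &st) != 0 || !S_ISREG(st.st_mode)) {
            continue;               /* skips ".", ".." and subdirectories */
        }
        if (cache_entry_is_stale(st.st_mtime, last_upload) && unlink(path) == 0) {
            removed++;
        }
    }
    closedir(dir);
    return removed;
}
```

Because everything is invalidated at once, there is no per-entry bookkeeping of the kind LRU would need; a single timestamp for the last upload is enough.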

Next is to manage a queue of pending images to upload, and to make a suitable front end so that other people can contribute images more easily.

I should mention that I'm interested in other people's collections of high-quality scans; let me know if you find any cool ones :-) and maybe we can merge or I can link to them. High quality ideally means at least 1200dpi scans, though, in most cases, so as to be without murky grey bits everywhere.

All this by way of procrastination: I'm supposed to be working on a paper on microformats for Extreme Markup, which I think of as the XML conference I find most interesting and thought-provoking; I'm also supposed to be working on an article for IEEE Signal Processing Magazine on XML.

Image of the day: The Discovery of Tin in Britain (a cartoon from the 1890s). Caution, it's a bad joke.

The piano finally arrove, so I had to move office to make room for it (it's a baby grand that had been in storage, after being repaired). So I set myself up with a second monitor. The ATI radeon card in this Dell notebook seems unable to drive the external screen at 1600x1200@75dpi, so I'm with a dual head setup. Unfortunately it's fragile: the ATI installer overwrites one of the xorg X11 libraries, which means that the packages tend to fight one another. It's a good opportunity for /etc/alternatives which is used by Mandriva for other purposes.

I'm still going through photographs from 2004, trying to get up to date. Last night I added pictures of Pendennis Castle in Falmouth, Cornwall. Right now I'm just getting the pictures the right way round and giving them very simple captions; later I'll probably do a gallery of the ones I like the most. The Search link currently takes you to the Search pictures from old books page; I'll fix that when I've finished uploading the 2004 photos.

I also made some preliminary notes on Linux font management; this is still very very sketchy, and I'd appreciate contributions by email (liam at holoweb dot net). I'm most interested in font management under Gnome but I'll take pointers for KDE as well. I am not going to add command-line programs that require you to issue SQL statements, because my goal is to get Linux useable by design professionals, whose focus tends not to be in grokking such things.

Every once in a while I'm reminded why I use Mandriva Linux. I watched someone try to plug in a digital camera and upload pictures. Her husband insists on debian (OK, GNU/Debian Linux[tm]) but this meant instead of "plug in the camera. click on the icon that appears on the desktop" it's "find the device in /dev and mount it". A small difference perhaps, but a big one in outlook. Of course one could configure GNU/Debian Linux[tm] to behave the same way but her system administrator husband looks down on things that are too easy. So he has a computer system that's designed to appeal to a system administrator who looks down on people who are not system administrators. Maybe Ubuntu would be a good compromise for the pair of them, based on debian but produced by people who care about using computers to do other things.

zanee, you are right: choose your battles.

badvogato, why should I tell you my husband's name when I don't know your name? Pictures of your ankles on a postcard please :-)

OK, I relent, he's called Clyde.

Liam

I got behind with digital photos, so I've been uploading basically unsorted pictures; I'm up to 2004, and in particular to the holiday in the UK that my husband [yes, live with it] and I had in September of that year. Some of the pictures are pretty good, but of course most aren't, so I'll have to try and make some selections eventually.

Last week I spent some time explaining to someone the difference between XML's name-based typing and the structure-based typing that was in an early draft of XML Query. I suppose you could say the structure-based typing was like an early version of the C Programming Language, in which two types were entirely compatible if they had the same storage classes. You could assign bar = foo, in other words, if the number of bits in each variable was the same (more or less). By 1978 C had evolved past this, and there were implementations in which if you did
    typedef int hatsize;
    typedef int shoesize;

you couldn't call a function expecting a shoesize and pass a hatsize without at least getting a warning. In Java or C++ it would be absurd for an assignment across classes to be anything but an error, regardless of storage sizes. And it's absurd in XML too, in most cases.
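One caveat for anyone trying this today: ISO C treats a plain typedef as an alias for the underlying type, so a conforming compiler is not required to complain about mixing the two. A minimal sketch of how to get name-based typing in C anyway is to wrap each value in its own struct; the function shoe_fits is made up for illustration.

```c
#include <assert.h>

/* Two types with identical storage but different names.
 * Each anonymous struct is a distinct type, so the two are
 * not compatible with one another. */
typedef struct { int value; } hatsize;
typedef struct { int value; } shoesize;

/* Hypothetical function that accepts only shoesize values.
 * Returns 1 if the shoe is at least as large as the foot, else 0. */
int shoe_fits(shoesize foot, shoesize shoe)
{
    assert(foot.value > 0 && shoe.value > 0);
    return shoe.value >= foot.value;
}
```

With these definitions, passing a hatsize to shoe_fits is rejected at compile time as an incompatible argument type, even though the two structs have exactly the same layout: the name decides, not the structure.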

Slowly working on my XML blog; I should add stuff about strong typing.

My husband installed a codec for a Web site he trusted, which turned out to've been misplaced trust, as it installed some virulent malware that keeps popping up saying you've been infected with adware or spyware, and need to buy their anti-adware tool. Of course, to make this credible, it also installs some adware in the background.

Part-way through a new Windows installation using the Acer recovery disks, we discovered that one of the disks was missing. And this left the laptop unusable. Well, usable by Linux :-) Luckily, Acer agreed (for a surprisingly small fee) to ship replacement CDs by overnight courier, so we should have them in a couple of days (you have to add a day for the border, usually).
