Older blog entries for DV (starting at number 213)

GNOME CVS is accessible only with SSH

If you hit this then go read Owen's mail on the change, the reasons and how to best handle it.

GNOME Live CDs

Following Luis' announcement, fr.rpmfind.net now hosts copies of GNOME 2.12 live CDs. If for some reason you can't use BitTorrent, get them there.

GNOME Summit registration

There are only 35 people listed at the moment, which looks like a low number to me. If you think you are coming, please take the time to register by just adding your name to the list (and possibly create an account on live.gnome.org if you haven't yet). This will help the logistics a lot, thanks !

xml:id is a W3C Rec

Whooohoo ! This is a relatively short and simple W3C specification. For people not familiar with it, IDs in XML are a special type of attribute whose value is used to point inside a document. So if you use a URI "http://example.com/bar.xml#foo" and there is an element in bar.xml with an attribute which is of type ID and of value "foo", then you can reference that element in a completely standard way (via the usual Mime-Type fragment identifier and XPointer). The problem is that, to have ID attributes, the XML document needed to reference the DTD where the type of the attributes is defined, and the parser had to load that DTD when parsing, which a lot of XML processing avoids anyway. But if you have an xml:id compliant XML toolchain, any attribute named xml:id is of type ID and can be used for pointing inside the document, even in the absence of a DTD. This can be really useful if you design a new XML based data type and need a way to key your records and point to them. It works with XPath too: "id('foo')" will directly bring you back the element if found in the document (and since the table is built at parse time it's usually an instant lookup).
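To make the idea concrete, here is a minimal sketch in Python using only the standard library. ElementTree is not an xml:id aware toolchain, so the ID table is built by hand right after parsing; the function names parse_with_ids and resolve and the sample document are assumptions made up for this illustration, not part of any specification or library.

```python
import xml.etree.ElementTree as ET

# Name under which ElementTree exposes the xml:id attribute (Clark notation).
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def parse_with_ids(text):
    """Parse a document and build the ID table by hand, with no DTD involved."""
    root = ET.fromstring(text)
    ids = {}
    for elem in root.iter():
        value = elem.get(XML_ID)
        if value is not None:
            if value in ids:
                raise ValueError("duplicate xml:id: " + value)
            ids[value] = elem
    return root, ids

def resolve(ids, uri):
    """Return the element that a URI fragment such as '...bar.xml#foo' points at."""
    _, _, fragment = uri.partition("#")
    return ids.get(fragment)

# Hypothetical record set keyed with xml:id, no DOCTYPE at all.
DOC = """<records>
  <record xml:id="foo"><name>first</name></record>
  <record xml:id="bar"><name>second</name></record>
</records>"""

root, ids = parse_with_ids(DOC)
target = resolve(ids, "http://example.com/bar.xml#foo")
print(target.find("name").text)
```

The lookup is a plain dictionary access, which is the same reason the parse-time table makes id('foo') cheap in a real xml:id aware processor.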


After some troubles with rawhide I finally managed to run Fedora with an inotify kernel, then quickly spotted the memory leaks that plagued some of the users on inotified distros, and I released 0.1.6 with the patch and the latest updates to the inotify back-end. Note that the patch works for 0.1.5; you may have to ask Fred Crozat if you need to backport it, he did :-). He also pointed out that valgrind 3.0.1 added support for dnotify, so now gamin can also be valgrinded on older kernels, which is excellent news (though it didn't spot any other leaks). Valgrind rules !

libxml2 and libxslt

Apparently the latest releases were good ones: only one corner case parser bug was found (but it was also in 2.6.20), plus a special compilation problem. This is fewer than the average number of reports a week after a new release, especially considering the number of changes which went in.

Barbecue ?

Isn't Miguel inviting people to a barbecue where primates are roasted ? Someone call the Bonobo protection league, quick !

They killed the bonobo ... you bastards !

5 Sep 2005 (updated 5 Sep 2005 at 22:45 UTC) »

Tintin en Irak

Damn, someone should translate this satire, Tintin en Irak, to English (okay, there are some French politics bits too). It's hilarious and really well done, thanks to Uche Ogbuji for finding this excellent piece ! People familiar with TomBoy adventure will appreciate the delicate choice of mapping between the comics characters and the real ones :-)

Update: Wouter Bolsterlee provided me with a ready to print PDF :-), enjoy !

War on Weather

Following the recent events, and following the satiric tone of Tintin en Irak, here are some of the steps I expect the Bush administration to take following the disaster in the US South:

  • raise a big War on Weather media drama with looping images of Katrina on all TV channels
  • fund a multi-billion dollar research program with friends from the military lobby to find out if clouds may turn into potentially dangerous weather, and to fingerprint them
  • also fund the deployment of large weapons-like lasers along the US coast to vaporize potentially harmful clouds
  • assign the department of Homeland Security the extra task of fingerprint detection of clouds entering the US territory
  • try to invade Cuba based on the obvious fact that they have developed Clouds of Massive Destruction targeting the US
  • provide the oil industry pumping in the Caribbean with army support and cloud destruction weapons to operate quietly in the future

The good old techniques always work ...

5 Sep 2005 (updated 5 Sep 2005 at 12:17 UTC) »

libxml2-2.6.21 and libxslt-1.1.15

Finally made a new set of releases. I have been chasing bugs for the last weeks, so people should really update in general. As a result I ended up closing 182 bugs in GNOME bugzilla (manually, because the "handle multiple bugs at once" form does not allow changing to the CLOSED state, so I lost one hour clicking manually on the bug forms :-( ). Anyway it's good to have the libxml2 and libxslt bug lists trimmed down to something reasonable. It's also good to have the releases in time for the new GNOME release.

Upcoming events

Don't forget to register for the GNOME Summit if you are coming, also tell us what you want to work on. This clearly won't be a talk driven conference but setting topics of work and discussion in advance will make us more productive.

My talk on Xen at FUDcon is scheduled for Thursday 6th October. I will try to be at the GNOME booth a bit too, before and after, on that day.

Disaster recovery

Without getting into the political side, I'm still surprised it takes such a long time for countries affected by a disaster to request or even accept international aid. Seeing the US take a full week before accepting the various kinds of help offered worldwide is a bit shocking; it's not like they didn't know they needed it. Is that a logistics problem ? But when the Red Cross, an international, foreign non-governmental agency, appears to be the most efficient workforce on the scene for 5 full days, and even the state government officials officially recognize it, this really means the money going to governmental disaster planning or handling should be directly donated to those who know and care about handling those. This also means the government is incompetent to handle those. The point is that this would not surprise us coming from a less developed country (Thailand's post-tsunami recovery effort looks far more organized in retrospect than the US response to Katrina), but it is shocking for a nation very eager to point out to others how it thinks a country should be run...

26 Aug 2005 (updated 26 Aug 2005 at 12:56 UTC) »

the future of gamin

John has submitted a patch to have gnome-vfs bypass FAM/gamin and access inotify directly, and people have been asking what my opinion was. Basically it's just fine, there are only 2 concerns. First, OS portability: the gamin/FAM back-end for notification should be maintained as a way to keep working on older systems, BSD, MacOS-X, Solaris, etc. Second, the switch between inotify and gamin should be done at run-time: gamin/FAM is legacy and will continue to be around, and by doing so we don't introduce one more binary incompatibility in the platform; it's hard enough already for ISVs shipping on Linux.

But it should be clear too that I find FAM a very ugly, limited and completely underspecified (if specified at all!) API, that it should die as promptly as possible, and that its only virtue was to be slightly less broken and specific than the various default kernel APIs found on various OSes, especially dnotify ! Die FAM, die ! The ideal situation would be to have a sane POSIX standardized notification API, and just rely on a kernel implementation based on syscalls, but I would not hold my breath ! The best future would be for gamin to become a legacy and useless piece of code.


I hacked furiously on libxml2 again this week, first trying to address as many reported bugs as possible before the next release (probably at the end of next week). I also found a way to reduce the memory allocator usage of the library, which can lead to very significant speedups in some cases: the parsing speed for the database file I use to profile rose from 21MBytes/s to 25MBytes/s on 32 bits, and I would expect greater improvements when running on 64 bits systems. Kasimier seems to have a lot of work, so progress on XML Schemas is slowing down. I should use that time to finish the Schematron implementation and to add an interface to validate DTDs at the SAX level, like the interface for SAX and XSD added in July; this should be close to trivial with the existing code.


I will be out of reach until Tuesday evening, as I was called to clean up my mother's land: there is an alarming number of wildfires around in the South of France and preventive action is urgently needed. I also discovered that there are frequent Ryanair flights from there to London, which I will use when going to FUDCon3 where I will be speaking about Xen again. I will also go to the GNOME Summit at MIT in Cambridge MA the following week-end, 8-11 October.

libxml2 breakage

Sorry I broke CVS head for a day or so, the error didn't show up in my checkout because I compile statically :-\ . Currently going through the libxml2 bugzilla list trying to kill as many bug reports as possible, one of them was a real parser bug !


While I sit on IRC all day long, I didn't use IM until now. I now have Gaim. I did a review and a small implementation of Jabber a few years ago, and I'm very happy about the boost it will receive.

Journey in regexp land

I have been rather quiet in the last 10 days, first due to an extended week-end and then because I hit a relatively hard problem at the regexp level used in libxml2. Basically it's all XML Schemas' fault: Kasimier nearly completed the support, except for the redefine feature, which allows a schema to subset the content model of a type exported by an imported schema. Kasimier will as usual handle the nasty part of making sense of the spec and I will give him the basic tools to make this work. Which means I need to provide ways to check that a content model is a valid derivation of another, which in regexp terms can be summarized as: does regexp R accept all strings generated by regexp r ?

And that is a rock, a hard one.

After reading quite a bit, it turns out that my existing automata + counters modelling of regexps, which we use for content model validation, is really not a good model to try to solve this (though it's good for validating instances). So back to the literature: various papers, most of them relatively recent, especially the paper from Michael Sperberg-McQueen at Extreme Markup earlier this month on using Brzozowski derivatives for the task. Looks fine, except it will explode when using large counter ranges. My selected approach is to do the derivation at the algebraic level instead of going step by step over all possible input strings, and to fall back to injecting token by token only when no progress can be made purely on tree constructs. After a short week of frenetic testing and refinement I now have something which seems to work relatively nicely. I just pushed it into libxml2, adding less than 8kB of code to the xmlregexp compiled size, and added a first set of regression tests and support in the testRegexp command line test:

paphio:~/XML -> ./testRegexp --expr '(a*, ((b, c, d){0,5}, e{0,1}){0,4}, f)' '(a{1,100}, b, (c, d, b){2,3}, c, d, e)'
Testing expr (a*, ((b, c, d){0,5}, e{0,1}){0,4}, f):
Subset parsed as: ((((a , b) , ((c , d) , b){2,3}) , c) , d) , e
Resulting derivation: (((b , c) , d){0,5} , e?){0,3} , f
Ops: 0 nodes, 55 cons
paphio:~/XML -> ./testRegexp --expr '(a|b),(a|c){0,100}' 'a{0,100},(a|c)'
Testing expr (a|b),(a|c){0,100}:
Subset parsed as: a{0,100} , (a | c)
Resulting nillable derivation: empty
Ops: 0 nodes, 11 cons
paphio:~/XML -> ./testRegexp --expr '(a|b){3,*}' '(a,b)+'
Testing expr (a|b){3,*}:
Subset parsed as: (a , b)+
Resulting derivation: (a | b)+
Ops: 0 nodes, 8 cons
paphio:~/XML -> ./testRegexp --expr '(a|b),(a|c){0,99}' 'a{0,100},(a|c)'
Testing expr (a|b),(a|c){0,99}:
Subset parsed as: a{0,100} , (a | c)
Resulting derivation: forbidden
Ops: 0 nodes, 9 cons

The key is to try to keep sub-linear performance. I really expect redefines to be used to restrict content models from unbounded sets to bounded and reordered ones, for example (a|b)* into (a,b){1,100000}, to avoid consumers of services being DoS'ed. If you explode when validating, this is just worse; this is a big problem, as pointed out recently in a large thread on xml-dev. Hence testRegexp logs the number of Cons, i.e. how many times an intermediate expression node was generated (one of Brzozowski's results is that this set is finite, but the goal is to keep it small :-).
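For readers who have never met Brzozowski derivatives, here is a minimal sketch in Python of the classic token-by-token version. It is explicitly not the algebraic derivation pushed into libxml2: it only checks whether one concrete token sequence is accepted, and the tuple representation and all function names (tok, seq, alt, count, nullable, deriv, accepts) are assumptions made up for this illustration. Counters are kept as integers in the node, so a range like {1,100000} is never unrolled.

```python
# Expressions are plain tuples. EMPTY matches the empty sequence, FORBID nothing.
EMPTY = ("empty",)
FORBID = ("forbid",)

def tok(name):
    return ("tok", name)

def seq(a, b):
    """a , b  (with trivial simplifications to keep derivatives small)."""
    if a == FORBID or b == FORBID:
        return FORBID
    if a == EMPTY:
        return b
    if b == EMPTY:
        return a
    return ("seq", a, b)

def alt(a, b):
    """a | b"""
    if a == FORBID:
        return b
    if b == FORBID:
        return a
    return a if a == b else ("or", a, b)

def count(e, lo, hi):
    """e{lo,hi}; hi=None means unbounded."""
    return EMPTY if hi == 0 else ("count", e, lo, hi)

def nullable(e):
    """True if the expression accepts the empty sequence."""
    kind = e[0]
    if kind == "empty":
        return True
    if kind in ("forbid", "tok"):
        return False
    if kind == "seq":
        return nullable(e[1]) and nullable(e[2])
    if kind == "or":
        return nullable(e[1]) or nullable(e[2])
    return e[2] == 0 or nullable(e[1])  # count

def deriv(e, t):
    """Expression accepting what may follow token t in sequences accepted by e."""
    kind = e[0]
    if kind in ("empty", "forbid"):
        return FORBID
    if kind == "tok":
        return EMPTY if e[1] == t else FORBID
    if kind == "or":
        return alt(deriv(e[1], t), deriv(e[2], t))
    if kind == "seq":
        first = seq(deriv(e[1], t), e[2])
        return alt(first, deriv(e[2], t)) if nullable(e[1]) else first
    body, lo, hi = e[1], e[2], e[3]  # count: consume one iteration, decrement bounds
    rest = count(body, max(lo - 1, 0), None if hi is None else hi - 1)
    return seq(deriv(body, t), rest)

def accepts(e, tokens):
    for t in tokens:
        e = deriv(e, t)
        if e == FORBID:
            return False
    return nullable(e)

a_or_b, a_or_c = alt(tok("a"), tok("b")), alt(tok("a"), tok("c"))
model_100 = seq(a_or_b, count(a_or_c, 0, 100))  # (a|b),(a|c){0,100}
model_99 = seq(a_or_b, count(a_or_c, 0, 99))    # (a|b),(a|c){0,99}
longest = ["a"] * 101                           # one string generated by a{0,100},(a|c)
big = count(seq(tok("a"), tok("b")), 1, 100000) # (a,b){1,100000}

print(accepts(model_100, longest), accepts(model_99, longest))
print(accepts(big, ["a", "b"] * 3))
```

The 101 token string is one witness for the difference between the {0,100} and {0,99} runs shown earlier: it is generated by a{0,100},(a|c), fits the first model and overflows the second. Checking inclusion for every generated string, which is what the redefine check needs, is the hard part that this sketch does not attempt.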

Future work on this is to fix one potential problem left, apply it to Kasimier's code when it's there, extend it to allow the full set of operators needed by Relax-NG and maybe rewrite the RNG validator on top of it. Not sure I will use it for validation in Schemas itself (apart from the Schemas compilation of course), as I prefer good old automata rather than mutating trees during the validation phase.

Test suite

Very impressed by yesterday's SVG test suite results from Uraeus. Looks excellent, congrats ! Now can you automate the process of finding defects in the output ? I started having a headache approximately 2/3rd of the way into the scanning process :-)

9 Aug 2005 (updated 9 Aug 2005 at 22:13 UTC) »

a new gamin release 0.1.5

One of the strange feelings of becoming an old fart is seeing the youngsters come over your code and replace your slow patient process with a frenetic trance, though with less control. That's basically what's happening to me on gamin, as John McCutchan is going over gamin's code for inotify. Hence yet another release, and don't ask for 0.1.4, it just disappeared due to last-minute CVS updates.

libxml2 and Schemas

On the other hand I am regaining control over libxml2 Schemas development: Kasimier Buchcik slowed down a bit, so I went back and fixed a number of core issues in the libxml2 automata and regexps code. The last 2 bug reports have been from data security companies using Schemas, interesting :-) . I haven't finished the Schematron code yet; there are a couple of issues I need to fix first, and I would prefer to get a test suite based on the ISO draft standard syntax. I'm still looking for something like that ...

sunset over the mountains

Sometimes I wonder why I am in Grenoble, away from most events, airports, and from beaches and coral reefs :-), but it usually takes a climb in the surrounding mountains to reassert my love for that area. Pictures of the sunset over Chartreuse on Sunday don't fully do justice to the amazing view (nor to the wind or the cold...)

Javascrapt mess

I tried to make a small Javascript slideshow to help scanning my pictures. Okay, I now understand why "web programming" is such a pitiful disaster: global variables from the same block of code unreachable at invocation time, a perfect broken kludge for timers losing all context. I'm sorry for the armies of web developers worldwide; some "designers" should be hung without much formal process...

DesktopConf and OLS

I'm a bit exhausted after nearly a week of conferences. Very good to (re)connect face to face with other people, and the set of talks has been quite good too (today, for example, has been completely focused on Xen and virtualization), but a week in a row is a bit too much. And we still have a Fedora BOF at 9pm !


I still managed to do some coding between the conferences: I fixed some of the libxml2 automata/regexps limitations that blocked Kasimier on XML Schemas. I also started implementing the Schematron validation draft ISO standard. It's both relatively simple and powerful; it's a good complement to XSD or Relax-NG, especially for integrity constraints at the document level, but it can also be a validation framework that is easier to use for people who are not XML gurus. Since it's mostly based on XPath, the code on top of existing functionalities should be relatively small in terms of code size, which is good too.
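To give a feel for why a Schematron-style validator stays small once XPath is available, here is a minimal sketch in Python. It is not the libxml2 implementation and does not parse real Schematron syntax: each rule is just a context path, a test path and a message, and an assertion holds when the test path selects at least one node from the context node. The rule set, the sample document and the validate function are assumptions made up for this illustration, and ElementTree only supports a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# (context path, test path, message): hypothetical rules for this sketch.
RULES = [
    (".//record", "name", "a record must have a name"),
    (".//record", "owner[@ref]", "a record must reference its owner"),
]

def validate(root, rules):
    """Return one message per context node for which the test selects nothing."""
    failures = []
    for context, test, message in rules:
        for node in root.findall(context):
            if not node.findall(test):
                failures.append(message)
    return failures

# Hypothetical document: the second record breaks both rules.
DOC = ET.fromstring(
    "<db>"
    "<record><name>first</name><owner ref='u1'/></record>"
    "<record><owner/></record>"
    "</db>"
)

failures = validate(DOC, RULES)
for message in failures:
    print(message)
```

Nearly all the work is delegated to the path evaluation, which is the point made above about building on existing functionalities.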

