Older blog entries for DV (starting at number 210)

5 Sep 2005 (updated 5 Sep 2005 at 22:45 UTC) »

Tintin en Irak

Damn, someone should translate this satire, Tintin en Irak, into English (okay, there are some French political bits too). It's hilarious and really well done, thanks to Uche Ogbuji for finding this excellent piece! People familiar with TomBoy adventure will appreciate the delicate choice of mapping between the comics characters and the real ones :-)

Update: Wouter Bolsterlee provided me with a ready-to-print PDF :-), enjoy!

War on Weather

Following the recent events, and in the satiric tone of Tintin en Irak, here are some of the steps I expect the Bush administration to take after the disaster in the US South:

  • raise a big War on Weather media drama with looping images of Katrina on all TV channels
  • fund a multi-billion-dollar research program with friends from the military lobby to find out if clouds may turn into potentially dangerous weather and to fingerprint them
  • also fund the deployment of large weapons-like lasers along the US coast to vaporize potentially harmful clouds
  • assign the department of Homeland Security the extra task of fingerprint detection of clouds entering the US territory
  • try to invade Cuba based on the obvious fact that they have developed Clouds of Massive Destruction targeting the US
  • provide the oil industry pumping in the Caribbean with army support and cloud destruction weapons to operate quietly in the future

The good old techniques always work ...

5 Sep 2005 (updated 5 Sep 2005 at 12:17 UTC) »

libxml2-2.6.21 and libxslt-1.1.15

Finally made a new set of releases. I have been chasing bugs for the last few weeks, so people should really update in general. As a result I ended up closing 182 bugs in GNOME bugzilla (manually, because the "handle multiple bugs at once" form does not allow changing bugs to the CLOSED state, so I lost one hour clicking through the bug forms :-( ). Anyway, it's good to have the libxml2 and libxslt bug lists trimmed down to something reasonable, and good to have the releases out in time for the new GNOME release.

Upcoming events

Don't forget to register for the GNOME Summit if you are coming, and tell us what you want to work on. This clearly won't be a talk-driven conference, but setting topics of work and discussion in advance will make us more productive.

My talk on Xen at FUDcon is scheduled for Thursday 6th October. I will try to spend a bit of time at the GNOME booth too, before and after, on that day.

Disaster recovery

Without getting into the political side, I'm still surprised it takes such a long time for countries affected by a disaster to request or even accept international aid. Seeing the US take a full week before accepting the various kinds of help offered worldwide is a bit shocking; it's not like they didn't know they needed it. Is it a logistics problem? But when the Red Cross, an international non-governmental agency, appears to be the most efficient workforce on the scene for 5 full days, and even the state government officials officially recognize it, this really means the money going to governmental disaster planning or handling should be donated directly to those who know and care about handling disasters. It also means the government is incompetent to handle them. The point is that this would not surprise us coming from a less developed country (Thailand's post-tsunami recovery effort looks far more organized in retrospect than the US response to Katrina), but it is shocking for a nation very eager to point out to others how it thinks a country should be run...

26 Aug 2005 (updated 26 Aug 2005 at 12:56 UTC) »

the future of gamin

John has submitted a patch to have gnome-vfs bypass FAM/gamin and access inotify directly, and people have been asking what my opinion is. Basically it's just fine; there are only 2 concerns. First, OS portability: the gamin/FAM back-end for notification should be maintained as a way to keep working on older systems, BSD, MacOS-X, Solaris, etc. Second, the switch between inotify and gamin should be done at run-time: gamin/FAM is legacy and will continue to be around, and by switching at run-time we don't introduce one more binary incompatibility in the platform. It's hard enough already for ISVs shipping on Linux.

But it should be clear too that I find FAM a very ugly, limited and completely underspecified (if specified at all!) API, that it should die as promptly as possible, and that its only virtue was to be slightly less broken and specific than the various default kernel APIs found on various OSes, especially dnotify! Die FAM, die! The ideal situation would be to have a sane POSIX standardized notification API, and just rely on a kernel implementation based on syscalls, but I would not hold my breath! The best future would be for gamin to become a legacy and useless piece of code.
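To make the run-time switch discussed above a bit more concrete, here is a minimal sketch in Python. It is only an illustration, not the gnome-vfs patch: the function names, the preference order and the placeholder gamin probe are all assumptions. The idea is simply to ask the running kernel whether inotify works and to fall back to the FAM/gamin path otherwise, so that one binary behaves correctly on both kinds of systems.

```python
import ctypes
import os

def have_inotify():
    """Probe the running kernel for inotify instead of trusting build flags."""
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        fd = libc.inotify_init()
    except (OSError, AttributeError, TypeError):
        return False  # no libc handle, or no inotify_init symbol in it
    if fd < 0:
        return False  # symbol exists but the kernel refused (e.g. ENOSYS)
    os.close(fd)
    return True

def pick_backend(probes):
    """Return the name of the first backend whose probe succeeds."""
    for name, probe in probes:
        if probe():
            return name
    raise RuntimeError("no notification backend available")

# Preference order (an assumption): native inotify first, FAM/gamin as the
# portable fallback. The gamin probe is a placeholder that always succeeds;
# a real one would try to reach the daemon.
BACKENDS = [("inotify", have_inotify), ("gamin", lambda: True)]
backend = pick_backend(BACKENDS)
```

Because the decision is taken when the program starts, the same build can ship to systems with and without inotify.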


I hacked furiously on libxml2 again this week, first trying to address as many reported bugs as possible before the next release (probably at the end of next week). I also found a way to reduce the library's use of the memory allocator, which can lead to very significant speedups in some cases: the parsing speed for the database file I use for profiling rose from 21MBytes/s to 25MBytes/s on 32 bits, and I would expect greater improvements when running on 64-bit systems. Kasimier seems to have a lot of work, so progress on XML Schemas is slowing down. I should use that time to finish the schematron implementation and to add an interface to validate DTDs at the SAX level, like the interface for SAX and XSD added in July; this should be close to trivial with the existing code.


I will be out of reach until Tuesday evening, as I was called to clean up my mother's land: there is an alarming number of wildfires around in the South of France and preventive action is urgently needed. I also discovered that there are frequent Ryanair flights from there to London, which I will use when going to FUDCon3, where I will be speaking about Xen again. I will also go to the GNOME Summit at MIT in Cambridge MA the following week-end, 8-11 October.

libxml2 breakage

Sorry, I broke CVS head for a day or so; the error didn't show up in my checkout because I compile statically :-\ . Currently going through the libxml2 bugzilla list trying to kill as many bug reports as possible; one of them was a real parser bug!


While I sit on IRC all day long, I didn't use IM until now. I now have Gaim. I did a review and a small implementation of Jabber a few years ago, and I'm very happy about the boost it will receive.

Journey in regexp land

I have been rather quiet in the last 10 days, first due to an extended week-end and then because I hit a relatively hard problem at the level of the regexps used in libxml2. Basically it's all XML Schemas' fault: Kasimier nearly completed the support, except for the redefine feature, which allows a schema to subset the content model of a type exported by an imported schema. Kasimier will as usual handle the nasty part of making sense of the spec and I will give him the basic tools to make this work. This means I need to provide ways to check that a content model is a valid derivation of another, which in regexp terms can be summarized as: does regexp R accept all strings generated by regexp r?

And that is a rock, a hard one.

After reading quite a bit, it turns out that my existing automata + counters modelling of regexps, which we use for content model validation, is really not a good model for solving this (though it's good for validating instances). So back to the literature: various papers, most of them relatively recent, especially the paper from Michael Sperberg-McQueen at Extreme Markup earlier this month on using Brzozowski derivatives for the task. It looks fine, except it will explode when using large counter ranges. My selected approach is to do the derivation at the algebraic level instead of going step by step over all possible input strings, and to fall back to injecting token by token only when no progress can be made purely on tree constructs. After a small week of frenetic testing and refinement I now have something which seems to work relatively nicely. I just pushed it into libxml2, adding less than 8kB of code to the xmlregexp compiled size, and added a first set of regression tests and support in the testRegexp command line test:

paphio:~/XML -> ./testRegexp --expr '(a*, ((b, c, d){0,5}, e{0,1}){0,4}, f)' '(a{1,100}, b, (c, d, b){2,3}, c, d, e)'
Testing expr (a*, ((b, c, d){0,5}, e{0,1}){0,4}, f):
Subset parsed as: ((((a , b) , ((c , d) , b){2,3}) , c) , d) , e
Resulting derivation: (((b , c) , d){0,5} , e?){0,3} , f
Ops: 0 nodes, 55 cons
paphio:~/XML -> ./testRegexp --expr '(a|b),(a|c){0,100}' 'a{0,100},(a|c)'
Testing expr (a|b),(a|c){0,100}:
Subset parsed as: a{0,100} , (a | c)
Resulting nillable derivation: empty
Ops: 0 nodes, 11 cons
paphio:~/XML -> ./testRegexp --expr '(a|b){3,*}' '(a,b)+'
Testing expr (a|b){3,*}:
Subset parsed as: (a , b)+
Resulting derivation: (a | b)+
Ops: 0 nodes, 8 cons
paphio:~/XML -> ./testRegexp --expr '(a|b),(a|c){0,99}' 'a{0,100},(a|c)'
Testing expr (a|b),(a|c){0,99}:
Subset parsed as: a{0,100} , (a | c)
Resulting derivation: forbidden
Ops: 0 nodes, 9 cons

The key is to try to keep sub-linear performance. I really expect redefines to be used to restrict content models from unbounded sets to bounded and reordered ones, for example (a|b)* into (a,b){1,100000}, to avoid consumers of services being DoS'ed; if you explode when validating, this is just worse. This is a big problem, as pointed out recently in a large thread on xml-dev. Hence testRegexp logs the number of Cons, i.e. how many times an intermediate expression node was generated (one of Brzozowski's results is that this set is finite, but the goal is to keep it small :-).
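For readers who have not met derivatives before, here is a minimal Python sketch of the textbook, token-by-token Brzozowski approach to the inclusion question "does R accept everything r generates?". It is not the libxml2 code: that one is in C, works at the algebraic level and handles counters, none of which this sketch does. All names below are made up for the illustration.

```python
from dataclasses import dataclass

# Immutable expression nodes, so they can be hashed and compared.
@dataclass(frozen=True)
class Empty: pass            # matches nothing
@dataclass(frozen=True)
class Eps: pass              # matches only the empty string
@dataclass(frozen=True)
class Sym:
    name: str
@dataclass(frozen=True)
class Seq:
    left: object
    right: object
@dataclass(frozen=True)
class Alt:
    items: frozenset         # a set gives associativity/commutativity/idempotence
@dataclass(frozen=True)
class Star:
    inner: object

def seq(a, b):
    if isinstance(a, Empty) or isinstance(b, Empty):
        return Empty()
    if isinstance(a, Eps):
        return b
    if isinstance(b, Eps):
        return a
    return Seq(a, b)

def alt(a, b):
    items = set()
    for e in (a, b):
        if isinstance(e, Alt):
            items |= e.items
        elif not isinstance(e, Empty):
            items.add(e)
    if not items:
        return Empty()
    if len(items) == 1:
        return next(iter(items))
    return Alt(frozenset(items))

def star(a):
    if isinstance(a, (Empty, Eps)):
        return Eps()
    return a if isinstance(a, Star) else Star(a)

def nullable(e):
    """True if e accepts the empty string."""
    if isinstance(e, (Eps, Star)):
        return True
    if isinstance(e, Seq):
        return nullable(e.left) and nullable(e.right)
    if isinstance(e, Alt):
        return any(nullable(i) for i in e.items)
    return False  # Empty, Sym

def deriv(e, c):
    """What remains to be matched by e after reading token c."""
    if isinstance(e, Sym):
        return Eps() if e.name == c else Empty()
    if isinstance(e, Seq):
        d = seq(deriv(e.left, c), e.right)
        return alt(d, deriv(e.right, c)) if nullable(e.left) else d
    if isinstance(e, Alt):
        out = Empty()
        for i in e.items:
            out = alt(out, deriv(i, c))
        return out
    if isinstance(e, Star):
        return seq(deriv(e.inner, c), e)
    return Empty()  # Empty, Eps

def included(sub, sup, alphabet, limit=10000):
    """True if every string generated by sub is accepted by sup."""
    seen, todo = set(), [(sub, sup)]
    while todo:
        pair = todo.pop()
        if pair in seen:
            continue
        seen.add(pair)
        if len(seen) > limit:
            raise RuntimeError("too many intermediate expressions")
        r, R = pair
        if isinstance(r, Empty):
            continue  # sub cannot generate anything more on this path
        if nullable(r) and not nullable(R):
            return False  # sub accepts a string here that sup rejects
        for c in alphabet:
            todo.append((deriv(r, c), deriv(R, c)))
    return True

a, b = Sym("a"), Sym("b")
ab_plus = seq(seq(a, b), star(seq(a, b)))   # (a,b)+
any_ab = star(alt(a, b))                    # (a|b)*
```

With these definitions, `included(ab_plus, any_ab, "ab")` is true and `included(any_ab, ab_plus, "ab")` is false. The `seen` set plays the same role as the Cons count: it is the number of distinct intermediate expressions, which is exactly what blows up once large counter ranges are unrolled.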

Future work on this is to fix one potential problem left, apply it to Kasimier's code when it's there, extend it to allow the full set of operators needed by Relax-NG, and maybe rewrite the RNG validator on top of it. Not sure I will use it for validation in Schemas itself (apart from the Schemas compilation of course), as I prefer good old automata to mutating trees during the validation phase.

Test suite

Very impressed by yesterday's SVG test suite results from Uraeus. Looks excellent, congrats! Now can you automate the process of finding defects in the output? I started having a headache approximately two thirds of the way through the scanning process :-)

9 Aug 2005 (updated 9 Aug 2005 at 22:13 UTC) »

a new gamin release 0.1.15

One of the strange feelings of becoming an old fart is seeing the youngsters come over your code and turn your slow, patient process into a frenetic trance, though with less control. Basically that's what is happening to me on gamin, as John McCutchan is going over gamin's code for inotify. Hence yet another release, and don't ask for 0.1.4, it just disappeared due to last-minute CVS updates.

libxml2 and Schemas

On the other hand I am regaining control over libxml2 Schemas development: Kasimier Buchcik slowed down a bit, so I went back and fixed a number of core issues in the libxml2 automata and regexps code. The last 2 bug reports have been from data security companies using Schemas, interesting :-) . I haven't finished the schematron code yet; there are a couple of issues I need to fix first, and I would prefer to get a test suite based on the ISO draft standard syntax. I'm still looking for something like that...

sunset over the mountains

Sometimes I wonder why I am in Grenoble, away from most events, airports, and from beaches and coral reefs :-), but it usually takes a climb in the surrounding mountains to reassert my love for that area. Pictures of the sunset over Chartreuse on Sunday don't fully do justice to the amazing view (nor to the wind or the cold...)

Javascrapt mess

I tried to make a small Javascript slideshow to help scan my pictures. Okay, I now understand why "web programming" is such a pitiful disaster: global variables from the same block of code unreachable at invocation time, a perfectly broken kludge for timers losing all context. I'm sorry for the armies of web developers worldwide; some "designers" should be hung without much formal process...

DesktopConf and OLS

I'm a bit exhausted after nearly a week of conferences. Very good to (re)connect face to face with other people, and the set of talks has been quite good too (today, for example, has been completely focused on Xen and virtualization), but a week in a row is a bit too much. And we still have a Fedora BOF at 9pm!


I still managed to do some coding between the conferences: I fixed some of the libxml2 automata/regexps limitations that blocked Kasimier on XML Schemas. I also started implementing the Schematron validation draft ISO standard. It's both relatively simple and powerful; it's a good complement to XSD or Relax-NG, especially for integrity constraints at the document level, but it can also be a validation framework easier to use for people who are not XML gurus. Since it's mostly based on XPath, the code on top of existing functionality should be relatively small, which is good too.
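To give a rough idea of why a rule-plus-XPath validator stays small, here is a Python sketch of the Schematron principle: pick context nodes with one path, then assert that a second path selects something under each of them. This is not the libxml2 implementation and not real Schematron syntax. The rules, messages and sample document are invented, and it relies on the limited XPath subset of Python's `xml.etree.ElementTree` rather than full XPath.

```python
import xml.etree.ElementTree as ET

# (context path, test path, message): all hypothetical, for illustration.
RULES = [
    (".//item", "price[@currency]", "item needs a price with a currency"),
    (".//item", "name", "item needs a name"),
]

def check(root, rules):
    """Return the messages of all failed assertions, Schematron style."""
    failures = []
    for context, test, message in rules:
        for node in root.findall(context):
            # The assertion holds when the test path selects something.
            if not node.findall(test):
                failures.append(message)
    return failures

DOC = """<order>
  <item><name>tea</name><price currency="EUR">3</price></item>
  <item><name>jam</name><price>5</price></item>
</order>"""

failures = check(ET.fromstring(DOC), RULES)
```

Here the second item has a price without a currency, so `failures` holds that one message. Everything interesting is delegated to the path engine, which is why the validator itself is little more than two loops.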

Quotes of the day

Damien while honeymooning: "Jonita will probably kill me. I found the bug and a fix ..."

Rik van Riel: "Do you want a VM that is consistently slow, or one that is occasionally fast ? ;)"

Yeah it's Bastille Day and I'm easily amused today :-)


Following inotify being merged in Linus' tree, I had to make a new release of gamin quickly: they changed the kernel API for it again, oh and they promised they would not change it ... again too, but it's fine, I'm quite happy, thanks rlove and Co.! Gamin-0.1.2 includes the support for the new kernel API thanks to McCutchan. It's available as testing updates on Fedora Core 3 and 4, but a kernel with inotify may take a little while.

However it is clear that dnotify is now legacy, further work will be on the inotify back-end, well once I have a kernel for it :-)


The 2.6.20 release seems a good one, no loud complaints, good! The good point of an official holiday is that I can go chase libxml2 bugs without thinking I should do something else. Today was relatively productive: I applied a number of patches received after the 2.6.20 release (it's interesting to see how each new release tends to generate momentum, and you get newcomers and new fixes/patches even if unrelated to the release itself) and fixed some of the bugs which were popping up in the NIST XML Schemas regression tests:

## NIST test suite for Schemas version NIST2004-01-14
Ran 23170 tests (3953 schemata), no errors

Note for thomasv, "make tests" for libxml2 and libxslt have had a "make valgrind" target associated (running way slower) for more than a year ;-)

We still have a number of XSD tests failing, however, some in the Sun part of the W3C regression suite but most of them in the Microsoft part. The problem is to understand what is actually happening and who is right; unfortunately the spec is very, very hard to understand, and it is not always clear-cut.

DesktopCon and OLS

I'm leaving for Ottawa tomorrow but I will be arriving Sunday evening.

Tour de France ... from the 18th floor

Le Tour de France happened to start from Grenoble today, and passed just below my flat. A tad disappointing: there are way, way more cars than bicycles, crap advertising, throwing of cheap goodies to people waiting on the side, noise, ads, and for 30 seconds the actual sport event. People end up seeing way more ads for sausages, cars, watches, candies and banks than cycling in action... I guess it's the fate of all very popular sport events, it's way more about advertising and business than sport, same for the Olympics :-) . Anyway I took 3 pictures from the balcony, but if you're really a bicycle fan, watch it on TV!

Afterthought on Havoc's entry about Metacity hacking

Seems to me that Havoc's main point was that after the code is mature enough, fixing remaining bugs would just increase the risk of larger bugs being added in the process. I think one can control those risks by adding regression tests for most common uses of the software, and if I understand correctly people are working on GUI regression testing tools for Gnome, so to me the best way to revisit that problem is to start accumulating GUI behaviour regression tests for the desktop.

Legal Spam

So who else got that legal SPAM :

Greetings $project Maintainer:

We currently use the $project package and we are quite pleased with it,
but our legal council has recommended that we discontinue its use. ...

And then 2 Word documents with lists of files (generated files, test suites, etc.)
missing a Copyright label. Apparently I'm not the only one, so if you got this
don't worry, you are not alone. My answer was basically that such
issues should be made public if such legal changes were needed for an
Open Source project. I'm wondering how they are going to react :-)
