Older blog entries for DV (starting at number 169)

6 Jul 2004 (updated 6 Jul 2004 at 10:25 UTC) »

Releases

The last releases of libxml2 and libxslt were 6 weeks ago, so 2.6.11 and 1.1.8 are out; they include a lot of bug fixes, especially for libxml2. Also there is now a person working on fixing XML Schemas (Kasimier Buchcik) who apparently can make his way through the spec and my code; he started doing regression testing against the W3C testsuite. Interestingly he was so far a Windows-only user/developer, but as he got involved he's getting into Linux.

GCC improvements

With Arjan van de Ven we played with various options to reduce code size and the cost of calls made within the library itself. We managed to get a 5-10% improvement using gcc aliasing for the library's internal calls (see elfgcchack.h for the scary but XSLT-generated header). The RPM on Fedora also uses gcc runtime profiling to try to optimize the resulting code, but this requires gcc-3.4 or later.

But anyway the biggest improvement seems to come simply from using gcc-3.4, code generation seems way better:

   text    data     bss     dec     hex filename
 937377   32352   35124 1004853   f5535 /usr/lib/libxml2.so.2.6.11
 881072   31608   35828  948508   e791c /usr/lib/libxml2.so.2.6.11

The first one is on RHEL AS 3 with gcc version 3.2.3, the second is on Fedora Core 2 with gcc version 3.4.0. Nice work from the GCC hackers, probably the SSA improvements.

Update: I was told 3.4 doesn't have SSA, so expect even better code for 3.5 :-)
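
To put numbers on the comparison, here is a minimal sketch that computes the relative change per column of the `size` output quoted above. The figures are copied from that output; the function and variable names are only illustrative.

```python
# Figures copied from the `size` output above.
gcc_323 = {"text": 937377, "data": 32352, "bss": 35124, "dec": 1004853}
gcc_340 = {"text": 881072, "data": 31608, "bss": 35828, "dec": 948508}


def reduction_percent(before, after):
    """Return how much smaller `after` is than `before`, in percent.

    A negative result means the value grew.
    """
    return 100.0 * (before - after) / before


for column in ("text", "data", "bss", "dec"):
    change = reduction_percent(gcc_323[column], gcc_340[column])
    print(f"{column:>4}: {gcc_323[column]:>8} -> {gcc_340[column]:>8}  "
          f"({change:.1f}% reduction)")
```

The text segment shrinks by about 6% and the total by about 5.6%, while bss grows slightly.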

Back !!!

Whoohoo, advogato is back! The net result is 2 months without any blogging. I actually tried to install mod_virgule on veillard.com, but realized that the version in CVS depended on Apache v1. Here is a quick scan back over the last 60 days or so...

Jamaica

Had some vacations there after the stay in the US and tried the underwater housing for the Canon S50. I'm very happy with the results, though the reefs in Jamaica were not in very good shape. Very nice country overall !

Rpmfind

The nice guys from the W3C system staff sent me back the corpse of rpmfind.net. Apparently they tried to make it digest a Windows XP boot floppy, which may explain the system's premature death ;-) . I reshaped the system, removing the 13 and 17 Gigs drives, adding a couple of 120 Gigs ones, a new mobo, decent CPU, new RAM, and merging in some of the drives from fr.rpmfind.net . After a couple of erratic days chasing troubles due to bad memory, the system is back on-line at INRIA, and should hopefully provide stable services for a few more years. I'm knocking on wood !

gamin

I have been asked to provide a reduced but simpler and hopefully more secure (at least compatible with SELinux) replacement for the fam library. I released gamin-0.0.1, which should be API and ABI compatible with fam but does not work with a global system-wide daemon. It's per-user (or per-session) and seems good enough to get file alteration updates in Nautilus. hadess already pointed out a few errors which are fixed in CVS, and I hope to push 0.0.2 to Fedora Core soon. The code is in part based on the marmot code and is LGPL.
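
For readers unfamiliar with file alteration monitoring, here is a minimal sketch of the general idea: a process running as the user compares two snapshots of a directory and reports what was created, changed or deleted. This is not gamin's code or API; the function names and the polling approach are assumptions made purely for illustration.

```python
import os


def snapshot(directory):
    """Map each entry name in `directory` to its (mtime_ns, size)."""
    state = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            info = entry.stat(follow_symlinks=False)
            state[entry.name] = (info.st_mtime_ns, info.st_size)
    return state


def diff_snapshots(old, new):
    """Return a sorted list of (event, name) pairs describing what changed."""
    events = []
    for name in new.keys() - old.keys():
        events.append(("created", name))
    for name in old.keys() - new.keys():
        events.append(("deleted", name))
    for name in new.keys() & old.keys():
        if new[name] != old[name]:
            events.append(("changed", name))
    return sorted(events)
```

A client would call `snapshot()` periodically and hand the result of `diff_snapshots()` to whoever asked to watch the directory. A real monitor would rather rely on kernel notification where available.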

Sunsets

Very different ones, from the beach of Negril in Jamaica (my current background picture) and from the Dent de Crolles a mountain here, one week ago. Maybe this attraction for sunsets is just a proof of total lack of artistic sense...

Relax-NG compact Syntax

Since jamesh asked for it: well, I have most of it implemented already. It's in libxml2 CVS as the rngparser.c module, but I didn't finish the import and include parts and the overall gluing with the normal relaxng.c module. Probably a day of hacking is needed. I haven't touched it since February or so, and I didn't know there was some demand from within the GNOME camp.

In the US

I'm currently in MA at the Red Hat office in Westford, after a few days in Raleigh for the company meeting like all the other "I am Red Hat"'ers. Getting into the Desktop team here means a lot of thinking and meetings; most stuff will show up at some point on the Fedora desktop list. Wednesday there was an informal GNOME meeting at the Flat Top Johnny in Cambridge around beer and pool tables, it was fun and really good to see some of the Ximian (and Sun ;-) guys.

Tomorrow I will meet William Brack for the first time (he lives in Hong Kong), so I will be around Cambridge. It is likely we will end up at the Flat Top Johnny in the afternoon or evening to discuss around beers.

Tourism

The week-end was mostly spent doing tourism. Saturday was Boston and Cambridge with Mark, Alex and Caolan: we started downtown, went to the Aquarium, then walked through Boston, MIT and Cambridge up to Harvard Square. We went to the coast on Sunday, walked through Salem, then drove to Gloucester to see the sea. We bought 4 big lobsters there, which got cooked following the famous recipe "Homard a l'americaine". It was really good though extremely messy to eat, and cooking 4 lobsters in the mini-kitchen of a hotel room was a big mess too. Apparently Mark is a bit scared by my capacity to ingest food.

JBoss

Seeing what JBoss's founder has to say about Open Sourcing Java:
"I sat through and [asked myself], 'What's the goodness of open sourcing Java?' And there would be marginal gains that we could have such as speed.[...]"
it makes me really wonder about the perception that some Java users have of Open Source and Free Software. I really can't understand how they expect to build an OSS-based business while being so wrong about the core aspect which makes OSS actually work. Very disturbing ...

libxml2/libxslt

Made a new set of releases over the week-end: libxml2-2.6.9 and libxslt-1.1.6 are out. I pushed them to the build system as well as the xmlsec1-1.2.5 release from Aleksey Sanin, they should all show up on Rawhide soon. There are a number of bug fixes in libxml2, plus the xml:id implementation. The libxslt release includes only a couple of fixes for keys-related bugs.

the blog effect

I hadn't googled for libxml2 recently, and when trying yesterday I noticed a clear change from what I was used to: blog entries now make up a significant part of the references to the project. Started to play with Orkut a bit. Though the project is fading away (popularity-wise), this is a huge amount of metadata and it's clear Google will be able to use it, which is both promising and a bit scary. They can now build person profiles, associate them with the home page and infer a lot of valuable relations from those data...

mailman bounces

As usual William Brack delivers what he promises, so there is a Python version of the mailing-list admin script, I also extended the bounce config file over the last week.

Boston

I will be near Boston from the beginning of next week up to mid May visiting the other Red Hat folks there. I hope to be able to see other people there, notably the Ximians and William, who will fly from California to finally meet face to face :-) . BTW clarkbw, never underestimate the horrible traffic mess that is Boston, and the name "Turnpike" really means "big mess" as far as I can remember !

xml:id

First working draft is out. A first implementation hit libxml2 CVS tonight; it took less than an hour, including setting up basic regression tests. Not finished of course, we will see how long it takes to bring that one to REC, if it ever gets there.

I answered a few questions in the mailing-list about it too.
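
To illustrate what xml:id is for, here is a minimal sketch in Python's standard ElementTree (not libxml2) that indexes elements by their xml:id attribute and rejects duplicates. The helper name and the sample document are hypothetical, and the sketch skips details such as whitespace normalization and checking that the value is a valid name.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is always bound to this namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def index_by_xml_id(root):
    """Map each xml:id value to its element, rejecting duplicate IDs."""
    index = {}
    for element in root.iter():
        value = element.get(XML_ID)
        if value is None:
            continue
        if value in index:
            raise ValueError(f"duplicate xml:id: {value!r}")
        index[value] = element
    return index


doc = ET.fromstring(
    '<doc><chapter xml:id="intro"><para xml:id="p1">Hello</para></chapter></doc>'
)
ids = index_by_xml_id(doc)
print(ids["p1"].text)  # prints "Hello"
```

The point is that the lookup works without any DTD or schema declaring which attribute is an ID.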

The Germans get it

While searching on the web I found a CVS web repository at http://cvs.zeit.de/code/view/mod_xslt/cms_extensions.html?rev=HEAD . The Zeit is one of the largest German newspapers; it does a lot of electronic publishing of course and has a CMS system. Big business, industry leader, not computer related, *but* since they developed extensions for Apache and Zope for their CMS, they documented their architecture and created a public CVS base ! They understand that it is not in their business interest to keep it internal and that giving back changes is important.

It's also SAP, a leading German IT company, who gave back their (aging but solid) database, for which MySQL is apparently providing support now. It seems to me that German industry understood the concepts and perspectives of open source and free software earlier; reading their IT press reinforces that feeling.

mailman bounces handling

The perl script I hacked last week-end works fairly well; it reduced the work to handling the few bounces it doesn't catch and approving the valid but blocked posts. In the meantime William Brack rewrote the script in Python. I haven't checked it out yet, but I will certainly switch since I feel better maintaining Python code.

TheReg and XSLT publishing

The Register is among the news sites I look at from time to time, and I noticed at the beginning of this week that the format changed. I then later discovered that they are using libxslt to build the pages. I loved the following statement "We're proud that The Register uses valid XHTML and CSS on its pages" with the suggestion to report breakages. Even if TheReg specializes in treating the topics in a sensational way, their technical attitude should be noted too :-)

In general the [Database ->] XML -> XHTML processing through XSLT seems one of the best ways to generate well-formed and even valid XHTML, and as long as the transformation is either static or cached, the cost for doing so is reasonable. The full xmlsoft.org site is generated that way: the pages are produced with XSLT by the Makefile in the libxml2 (and libxslt) doc directory, and are also automatically validated against the XHTML1 DTDs (which are in the catalog if you installed the xhtml1 RPM). That way I'm sure that even if the doc content might not be ideal and is certainly outdated, its structure and presentation at least are guaranteed to be clean.
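
The reason a tree-based pipeline yields well-formed output can be shown with a small sketch. This is not XSLT and not the xmlsoft.org Makefile: it swaps in Python's standard ElementTree (which has no XSLT processor), builds the page as a tree, serializes it, and re-parses the result as a check. It only tests well-formedness, not validity against the XHTML1 DTDs, and the function names and sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def render_page(title, items):
    """Build a small XHTML page as a tree and return it as a string."""
    ET.register_namespace("", XHTML_NS)  # serialize XHTML as the default namespace
    html = ET.Element(f"{{{XHTML_NS}}}html")
    head = ET.SubElement(html, f"{{{XHTML_NS}}}head")
    ET.SubElement(head, f"{{{XHTML_NS}}}title").text = title
    body = ET.SubElement(html, f"{{{XHTML_NS}}}body")
    ET.SubElement(body, f"{{{XHTML_NS}}}h1").text = title
    listing = ET.SubElement(body, f"{{{XHTML_NS}}}ul")
    for item in items:
        # Text is escaped by the serializer, never pasted in as raw markup.
        ET.SubElement(listing, f"{{{XHTML_NS}}}li").text = item
    return ET.tostring(html, encoding="unicode")


def is_well_formed(markup):
    """Return True if `markup` parses as XML."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


page = render_page("Releases", ["libxml2 <2.6.11>", "libxslt & friends"])
print(is_well_formed(page))  # prints "True"
```

Because the page is generated once and written out, the cost is paid at build time rather than on every request.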

Formatting

sdodji pointed me at Prince, an XML formatter which uses CSS for the rules, generates PDF and Postscript, and uses libxml2. I tend to agree with him that if commercial implementations are being developed then there is a need for such a formatter, and Open Source community work like libcroco and sewfox may have a bright future. I just hope that the xmlroff project will continue too, I'm not sure Sun is still actively supporting the project. Formatting is hard but we have most of the infrastructure; maybe aiming at the full XSL is just too hard and a CSS-based tool is more likely to get finished, well I hope so.

Mailman lists handling

dsandras pointed me at mladmin.pl to help handle the bounces. To me it wasn't very useful because the only automatic processing was to discard everything or accept everything, plus it's written in perl ... markmc thought I was a bit too strongly biased against the language, and that the code had potential, but I really didn't want to learn perl ... so far.

But considering how painful my list handling has been, I really needed a tool, so I started looking at the perl tutorial online and hacking the thing. The results are:

  • perl definitely sucks as a language, thank you !
  • an augmented version of mladmin.pl which will read $HOME/.bounces and filter stuff by default.
  • an example configuration file
  • about a thousand pending bounces less on mail.gnome.org
  • a couple of crontab entries in key machines >:->
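
The kind of filtering described above can be sketched as follows. This is not mladmin.pl and the rule syntax is not the real $HOME/.bounces format (which isn't shown here); the "action regex" lines, the function names and the sample messages are all assumptions for illustration. Anything no rule matches is held for manual review.

```python
import re


def parse_rules(text):
    """Parse lines of the form '<action> <regex>'; '#' starts a comment line."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        action, pattern = line.split(None, 1)
        if action not in ("discard", "approve"):
            raise ValueError(f"unknown action: {action!r}")
        rules.append((action, re.compile(pattern, re.IGNORECASE)))
    return rules


def classify(rules, sender, subject):
    """Return the first matching action, or 'hold' for manual review."""
    summary = f"{sender} {subject}"
    for action, pattern in rules:
        if pattern.search(summary):
            return action
    return "hold"


# Hypothetical rule file contents, for illustration only.
EXAMPLE_RULES = r"""
# obvious bounces and spam are discarded
discard  mailer-daemon@
discard  \bviagra\b
# posts from known addresses are approved
approve  @gnome\.org
"""

rules = parse_rules(EXAMPLE_RULES)
print(classify(rules, "MAILER-DAEMON@example.com", "Undelivered mail"))  # discard
print(classify(rules, "someone@example.net", "libxml2 question"))        # hold
```

Rules are tried in order, so the more specific patterns should come first.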

Of course William was on it too :

<bill> DV: you cleared the xml-bindings admin page?
<DV> bill: better :-)
<DV> bill: I'm finishing up the perl-based :-( bounces cleaner
<bill> DV: I'm working on a Python-based version, and having a great "learning experience" concerning cookies....
<DV> bill: I afraid I beat you time wise :-)
<bill> DV: but all of a sudden my testing list (xml-bindings) had no more messages :-)

William pointed out too that today is 4/4/4, which is apparently a very bad date as 4 is a really bad number for the Chinese.

I still hope he will come with a maintainable Python equivalent :-)

Spring, pictures and travels

Spring is really starting here, so I took my bicycle yesterday and had a nice ride in the valley. I took a few pictures of a very nice plum tree blooming there. I really like the pictures taken by the S50. I'm buying an extra CF card and battery since I expect to go to Jamaica after the trip to the USA next month and I will really try to get good underwater pictures. Plus there are a lot of orchids growing there, I'm really looking forward to those vacations.

libxml2/libxslt

Made a new set of releases, 2.6.8/1.1.5, mostly bugfixes. The website xmlsoft.org is now hosted on a separate machine at INRIA here in Grenoble; the new processor makes searches far faster, and the dedicated bandwidth helps too. The accesses increased significantly lately due to the references from the PHP main page.

The languages dilemma

So Sun's CEO doesn't want to see Java Open Sourced. This is a major screwup IMHO ! This means the Open Source world will never be able to use Java as defined by Sun, but only a reimplementation (likely to be a limited and to some extent incompatible subset) or a different language like Mono/C# . If the goal is to avoid fragmentation it can only be a failure; if the goal is to maintain Java as a "Sun asset" then it might be a success, but its value will diminish. This is not the reaction of an enterprise confident in its technical expertise, it sounds more like keeping the family jewels in the safe in case of disaster.

W.r.t. Microsoft granting a Royalty Free licence to the ECMA 334/335 standards, all I saw so far is just a mail archive saying that Microsoft "will" do this, it doesn't sound that great.

One thing is sure to me: dealing with a dying Sun Microsystems over patents or Java branding might not be any easier than dealing with Microsoft about C# rights.

Red Hat

I'm moving to the Red Hat Desktop group internally. I'm quite happy to join that group along with some of the other GNOME hackers. This is exciting but also means I will have less time left for libxml2 and libxslt, so don't expect XPath2 or XSLT2 implementations in the near future.

18 Mar 2004 (updated 18 Mar 2004 at 12:43 UTC) »

Mono

Quick entry because Miguel wrote:

Microsoft has granted RAND+Royalty Free licenses to any patents they might own that are required to implement the ECMA 334/335 standards. So at least our core VM, classes and compilers are safe from any litigation from *Microsoft*.

Where ??? RAND is not sufficient, and though I have been looking at all the information posted on this topic so far, I have never seen a "written" statement from Microsoft about a Royalty Free Licence being granted (and to whom) explicitly. This information must come from a Microsoft spokesperson to have any legal value, right ?

ncm don't worry, a number of people still value lean, fast and robust libraries written in C. When there is a lot of reuse this makes sense, but for developing applications on top of them it's clear those new languages are more efficient from a programmer's perspective (though memory hungry).
