19 Aug 2003 jlouis   » (Journeyer)

Hah, work on MLton support on NetBSD progressed nicely this day.

It starts with I wake up and notice that my preleminary patches are being submitted into the main trunk. Further a couple of minor glitches disappears the next couple of hours, most of the regressions work and all shines.

It doesnt bootstrap completely yet though. More concentrated work is needed if I should kill that odd exception the Module Functor elaborater spits out 20 seconds into the compile. Hopefully I will get it attacked and fixed though.

The Perfect hash function has not been discontinued. In fact what I want is a static set rather than a perfect hash and I am continuing with this plan. Primary problem is that the measurements were done 13 years ago on a mere Sparcstation 20. I think we have seen a bit of changes since then. Cache behaviour comes to mind.

Further specialisation: We want it to work good in situations where the cardinality of the keyword set is small. We do not care for larger keywords. I checked CLRS (Introduction to algorihtms) on the subject, but they assume IntInf/libgmp-like numbers and that is going to cost them. Furthermore they do at least 2 cache line fills per lookup. This is rather good though, but the question is if the GMP thing costs too much.

GPerf cheats. A lookup table, set up properly, does the thing for it. The 13 year old tests show that it is actually slower than for instance a trie but uses more memory. The question here is if increased cache and more behaviour on this changes the game. More research is needed.

Judy is damn interesting. This is a trie, but it has a neat trick. It packs each trie node, since most of them can be sparsely represented. This results in very few cache fetches, minimal trashing and a very fast algorithm, since getting a miss today costs.... a lot.... With CPU's entering 3ghz a miss is so expensive that unpacking the stuff instead actually gives speed. Impressive. There are an article on the abovementioned link which is worth reading.

Work had me port scripts which hook into GNU Mailman. Perl programming in the large scale sucks, but I got to dance with the dbi-proxy in DBI at least. I have nailed the problem down to 2 more scripts, and then the Mailman backend can be seperated from the main server. Could be a nice modularity gain in the monolithic world.

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!