So there was one link that I didn't include in the Bayesian round-up in this week's NTK, but which might be interesting to folk here:
The Bow Toolkit is a "toolkit for statistical language modeling, text retrieval, classification and clustering". It came recommended by the OpenCola folk. I didn't have enough time (or expertise, really) to have a proper look at it, but it looks pretty nice.
"So who does contribute, and why? Members include famous Linux and BSD (Berkeley Software Distribution) hackers like Jordan Hubbard, Miguel de Icaza, Bruce Perens, Eric Raymond and Jamie Zawinski, although few of them are active on the site. Others purport to be God, Satan, Nietzsche, Richard Stallman and Need to Know's Danny O'Brien, but these claims have yet to be confirmed."
http://www.salon.com/tech/feature/2000/07/18/advogato/index1.html ... we all know it's the same guy
New HTML Parser: The long-awaited libxml2-based HTML parser code is live. It needs further work but already handles most markup better than the original parser.