22 Oct 2000 jmason   » (Master)

Been a long time since I updated the diary. There are a few reasons:

  • been busy :( -- trying to get up a head of steam to fight software patents in Europe -- Ireland is backing the move, so I'm trying to get some ILUG members (myself included) to fight it. Problem is, I don't know where to start, myself -- letter-writing and political campaigning are not my strong points :(

  • Also, I don't think recentlog.html is scaling; it's too difficult to follow the diaries. Generally if I check my diary the morning after posting, it's already scrolled off. This makes it very tricky to be bothered posting, if there's a 90pc chance no-one's going to read it... after all, who actually goes to a /person page to read their diaries? 's the tragedy of the commons, innit. ;)

But notwithstanding the latter point, I'll throw a few opinions into the ether on what I've read in other diaries. And might as well do an update on WebMake and sitescooper...

---- WebMake

Released 0.7. It works quite well, generates sitemaps, breadcrumb trails, back/forward navigation links, and other nifty metadata things. Not sure what needs to be done next... I have a few non-urgent plans:

generate RDF sitemaps

As suggested in Dan Bricklin's paper (the URL is on the WebMake todo list). This could be cool, esp. if it can be reused to generate RSS "what's new" lists for My Netscape, Scripting News, oreilly.net, etc.

access to stat() data on links

Allow automatic generation of file size info, by making file size a metadatum on a content item -- this'd be handy for download pages.

come up with an intermediate XML format for EtText

caolan suggested this one, and it's a goodie. If EtText generates an XML format instead of plain XHTML, it may be a neat way of (a) allowing more flexible styling of the HTML, (b) allowing other output formats (WML, DocBook, etc.), (c) some neat XSL tricks.

"edit-in-browser" functionality

Throw in a CGI which can parse and edit WebMake files and EtText, and you've got good ol' "edit-in-browser" as seen on Advogato, editthispage.com, blogger, etc.

Mebbe I'll just let it get stable first though.
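
For the stat() idea above, here's a rough sketch of what "file size as a metadatum" could look like. It's in Python purely for illustration; WebMake itself doesn't work this way, and the names `file_size_metadata` and `human_size` are made up for the example.

```python
import os

def human_size(n):
    """Format a byte count for display on a download page."""
    for unit in ("bytes", "KB", "MB", "GB"):
        # Stop at the first unit that fits, or at GB as the largest unit.
        if n < 1024 or unit == "GB":
            return f"{n:.0f} {unit}" if unit == "bytes" else f"{n:.1f} {unit}"
        n /= 1024

def file_size_metadata(path):
    """Return size metadata for a file, taken from os.stat().

    Hypothetical helper: the dict keys are assumptions, not WebMake names.
    """
    st = os.stat(path)
    return {"size_bytes": st.st_size, "size_human": human_size(st.st_size)}
```

A download page template could then pull `size_human` for each linked file instead of having the size typed in by hand.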

---- Sitescooper

Not much here -- need to fix the NYT login problem (again). Lots of hassle with sites blocking us out of their "AvantGo versions"; AG are taking a strong line with the sites to block us out, it looks like. Nasty.

Mandrake caused a bit of a stink recently, with their announcement that Mandrake News and the Mandrake Forum would be made palm-readable with AvantGo, and not a mention of sitescooper or Plucker. So I've made a site file for MF, which AG still can't handle ;).

Michael Nordström from Plucker asked for the URL of their PDA-friendly version, but got no response. Hmm.

Maybe we should look into making a sitescooper-on-Mandrake RPM for their Cooker distro, and subvert from the inside ;)

---- Comments

lkcl --

i was going to have to send &lt; and friends because of the break-ups in the data flow: jabber has a wrapper around data called a <stream>. this is where things start to get scary.

It's a nasty problem -- you could try using CDATA sections, which act as opaque blocks of character data: XML tags inside them won't get parsed. Not sure how well libxml supports 'em though.
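
To make the CDATA suggestion concrete, here's a minimal sketch using Python's standard-library parser rather than libxml or anything Jabber-specific. The `cdata` helper and the `<message>` wrapper element are made up for the example. The one gotcha is that a CDATA section can't contain the sequence "]]>", so the helper splits it across two adjacent sections.

```python
import xml.etree.ElementTree as ET

def cdata(text):
    """Wrap text in a CDATA section so markup inside it is not parsed.

    "]]>" would end the section early, so it is split across two
    adjacent CDATA sections.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

# Payload containing characters that would otherwise need escaping.
payload = "<stream>raw & unescaped</stream>"
doc = "<message>" + cdata(payload) + "</message>"

# The parser hands the CDATA content back as plain character data.
assert ET.fromstring(doc).text == payload
```

This only covers text that is otherwise legal in XML; it's not a way to smuggle arbitrary binary data through a stream.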
