Been a long time since I updated the diary. There's a few reasons:
But notwithstanding the latter point, I'll throw a few opinions into the ether on what I've read in other diaries. And might as well do an update on WebMake and sitescooper...
WebMake: released 0.7. It works quite well, generates sitemaps, breadcrumb trails, back/forward navigation links, and other nifty metadata things. Not sure what needs to be done next... I have a few non-urgent plans:
generate RDF sitemaps
as suggested in Dan Bricklin's paper, URL on the WebMake todo list. This could be cool, esp. if it can be reused to generate RSS "what's new" lists for My Netscape, Scripting News, oreilly.net, etc.
access to stat() data on links
Allow automatic generation of file size info, by making file size a metadatum on a content item -- this'd be handy for download pages.
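A rough sketch of the stat() idea, in Python rather than WebMake's own Perl. The names `human_size` and `file_metadata` and the metadata keys are made up for illustration, not anything WebMake provides:

```python
import os


def human_size(num_bytes):
    """Format a byte count as a short string such as '1.5 KB'."""
    size = float(num_bytes)
    for unit in ("bytes", "KB", "MB", "GB"):
        # Stop at the first unit that fits, or at GB as the largest unit.
        if size < 1024 or unit == "GB":
            if unit == "bytes":
                return f"{int(size)} bytes"
            return f"{size:.1f} {unit}"
        size /= 1024


def file_metadata(path):
    """Return stat()-derived metadata for one content item (hypothetical keys)."""
    st = os.stat(path)
    return {
        "size": st.st_size,                    # raw size in bytes
        "size_text": human_size(st.st_size),   # display form for a download page
        "mtime": st.st_mtime,                  # last-modified time, seconds since epoch
    }
```

A download page template could then print `size_text` next to each link without anyone typing the sizes in by hand.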
come up with an intermediate XML format for EtText
caolan suggested this one, and it's a goodie. If EtText generates an XML format instead of plain XHTML, it may be a neat way of (a) allowing more flexible styling of the HTML, (b) allowing other output formats (WML, DocBook, etc.), (c) some neat XSL tricks.
Throw in a CGI which can parse and edit WebMake files and EtText, and you've got good ol' "edit-in-browser" as seen on Advogato, editthispage.com, blogger, etc.
Mebbe I'll just let it get stable first though.
Sitescooper: not much here -- need to fix the NYT login problem (again). Lots of hassle with sites blocking us out of their "AvantGo versions"; AG are taking a strong line with the sites to block us out, it looks like. Nasty.
Mandrake caused a bit of a stink recently, with their announcement that Mandrake News and the Mandrake Forum would be made palm-readable with AvantGo, and not a mention of sitescooper or Plucker. So I've made a site file for MF, which AG still can't handle ;).
Michael Nordström from Plucker asked for the URL of their PDA-friendly version, but no response. hmm.
Maybe we should look into making a sitescooper-on-Mandrake RPM for their Cooker distro, and subvert from the inside ;)
i was going to have to send &lt; and friends because of the break-ups in the data flow: jabber has a wrapper around data called a <stream>. this is where things start to get scary.
It's a nasty problem -- you could try using CDATA sections, which act as opaque blocks of data: XML tags in there won't get parsed. Not sure how well libxml supports 'em though.
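To make the CDATA suggestion concrete, here is a small sketch using Python's standard `xml.dom.minidom`. It says nothing about libxml or Jabber itself; the `<message>` element and the helper names are assumptions for illustration. The one catch is that a CDATA section cannot contain the sequence `]]>`, so the sketch splits that across two adjacent sections:

```python
from xml.dom import minidom


def wrap_in_cdata(doc, text):
    """Return CDATA nodes that carry text verbatim.

    "]]>" would end a CDATA section early, so each occurrence is split:
    the "]]" closes one section and the ">" opens the next.
    """
    parts = text.split("]]>")
    nodes = []
    for i, part in enumerate(parts):
        chunk = part
        if i > 0:
            chunk = ">" + chunk
        if i < len(parts) - 1:
            chunk = chunk + "]]"
        nodes.append(doc.createCDATASection(chunk))
    return nodes


def build_message(payload):
    """Serialise payload inside a hypothetical <message> element."""
    doc = minidom.Document()
    msg = doc.createElement("message")
    doc.appendChild(msg)
    for node in wrap_in_cdata(doc, payload):
        msg.appendChild(node)
    return msg.toxml()


def read_message(xml_text):
    """Parse the element back and rejoin the character data."""
    msg = minidom.parseString(xml_text).documentElement
    return "".join(child.data for child in msg.childNodes)
```

The payload goes over the wire without any `&lt;` or `&amp;` escaping, and the receiving parser hands it back unchanged, markup-looking bits and all.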
Personal: got the QNX/RTP stuff loaded and working last night. I haven't done much with it yet, but I already know I like it better than anything I've gotten running on Linux. Photon makes X look like the buggy, bloated hack job that it is. I haven't made much use of PhAB yet (the GUI-builder for Photon), and reports indicate it is still unstable, but I'll probably play around with it a bit tonight and see what it's capable of.
I've always been a fan of OSes like VxWorks and QNX because they seem so much *cleaner* than other architectures.
I've been using QNX4 (the previous version before RTP) for the last year + 1/2. It's not much cleaner than Linux, it just has less functionality. And oh, the bugs, don't get me started ;)
Great paper from the O'Reilly OSS Convention in Monterey about Salon's CMS system. Looks cool, must nick some ideas ;)
Hey caolan, re: QNX -- don't believe the hype! It's nice, but not that nice... mark it up as a bit like Be.
Released Sitescooper 3.0.1 today, with quite a few bugs fixed and lots of new sites. It's nice to put that one to bed for a few days; maybe I can get back to WebMake for a while and fix a dependencies-with-perl-code problem.
BTW -- sitescooper users -- note that sitescooper.cx will be disappearing soon. It's sitescooper.org from now on. Those cheap sods in the .cx ccTLD registry folded their "free domains for open source projects" less than 6 months after it was first offered, so I'm f---ed if I'm going to pay them for a .cx after that.
Anyway, nothing I like better in the routine code maintenance dept than firing up the profiler, spotting a hotspot, spending 15 minutes refactoring it and getting a 10% speedup. Beauty!
In other news -- I joined FoRK and got a mail from James Casey, who (a) actually is a friend of Rohit Khare, like the list sez, and (b) I haven't seen in ages. He's apparently off in That London at the mo', but pints will be had next time we're in the same city I should hope.
Argh, netscape 4.75 crashed while editing the diary, probably due to some weirdness where AbiWord mucked up my fonts. Looking forward to an X11 where fonts just work :(
Anyway, released WebMake 0.5 last night.
It's pretty nice already for static, informational sites like homepages etc.; I rejigged the Irish Internet Users pages to use it in 5 minutes, which was handy, and it's a big improvement on what I had there previously.
However I need to add more support for sites where the index page is dynamically generated from a list of static story files. Here's how it works currently:
That's all well and good, but it's not tidy; the Perl code makes it too messy... I think steps 3 to 6 need tidying up, and possibly some kind of no-perl-required way to do it.
Joined FoRK, so now I'm thoroughly snowed ;)
WebMake now has a significant chunk of CMS magic included, in that it can handle metadata and use this to order and query content chunks, in order to generate indices and sitemaps. And better, the dependency checking works with it, so unchanged files do not even need to be read to get their metadata; it's cached in a per-site db file.
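The caching idea can be sketched roughly like this. It is not WebMake's actual code or db format: the `MetadataCache` class, the JSON file, and the "key: value header lines" layout are all assumptions picked to keep the example short. A file is only opened when its modification time differs from the cached one:

```python
import json
import os


class MetadataCache:
    """Cache per-file metadata; re-read a file only when its mtime changes."""

    def __init__(self, db_path):
        self.db_path = db_path
        self.reads = 0  # how many files were actually opened and parsed
        try:
            with open(db_path, encoding="utf-8") as fh:
                self.entries = json.load(fh)
        except FileNotFoundError:
            self.entries = {}

    def get(self, path):
        # stat() is cheap; the file contents are untouched on a cache hit.
        mtime = os.stat(path).st_mtime_ns
        entry = self.entries.get(path)
        if entry and entry["mtime"] == mtime:
            return entry["meta"]
        meta = self._parse(path)
        self.entries[path] = {"mtime": mtime, "meta": meta}
        return meta

    def _parse(self, path):
        """Read 'key: value' header lines up to the first blank line."""
        self.reads += 1
        meta = {}
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                if not line.strip():
                    break
                key, _, value = line.partition(":")
                meta[key.strip().lower()] = value.strip()
        return meta

    def save(self):
        """Write the cache back to the per-site db file."""
        with open(self.db_path, "w", encoding="utf-8") as fh:
            json.dump(self.entries, fh)
```

On a second run against an unchanged site, `reads` stays at zero: every lookup is answered from the db file after a single stat() per story.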
BTW the big win of WebMake's dependency support is that it means that WebMake is a CMS which works with web caches nicely. Wes Felter's HtP site brought this point up on the radar last month with a pointer to Resin's caching system.
Anyway, 0.4, just released, does this nicely, and even has some doco ;)
It's getting to the stage where it's satisfied the functionality I needed it to have, so I'll probably be slowing down soon and letting it accumulate some bugfixes and get stable.
One thing first, though: the CVS code now can generate a sitemap using only 3 types of data:
This is a beaut. It means that an RSS site summary file, or even a Slashdot-style "front page", can be generated entirely using a <sitemap> tag. Well, nearly -- I still need to write support for the visibility time range metadata types...
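For a feel of what "an RSS file straight from metadata" amounts to, here is a simplified Python sketch. It is not the `<sitemap>` tag's implementation, and the fields used (`title`, `url`, `date`) are illustrative assumptions, not the data types WebMake's sitemap actually uses. The output is RSS-shaped but skips elements a strict RSS reader would expect, such as `description`:

```python
import xml.etree.ElementTree as ET


def build_rss(site_title, site_url, items, limit=10):
    """Build a minimal RSS-style summary from content-item metadata.

    Each item is a dict with hypothetical keys: 'title', 'url', 'date'
    (an ISO date string, so plain string sorting gives date order).
    """
    rss = ET.Element("rss", version="0.91")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_url

    # Newest stories first, trimmed to a "what's new" sized list.
    newest_first = sorted(items, key=lambda it: it["date"], reverse=True)
    for it in newest_first[:limit]:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = it["title"]
        ET.SubElement(item, "link").text = it["url"]
    return ET.tostring(rss, encoding="unicode")
```

The point is that nothing here is story-specific code: swap the sort key or the template and the same metadata gives a front page or a sitemap instead.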
Other thing on the TODO list: allow WebMake to get content from an external command, and write up a doco on how WebMake can be used from within mod_perl to act as a conventional, dynamic-server-pages style system.
Hmm.... wonder what the wiki tag does? BTW still need a project tag ;)
Disappointingly though, I seem to be the only user at the moment. Go on, take a look, it really is pretty neat... ;)
Can't work out why no-one's checked it out, though. Is there some definition somewhere stating that CMSes must be built using dynamic, server-page technologies?
Believe it or not, that is executable -- well, it passes through a preprocessor which generates a normal perl script, but the code is extracted from those pages. So there you are -- your code is (very) heavily documented, and due to the way the function references are hyperlinks, it inherently looks like a LXR.
Incredible! What would you call this? Hypertextual literate programming?
Been on holidays since last Wednesday, so quite a lot of stuff to catch up on. However, MiniNTK sported a link to this beauty:
The hackers are members of a cult based in Finland called The Free Source that, among other things, practices communal ownership of software. Its members release their software under something called the Glorious People's License (or GPL) which basically states that no one can own the software or put restrictions on copying it.
"The Free Source has been recruiting on line for years now," says Ted Phillips, an expert on modern cults, "Their membership probably numbers in the thousands, although it is difficult to tell. They often work by enticing teens and young adults with the promise of free software and beer, before they start encouraging them to read parable-laced screeds that further indoctrinate them into the cult. They have been relatively harmless in the past, but now that they seem to be trying to destroy parents' abilities to protect their children it is clear that they are a danger to our society."
Is this so? If so, where's the beer?! Nobody promised me any beer...