Older blog entries for jmason (starting at number 49)

Great paper from the O'Reilly OSS Convention in Monterey about Salon's CMS. Looks cool, must nick some ideas ;)

Hey caolan, re: QNX -- don't believe the hype! It's nice, but not that nice... mark it up as a bit like Be.

Released Sitescooper 3.0.1 today, with quite a few bugs fixed and lots of new sites. It's nice to put that one to bed for a few days; maybe I can get back to WebMake for a while and fix a dependencies-with-perl-code problem.

BTW -- sitescooper users -- note that sitescooper.cx will be disappearing soon. It's sitescooper.org from now on. Those cheap sods in the .cx ccTLD registry folded their "free domains for open source projects" less than 6 months after it was first offered, so I'm f---ed if I'm going to pay them for a .cx after that.

Anyway, nothing I like better in the routine code maintenance dept than firing up the profiler, spotting a hotspot, spending 15 minutes refactoring it and getting a 10% speedup. Beauty!

In other news -- I joined FoRK and got a mail from James Casey, who (a) actually is a friend of Rohit Khare, like the list sez, and (b) I haven't seen in ages. He's apparently off in That London at the mo', but pints will be had next time we're in the same city I should hope.

Argh, netscape 4.75 crashed while editing the diary, probably due to some weirdness where AbiWord mucked up my fonts. Looking forward to an X11 where fonts just work :(

Anyway, released WebMake 0.5 last night.

It's pretty nice already for static, informational sites like homepages etc.; I rejigged the Irish Internet Users pages to use it in 5 minutes, which was handy, and it's a big improvement on what I had there previously.

However I need to add more support for sites where the index page is dynamically generated from a list of static story files. Here's how it works currently:

  1. WebMake file indicates location of one or more story archives, containing 1 story per file

  2. each file can also include meta tags to indicate metadata, like its title, one-line abstract, priority (aka score), section, etc.

  3. some perl code gets the names of all the story content items

  4. perl code then sorts them by section, score and title

  5. foreach item, set title, url, abstract, section, score variables, and fill out a user-specified template with them

  6. set a content item to contain that list

  7. list is written to whatever <out> files it's used in.

That's all well and good, but it's not tidy; the Perl code makes it too messy... I think steps 3 to 6 need tidying up, and possibly some kind of no-perl-required way to do it.
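The steps above can be sketched roughly like this (a Python illustration, not WebMake's actual Perl; the in-memory "archive", the meta-tag format and the file names are all made up for the example):

```python
import re

# Hypothetical story archive: one story per "file" (held in memory here).
# Per step 2, each story carries <meta> tags for its metadata.
STORIES = {
    "webmake.txt": '<meta name="title" value="WebMake 0.5 out">'
                   '<meta name="section" value="news">'
                   '<meta name="score" value="80">'
                   '<meta name="abstract" value="New release.">',
    "scooper.txt": '<meta name="title" value="Sitescooper 3.0.1">'
                   '<meta name="section" value="news">'
                   '<meta name="score" value="90">'
                   '<meta name="abstract" value="Bugfix release.">',
}

META_RE = re.compile(r'<meta name="(\w+)" value="([^"]*)"\s*/?>')

def read_metadata(text):
    """Steps 2-3: pull the meta tags out of one story file."""
    return dict(META_RE.findall(text))

def build_index(stories, template):
    """Steps 3-6: collect the story items, sort them, fill the template."""
    items = []
    for name, text in stories.items():
        meta = read_metadata(text)
        meta["url"] = name.replace(".txt", ".html")
        items.append(meta)
    # Step 4: sort by section, then score (highest first), then title.
    items.sort(key=lambda m: (m["section"], -int(m["score"]), m["title"]))
    # Step 5: fill out the user-specified template for each item.
    return "".join(template.format(**m) for m in items)

template = '<li><a href="{url}">{title}</a> -- {abstract}</li>\n'
print(build_index(STORIES, template))
```

Steps 6 and 7 would then hand that string to the content/output machinery; it's exactly that glue code that wants the no-perl-required treatment.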

Joined FoRK, so now I'm thoroughly snowed ;)

WebMake now has a significant chunk of CMS magic included, in that it can handle metadata and use this to order and query content chunks, in order to generate indices and sitemaps. And better, the dependency checking works with it, so unchanged files do not even need to be read to get their metadata, it's cached in a per-site db file.
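Roughly how that metadata caching might look (a Python sketch under my own assumptions, not WebMake's real code; a plain dict stands in for the per-site db file, and the parse function is a stub):

```python
import os
import tempfile

def get_metadata_cached(path, cache, parse):
    """Return a file's metadata, rereading the file only if it changed.

    `cache` maps path -> (mtime, metadata). In WebMake proper this would
    live in the per-site db file; here it's just an in-memory dict.
    """
    mtime = os.stat(path).st_mtime
    hit = cache.get(path)
    if hit is not None and hit[0] == mtime:
        return hit[1]                  # unchanged: skip reading the file
    with open(path) as f:
        meta = parse(f.read())
    cache[path] = (mtime, meta)
    return meta

calls = []                             # count how often we really parse
def parse(text):
    calls.append(1)
    return {"words": len(text.split())}

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("hello metadata world")
    path = f.name

cache = {}
m1 = get_metadata_cached(path, cache, parse)
m2 = get_metadata_cached(path, cache, parse)   # served from the cache
os.unlink(path)
```

The second call never touches the file, which is the whole point: index and sitemap generation can query hundreds of content chunks without rereading any of them.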

BTW the big win of WebMake's dependency support is that it means that WebMake is a CMS which works with web caches nicely. Wes Felter's HtP site brought this point up on the radar last month with a pointer to Resin's caching system.

Anyway, 0.4, just released, does this nicely, and even has some doco ;)

It's getting to the stage where it has the functionality I needed it to have, so I'll probably be slowing down soon and letting it accumulate some bugfixes and get stable.

One thing first, though: the CVS code can now generate a sitemap using only 3 types of data:

  • an "up" metadatum, pointing to the content item that is "up" from the current node

  • a "root" attribute on a content item, indicating that it's the root of the content tree

  • a pair of content templates which will be filled out with the details of each node, to generate the list

This is a beaut. It means that an RSS site summary file, or even a Slashdot-style "front page", can be generated entirely using a <sitemap> tag. Well, nearly -- I still need to write support for the visibility time range metadata types...
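To give a flavour of the scheme, here's a toy version of it (Python, and the node table, template strings and render function are my own invention, not what the <sitemap> tag actually does internally):

```python
# Each node knows only its title and its "up" pointer; one node is
# flagged as the root.  That's the 3 types of data from the list above.
NODES = {
    "index":   {"title": "Home", "root": True, "up": None},
    "docs":    {"title": "Docs", "up": "index"},
    "install": {"title": "Installing", "up": "docs"},
    "news":    {"title": "News", "up": "index"},
}

NODE_TMPL = '<li><a href="{name}.html">{title}</a>'   # per-node template
END_TMPL = '</li>'                                    # its closing pair

def render(name, nodes, depth=0):
    """Walk the tree implied by the "up" pointers, filling the templates."""
    children = sorted(n for n, m in nodes.items() if m.get("up") == name)
    out = "  " * depth + NODE_TMPL.format(name=name, **nodes[name])
    if children:
        out += "\n" + "  " * depth + "<ul>\n"
        out += "\n".join(render(c, nodes, depth + 1) for c in children)
        out += "\n" + "  " * depth + "</ul>\n" + "  " * depth
    return out + END_TMPL

root = next(n for n, m in NODES.items() if m.get("root"))
print(render(root, NODES))
```

Invert the "up" pointers, recurse from the root, fill out the two templates per node: that's the entire trick, and it's why one tag can generate a whole front page.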

Other thing on the TODO list: allow WebMake to get content from an external command, and write up a doco on how WebMake can be used from within mod_perl to act as a conventional, dynamic-server-pages style system.

Hmm.... wonder what the wiki tag does? BTW still need a project tag ;)

WebMake now boasts dependency checking, so it won't remake a page that does not contain a chunk of content that has changed. It also now shares a link glossary throughout the entire site (if you use the builtin EtText editable-text format), and does a great job of beautifying the output HTML.

Disappointingly though, I seem to be the only user at the moment. Go on, take a look, it really is pretty neat... ;)

Can't work out why no-one's checked it out, though. Is there some definition somewhere stating that CMSes must be built using dynamic, server-page technologies?

Holy shit -- I've just found HyperPerl on the c2.com Wikibase Wiki. What an insane concept... for an example check out Wiki in HyperPerl.

Believe it or not, that is executable -- well, it passes through a preprocessor which generates a normal perl script, but the code is extracted from those pages. So there you are -- your code is (very) heavily documented, and since the function references are hyperlinks, it inherently looks like LXR output.

Incredible! What would you call this? Hypertextual literate programming?

Been on holidays since last Wednesday, so quite a lot of stuff to catch up on. However, MiniNTK sported a link to this beauty:

The hackers are members of a cult based in Finland called The Free Source that, among other things, practices communal ownership of software. Its members release their software under something called the Glorious People's License (or GPL) which basically states that no one can own the software or put restrictions on copying it.

"The Free Source has been recruiting on line for years now," says Ted Phillips, an expert on modern cults, "Their membership probably numbers in the thousands, although it is difficult to tell. They often work by enticing teens and young adults with the promise of free software and beer, before they start encouraging them to read parable-laced screeds that further indoctrinate them into the cult. They have been relatively harmless in the past, but now that they seem to be trying to destroy parents' abilities to protect their children it is clear that they are a danger to our society."

Is this so? If so, where's the beer?! Nobody promised me any beer...

WebMake 0.1 is released. If you fancy taking a look, it can be found at http://webmake.taint.org/.

I think it's pretty neat; a kind of preprocessing language mixed in with CMS ideas. Certainly makes my own web space easier to manage...

I've been quiet recently. Here's why -- my new project, WebMake. The blurb:

WebMake is a simple web site management system, allowing an entire site to be created from a set of text and markup files and one WebMake file.

It requires no dynamic scripting capabilities on the server; WebMake sites can be deployed to a plain old FTP site without any problems.

It allows the separation of responsibilities between the content editors, the HTML page designers, and the site architect; only the site architect needs to edit the WebMake file itself, or know perl or WebMake code.

A multi-level website can be generated entirely from 1 or more WebMake files containing content, links to content files, perl code (if needed), and output instructions. Since the file-to-page mapping no longer applies, and since elements of pages can be loaded from different files, this means that standard file access permissions can be used to restrict editing by role.

Text can be edited as standard HTML, converted from plain text (using the included Text::EtText module), or converted from any other format by adding a conversion method to the WebMake::FormatConvert module.

Since URLs can be referred to symbolically, pages can be moved around and URLs changed by changing just one line. All references to that URL will then change automatically.

Content items and output URLs can be generated, altered, or read in dynamically using perl code. Perl code can even be used to generate other perl code to generate content/output URLs/etc., recursively.
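The symbolic-URL idea is simple enough to sketch in a few lines (a Python illustration; the $(url:name) syntax and the URL table here are made up for the example, not WebMake's actual notation):

```python
import re

# Hypothetical symbolic-URL table: edit one line here and every page
# that refers to $(url:docs) picks up the change on the next rebuild.
URLS = {
    "docs": "/webmake/documentation/",
    "download": "/webmake/dist/webmake-0.1.tar.gz",
}

def expand_urls(text, urls):
    """Replace each $(url:name) reference with the real URL."""
    return re.sub(r'\$\(url:(\w+)\)', lambda m: urls[m.group(1)], text)

page = 'See the <a href="$(url:docs)">docs</a>.'
print(expand_urls(page, URLS))
# -> See the <a href="/webmake/documentation/">docs</a>.
```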

I'm just polishing it up, registering it on sourceforge, then I'll release an alpha and see what the reaction is. I've been using it for my own sites and it certainly revolutionises the crufty bag of SSIs and hack scripts I was using ;)

Want a clear, concise overview of the state of the art in internet groupware, discussion systems, future plans for same, why XML is good for the web, what Microsoft is on about with .NET, some snippets of Tim B-L's "Semantic Web", etc. etc. etc.? Then print out and read Jon Udell's report for Software Carpentry on the subject. It's very good.

In it he foresees an XML- and internet-based infrastructure for connecting services, "analogous to the UNIX pipeline".

I've been kinda doing this myself by repurposing other people's websites using scripts which talk HTTP, and pretend to be "normal people" browsing -- viz. sitescooper and send-sms-message. But it would be a lot nicer if those sites allowed us to use a clean, well-defined, open interface instead.

The only problem I can see is: how is it worth their while? i.e. they cannot display ads in the (machine-readable) XML returned. Micropayments again??

The other question is this -- what's wrong with the UNIX pipeline?? In other words, why isn't there a set of command-line XML manipulation tools for UNIX? As Dan Lyke said:

Wouldn't it be cool to be able to do gzip -dp phonelist.gnumeric | xmlsearch "select phonenumber, longdistanceprice from phonelist.person.work" | xmlsort "person.longdistanceprice"?

Someone else here talked about this concept too, a week or 2 back. Go for it mate, they'll be dead handy tools right now, and everybody'll be thanking you in a year's time...
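For the record, toy versions of those hypothetical xmlsearch/xmlsort filters aren't much work at all -- here's a sketch in Python (tool names from Dan Lyke's example; the document and field names are invented):

```python
import xml.etree.ElementTree as ET

# A small stand-in for phonelist.gnumeric's data, so the two filters
# below have something to chew on.
DOC = """<phonelist>
  <person><work><phonenumber>555-0101</phonenumber>
          <longdistanceprice>0.12</longdistanceprice></work></person>
  <person><work><phonenumber>555-0102</phonenumber>
          <longdistanceprice>0.07</longdistanceprice></work></person>
</phonelist>"""

def xmlsearch(xml_text, path, fields):
    """Select the named child fields from every element matching path."""
    root = ET.fromstring(xml_text)
    return [{f: el.findtext(f) for f in fields} for el in root.iterfind(path)]

def xmlsort(records, key):
    """Sort the selected records on one field (numerically here)."""
    return sorted(records, key=lambda r: float(r[key]))

rows = xmlsearch(DOC, "person/work", ["phonenumber", "longdistanceprice"])
for r in xmlsort(rows, "longdistanceprice"):
    print(r["phonenumber"], r["longdistanceprice"])
```

Wrap each function in a script that reads stdin and writes stdout and you'd have the pipeline Dan describes -- the hard part isn't the tools, it's agreeing on the query syntax.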

Too busy -- and trying to release sitescooper 3.0.0 as well.

tetron -- the IFRAME tag IIRC does what you're saying there (inclusion of HTML from another site). The advertising community have been hacking away at that for a while, wouldn't you just know it.

Got a nice plug at TBTF!

That will be all for today's diary I think.
