Older blog entries for Stevey (starting at number 633)

node.js is kicking me

Today I started hacking on a re-implementation of my BlogSpam service - which tests whether incoming comments are SPAM or HAM - in node.js (blogspam.js).

The current API uses XML::RPC and a Perl server, along with a list of plugins, to do the work.

Having had some fun and success with the HTTP+JSON mstore toy I figured I'd have a stab at making BlogSpam more modern:

  • Receive a JSON body via HTTP-POST.
  • Deserialize it.
  • Run the body through a series of JavaScript plugins.
  • Return the result back to the caller via HTTP status-code + text.
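Those four steps can be sketched roughly as follows. This is a minimal illustration, not the code in blogspam.js: the classify() stub stands in for the plugin chain, and the choice of 403 for spam, 400 for bad JSON, and port 8888 are all assumptions made for the example.

```javascript
var http = require("http");

// Hypothetical stand-in for the plugin chain: calls back with
// { spam: true|false, reason: "..." }. Here it accepts everything.
function classify(submission, callback) {
    callback(null, { spam: false, reason: "" });
}

// Map a raw request body to an HTTP status code and response text.
function handleBody(text, callback) {
    var submission;
    try {
        submission = JSON.parse(text);
    } catch (e) {
        return callback(400, "Bad JSON: " + e.message);
    }
    classify(submission, function (err, result) {
        if (err) {
            return callback(500, "Plugin failure");
        }
        if (result.spam) {
            return callback(403, "SPAM: " + result.reason);
        }
        callback(200, "OK");
    });
}

var server = http.createServer(function (req, res) {
    if (req.method !== "POST") {
        res.writeHead(405, { "Content-Type": "text/plain" });
        return res.end("POST only");
    }
    // Collect the whole body before trying to deserialize it.
    var chunks = [];
    req.on("data", function (chunk) { chunks.push(chunk); });
    req.on("end", function () {
        handleBody(Buffer.concat(chunks).toString("utf8"), function (status, text) {
            res.writeHead(status, { "Content-Type": "text/plain" });
            res.end(text);
        });
    });
});

// server.listen(8888);   // arbitrary port; uncomment to actually serve
```

Keeping handleBody() separate from the socket handling means the status-code logic can be exercised without starting a server.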

In theory this is easy: I've hacked up a couple of plugins, and a Perl client to make a submission. But sadly the async-stuff is causing me .. pain.

This is my current status:

shelob ~/git/blogspam.js $ node blogspam.js
Loaded plugin: ./plugins/10-example.js
Loaded plugin: ./plugins/20-ip.js
Loaded plugin: ./plugins/80-sfs.js
Loaded plugin: ./plugins/99-last.js
Received submission: {"body":"

This is my body ..

","ip":"","name":"Steve Kemp"}
plugin 10-example.js said next :next
plugin 20-ip.js said next :next
plugin 99-last.js said spam SPAM: Listed in StopForumSpam.com

So we've loaded plugins, and each has been called. But the end result was "SPAM: Listed .." and yet the caller didn't get that result. Instead the caller got this:

shelob ~/git/blogspam.js $ ./client.pl
200 OK 99-last.js

The specific issue is that I iterate over every loaded plugin, and wait for them to complete. Because they complete asynchronously the plugin which should be last, and just returns "OK", has executed before the 80-sfs.js plugin (which makes an outgoing HTTP request).

I've looked at async, I've looked at promises, but right now I can't get anything working.
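The shape I think I need is something like the sketch below: start one plugin, and only start the next once the previous one has called back. Everything here is hypothetical (the check(submission, callback) plugin signature, the "next"/"ok"/"spam" verdicts, and the stub plugins); it is not the code currently in blogspam.js.

```javascript
// Hypothetical plugin shape: each plugin has a name and a
// check(submission, callback) function, and calls back with
// (error, verdict, reason) where verdict is "next", "ok" or "spam".
function runPlugins(plugins, submission, done) {
    var index = 0;

    function step() {
        // Every plugin said "next": nothing objected.
        if (index >= plugins.length) {
            return done(null, { verdict: "ok", plugin: null, reason: "" });
        }

        var plugin = plugins[index++];

        plugin.check(submission, function (err, verdict, reason) {
            if (err) {
                return done(err);
            }
            if (verdict === "next") {
                // Only now is the following plugin started.
                return step();
            }
            // "ok" or "spam" is final: stop the chain here.
            done(null, { verdict: verdict, plugin: plugin.name, reason: reason || "" });
        });
    }

    step();
}

// Stand-ins for the real plugins; the timer imitates the slow
// outgoing HTTP request made by 80-sfs.js.
var plugins = [
    { name: "10-example.js", check: function (s, cb) { cb(null, "next"); } },
    { name: "80-sfs.js", check: function (s, cb) {
        setTimeout(function () {
            cb(null, "spam", "Listed in StopForumSpam.com");
        }, 50);
    } },
    { name: "99-last.js", check: function (s, cb) { cb(null, "ok"); } }
];

runPlugins(plugins, { body: "This is my body ..", name: "Steve Kemp" }, function (err, result) {
    if (err) { throw err; }
    console.log(result.verdict.toUpperCase() + ": " + result.reason + " (" + result.plugin + ")");
});
```

With this arrangement 99-last.js cannot answer before 80-sfs.js, because it is never started until the slow plugin has reported back.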


Surprise me with a pull request ;)

Syndicated 2013-09-10 17:57:12 from Steve Kemp's Blog

Dynamically discovering settings for a cluster?

Pretend I run a cluster, for hosting a site. Pretend that I have three to six web nodes, and each one needs to know which database host to contact.

How do I control that?

Right now I have a /etc/settings.conf file, more or less, deployed by Slaughter. That works. Another common pattern is to use a hostname - for example pmaster.example.org.

However, failover isn't considered here. If I wanted to update the configuration to point to a secondary database I'd need to either:

  • Add code to retry the second host on failure.
    • Worry about divergence if some hosts used DB1, then DB2, then DB1 came back online.
    • Failover is easy. Fail-back is probably best avoided.
  • Worry about DNS caches and TTL.

In short I'm imagining there are several situations where you want to abstract away the configuration in a cluster-wide manner. (A real solution is obviously floating per-service IPs, via HAProxy, Keepalived, ucarp, etc. People do that quite often for databases specifically, but not for redis-servers, etc.)

So I'm pondering what is essentially a multi-cast accessible key-value storage system.

Have a daemon on the VLAN which will respond to multicast questions like "get db", or "get cache", with a hostname/IP/result.

Suddenly your code would read:

  • Send mcast question ("which db?").
  • Get mcast reply ("db1").
  • Connect to db1.
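A rough sketch of both halves, using node's dgram module. The group address, the port, the "get <key>" wire format, and the settings table are all invented for the example; no such tool exists yet, and this ignores authentication, multiple responders, and caching entirely.

```javascript
var dgram = require("dgram");

// All of these values are assumptions made for the sketch.
var GROUP = "239.255.0.1";   // an administratively-scoped multicast group
var PORT = 4321;
var settings = { db: "db1", cache: "cache1" };

// Turn a question such as "get db" into a reply, or null if unknown.
function answer(question, table) {
    var match = /^get\s+(\S+)$/.exec(String(question).trim());
    if (!match || !Object.prototype.hasOwnProperty.call(table, match[1])) {
        return null;
    }
    return table[match[1]];
}

// The daemon: join the group and reply (unicast) to whoever asked.
function startDaemon(table) {
    var socket = dgram.createSocket({ type: "udp4", reuseAddr: true });
    socket.on("message", function (msg, rinfo) {
        var reply = answer(msg.toString(), table);
        if (reply !== null) {
            socket.send(reply, rinfo.port, rinfo.address);
        }
    });
    socket.bind(PORT, function () {
        socket.addMembership(GROUP);
    });
    return socket;
}

// The client: ask the group, take the first reply, give up after a second.
function ask(key, callback) {
    var socket = dgram.createSocket("udp4");
    var finished = false;

    function finish(err, value) {
        if (finished) { return; }
        finished = true;
        clearTimeout(timer);
        socket.close();
        callback(err, value);
    }

    var timer = setTimeout(function () {
        finish(new Error("no reply for " + key));
    }, 1000);

    socket.on("error", finish);
    socket.on("message", function (msg) { finish(null, msg.toString()); });
    socket.send("get " + key, PORT, GROUP);
}

// On a host inside the VLAN:
//   startDaemon(settings);
//   ask("db", function (err, host) { /* connect to host */ });
```

The timeout matters: if no daemon answers, the client still has to fall back to something, which is exactly the "trading one set of problems for another" worry.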

To me that seems like it should be genuinely useful. But I'm unsure if I'm trading one set of problems for another.

I can't find any examples of existing tools/daemons in this area, which either means I'm being novel, innovative, and interesting. Or I'm overthinking...

Syndicated 2013-09-06 08:29:21 from Steve Kemp's Blog

So that forum?

So that forum I mentioned? I've set up a test-installation at:

What does this forum offer? A cross between hacker news and reddit. If the admin of the forums enables it you can create arbitrary tags, and then view them. For example:

  • http://example.com/view/tag-name

It's also very fast, and reasonably easy to customize. Which is good, because the current layout is nasty.

Things I like:

  • Everything is stored in Redis.
  • The code is made of simple primitives which are joined together in a web-application. Which means most of the logic is outside the core.
  • The templates are pretty basic, which means a real designer can do good things.

Not much more to say really; except I've set up a test install and if you wish to log in/register and post spam feel free.

Syndicated 2013-09-02 15:11:54 from Steve Kemp's Blog

A shaved head is a sign of a tidy mind?

This week I have mostly:

Knocked up a forum ("gathering")

It is a simple clone of hacker news, storing things in Redis.

There is none of the aging, though: entries are listed first->last, and my tagging support is basic.

I'm in two minds about releasing it. I'm in three minds about deploying it at the location I'd written it for - forums.lumail.org - since there's nothing worse than a forum with no posts, unless it is a forum that used to be popular and is now a wasteland, or circlejerk. (c.f. slashdot. ahem.)

Updated Slaughter

I found a hidden dependency when installing a new Slaughter-controlled host.

A new release is imminent.

Set up a remote host for backups

Using BigV I configured a system with LUKS-encrypted disks to act as a "remote dropbox" via git-annex, and rsync.

An encrypted volume is manually mounted post-boot. It stays mounted (oops that's bad) but provides security when the guest is offline, or has been retired.

Wrestled with Graphite

Because damn that software is hard to install.

Turned down two weddings

Because I will never shoot a wedding again. And if I did I'd not do it for free for "exposure".

Merged my (digital) music archive with that belonging to my partner

Which took more discussion than a) moving in together, or b) opening a shared bank account.

Secretly decided I do like my kindle

Even though I bought two books from a charity shop tonight I'm almost certainly going to file them away and look for the epub online instead.

He fought with the goblins! He battled the trolls! He riddled with Gollum! The magic ring he stole!

This weekend I'll be mostly offline. Saturating my home broadband while I sync backups.

Syndicated 2013-08-30 19:46:32 from Steve Kemp's Blog

Lack of referrers on github is an annoyance

Github is a nice site, and I routinely monitor a couple of projects there.

I've also been using it to host a couple of my own projects, initially as an experiment, but since then because it has been useful to get followers and visibility.

I'm a little disappointed that you don't get to see more data though; today my sysadmin utilities repository received several new "stars". Given that these all occurred "an hour ago" it seems likely that the utilities were referenced in a comment somewhere on LWN, hacker news, or similar.

Unfortunately I've no clue where that happened, or if it was a coincidence.

I expect this is more of a concern for those users who use github-pages, where having access to the access.logs would be more useful still. But ..

Syndicated 2013-08-26 11:40:41 from Steve Kemp's Blog

Soon it will be time for something different

This weekend I'm mostly alternating between reading, writing, and trying to avoid death by the plague.[*]

I've switched Lumail to using a UTF-8 aware string library, which means we can now handle the obvious case:

-- Prove keybindings work.
keymap['global']['π'] = 'msg("UTF-8 rocks!")'

Similarly we can stuff input into the buffer:

-- Pretend the user typed ":msg ...\n"
stuff( ":msg('π is pie')\n" );

This transition was annoying to handle, but wasn't too difficult. There is only one more major update required, according to the development roadmap, which is to double check that UTF-8 output is correct.

Otherwise I think I'm almost done. In the sense that I don't see anything obvious missing, barring things that won't ever happen such as mutt-style "tag" support.

I've updated the online examples to include some nice code:

I can't claim to have many users; so far the development has been carried out by myself and approximately four other people. But that matters not. I genuinely believe this is a good client and it really suits the way that I handle (large volumes of) email:

  • Show folders with unread mail.
  • Quickly read it.

Allowing you to open multiple folders at once means you get a great view into your currently-unread mail, regardless of where procmail has placed it.

The overriding feeling having "completed" the client is that Lua rocks. I'm torn between wanting to sleep some more, and wondering what other system/package/tool can be extended by Lua. As epiphanies go my on_idle() update takes some beating.

* - I do not have the black death, but I'm not well.

Syndicated 2013-08-17 13:40:52 from Steve Kemp's Blog

Lumail binaries are wheezy only for the moment

This morning I made a new release of Lumail, which recently completed the transition from using mimetic to GMime for all its MIME needs.

I was happy with mimetic, except it didn't have the facility to decode encoded header-values. I wrote some code, but it was broken, so I made the decision that we should move to something that made this easier, and GMime was chosen.

Jeffrey Stedfast was very helpful in answering my questions with near-perfect code samples.

Beyond that this release features some more Lua primitives and a couple of bug-fixes.

The only annoyance is that the version of GMime I'm using, 2.6.x, isn't available to users of the Squeeze release of Debian GNU/Linux. It is available as a backport, but that means building binaries with sbuild is a pain - due to #700522.

So for the moment I've only built binaries for Wheezy users.

ObQuote: "You know who I am, I've not been off TV for that long!" - Alan Partridge, Alpha Papa

Syndicated 2013-08-14 10:10:47 from Steve Kemp's Blog

Lumail now uses GMime

A hectic day and now we use GMime for all MIME-fu.

This has allowed me to decode headers correctly, setup MIME parts properly in outgoing mails with attachments, and cleanup the code-base.

The next release of Lumail will contain basically just this change, as it is pretty drastic. But first I need to work out how to make binaries for Squeeze compiled against the back-ported version of gmime-2.6.x.

(Previously we used libmimetic. Which was awesome in its way, but caused me some pain with RFC 2047 header-decoding.)

Syndicated 2013-08-11 20:48:36 from Steve Kemp's Blog

TAB completion is hard

I've spent the past few days overhauling the TAB-completion which is included in lumail.

Completing a single token is easy, if there is only one match, and you limit yourself to completing at the start of a line. But doing real completion is hard. Consider the case where you want to complete something like this:


Clearly completing the first part "unread_[TAB]" is simple. But to complete "re" to "red" you need to split up your input line into tokens so that you can recognize a quote as a valid completion point.

Similarly you need to split on "(" to allow:

-- show the path to the editor

To allow this to be changed/controlled by the user I defined completion_chars() which contains: SPACE, QUOTE, "(", etc.
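The splitting idea, sketched in JavaScript rather than in Lumail's own code: walk backwards from the end of the line to the nearest completion character, and treat whatever follows it as the token to complete. The function name, the character list, and the sample input lines are made up for illustration; only completion_chars() itself is real.

```javascript
// Illustrative only: characters after which a completion may start.
var COMPLETION_CHARS = " \"'(";

// Split a line into the untouched prefix and the token being completed.
function completionToken(line, chars) {
    var start = line.length;
    // Step back until the character before `start` is a completion char.
    while (start > 0 && chars.indexOf(line.charAt(start - 1)) === -1) {
        start--;
    }
    return { prefix: line.slice(0, start), token: line.slice(start) };
}

console.log(completionToken('unread_', COMPLETION_CHARS));
console.log(completionToken('msg("re', COMPLETION_CHARS));
```

Once the token is isolated, the completed text is simply the prefix plus whichever candidate was chosen, so the rest of the line is never disturbed.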

I'm pleased with the user-callback for offering completion suggestions; my own is pretty basic and just includes all user-defined functions as well as an address book. The latter allows steve.org.uk[TAB] to complete to: "Steve Kemp" <steve@steve.org.uk> - because we allow matches anywhere in the completion string, rather than just prefix-matching.
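The difference between prefix-matching and matching anywhere is small in code. A sketch in JavaScript, with a made-up candidate list (the second address and the function name are placeholders) and a hypothetical matches() helper that is not part of Lumail:

```javascript
// Return every candidate that contains the typed text anywhere,
// rather than only those that start with it.
function matches(token, candidates) {
    return candidates.filter(function (candidate) {
        return candidate.indexOf(token) !== -1;
    });
}

// Invented candidates: two address-book entries and one function name.
var candidates = [
    '"Steve Kemp" <steve@steve.org.uk>',
    '"Example User" <user@example.org>',
    'unread_messages'
];

console.log(matches("steve.org.uk", candidates));
```

A prefix-only matcher would return nothing for "steve.org.uk", since the address-book entry starts with a quote and a name.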

I struggled with resolving ambiguities, but now that is handled correctly too. Press "TAB" when there are multiple choices available and you can graphically TAB-through the available choices, or press Esc to cancel.

In conclusion I've spent a few days fighting with user-interface stuff and now the mail-client is better, but I've still to tackle RFC 2047 header decoding because that is really hard!

Syndicated 2013-08-04 09:00:45 from Steve Kemp's Blog

International character sets and encodings are hard.

Today I've made the 0.15 release of lumail, which has several fixups and cleanups.

The previous release included a rewrite of the scrolling code, courtesy of kain88-de. This release fixes a few corner cases in that update which caused empty messages/Maildirs to be highlighted - operating on such ghost-entries would cause a segfault. Oops.

I've received several more great contributions from 7histle and trou, and I'm very happy with the state of the code and the usefulness of the application.

The biggest outstanding issue is RFC 2047 header decoding. Converting subject/to/from fields to readable versions of their encoded form:

Subject: =?utf-8?Q?Blipfoto=20=2D=20Introducing=20the=20all=2Dnew=20Bli
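For reference, the decoding itself is not much code when the language already handles character sets. The sketch below is in JavaScript, not the C++ that Lumail needs, and assumes a node recent enough to provide a global TextDecoder. Because the subject above is cut off, the sample input is a shortened, properly terminated variation of it.

```javascript
// Decode RFC 2047 "encoded-words": =?charset?Q?...?= or =?charset?B?...?=
function decodeHeader(value) {
    // Whitespace between two adjacent encoded-words is not significant.
    var joined = value.replace(/(\?=)\s+(=\?)/g, "$1$2");

    return joined.replace(/=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=/g,
        function (whole, charset, encoding, text) {
            var bytes;
            if (encoding.toUpperCase() === "B") {
                // B encoding is plain base64.
                bytes = Buffer.from(text, "base64");
            } else {
                // Q encoding: "_" is a space, "=XX" is a hex-encoded byte.
                var raw = text.replace(/_/g, " ").replace(/=([0-9A-Fa-f]{2})/g,
                    function (m, hex) {
                        return String.fromCharCode(parseInt(hex, 16));
                    });
                bytes = Buffer.from(raw, "latin1");
            }
            try {
                return new TextDecoder(charset).decode(bytes);
            } catch (e) {
                // Unknown charset: leave the encoded-word as it was.
                return whole;
            }
        });
}

console.log(decodeHeader("Subject: =?utf-8?Q?Blipfoto=20=2D=20Introducing?="));
```

The hard part in practice is the charset conversion and the malformed headers seen in real mail, neither of which this sketch attempts to cover.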

This is annoying because I'm using mimetic for handling all MIME-related code, and this doesn't seem to offer the facilities that I need.

The current plan is to use the RFC-2047 handling from vmime, but I've fought with that library unsuccessfully for two days now - and a further complication is that the library is included in Squeeze/Sid, but not the stable release of Debian.

In conclusion I still regard the client as complete, because I'm using it exclusively and I rarely get "foreign" mails. But there is one more push required to fix all the outstanding bugs which generally boil down to:

  • Decode headers properly.
  • Ensure all our input/output is in UTF-8.

Randomly I'm wondering if I can call out to Lua to do the header decoding. Add "on_header_field()" and display the results. So today I'll be looking at how sensible that is, probably not very.

Syndicated 2013-07-26 09:25:06 from Steve Kemp's Blog
