Older blog entries for Stevey (starting at number 675)

Discovering back-end servers automatically?

Recently I've been pondering how to do service discovery.

Pretend you have a load-balancer which accepts traffic and routes incoming requests to different back-ends. The load-balancer might be pound, varnish, haproxy, nginx, or similar. The back-ends might be node applications, apache, or similar.

The typical configuration of the load-balancer will read:

# forward

# backends
backend web1  { .host = "10.0.0.4"; }
backend web2  { .host = "10.0.0.6"; }
backend web3  { .host = "10.0.0.5"; }

#  afterword

I've seen this same setup in many situations, and while you can easily imagine there being "random HTTP servers" on your (V)LAN which shouldn't receive connections, it seems like a pain to keep updating the list of back-ends by hand.

Using UDP/multicast broadcasts it is trivial to announce "Hey, I'm an HTTP server with the name 'foo'", and it seems to me that this should allow seamless HTTP load-balancing.

To be more explicit - this is normal:

  • The load-balancer listens for HTTP requests, and forwards them to back-ends.
  • When back-ends go away they stop receiving traffic.

What I'd like to propose is another step:

  • When a new back-end advertises itself with the tag "foo" it should be automatically added and start to receive traffic.

i.e. This allows back-ends to be removed from service when they go offline, but also to be added when they come online, without the load-balancer needing its configuration to be updated.

This means you'd not give a static list of back-ends to your load-balancer; instead you'd say "Route traffic to any service that advertises itself with the tag 'foo'.".

VLANs, firewalls, multicast, and UDP all come into play, but in theory this strikes me as being useful, obvious, and simple.

(Failure cases? Well, if the "announcer" dies then the back-end won't get traffic routed to it, just as if the back-end were offline. And clearly if a back-end is announced but isn't answering HTTP requests it would be dropped as normal.)
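
Just to make the announcing side concrete, here's a minimal sketch in Perl - the broadcast port (4000) and the trivial "tag:port" payload are both invented for this example, since no current load-balancer understands any of this:

#!/usr/bin/perl
#
#  Announce "I'm an HTTP back-end with the tag 'foo'" once a second
#  via UDP broadcast.  The port (4000) and the "tag:port" payload
#  are invented for this sketch.
#
use strict;
use warnings;

use IO::Socket::INET;

my $sock = IO::Socket::INET->new(
    Proto     => 'udp',
    PeerAddr  => '255.255.255.255',
    PeerPort  => 4000,
    Broadcast => 1,
) or die "Failed to create socket: $!";

while (1)
{
    #  The tag, plus the port this back-end serves HTTP upon.
    $sock->send("foo:8080") or warn "send failed: $!";
    sleep(1);
}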

If I get the time this evening I'll sit down and look at some load-balancer source code to see if any are written in such a way that I could add this "broadcast discovery" as a plugin/minor change.

Syndicated 2014-03-11 13:28:36 from Steve Kemp's Blog

Time to get back to my roots: Perl

Today I wrote a Perl Test::RemoteServer module:

#!/usr/bin/perl -w -I.

use strict;
use warnings;

use Test::More tests => 4;
use Test::RemoteServer;

#
#  Ping Tests
#
ping_ok( "192.168.0.1",       "Website host is up: IPv4" );
ping6_ok( "www.steve.org.uk", "Website host is up: IPv6" );

#
#  Socket tests
#
socket_open( "ipv4.steve.org.uk", "2222", "OpenSSH is running" );
socket_closed( "ipv4.steve.org.uk", "22", "OpenSSH is not available on :22" );

I can see a lot of value in defining tests that are carried out against remote hosts - even if they're more basic than the kind of comprehensive testing you'd get via Custodian, Nagios, etc.

Being able to run "make test" and remotely probe services is cool.
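
The code isn't public yet, so purely as a sketch - guessing at the interface, and building upon the stock Net::Ping and Test::Builder modules - a primitive such as ping_ok might look like this:

package Test::RemoteServer;

#
#  A guessed-at sketch of one primitive, not the real module:
#  ping_ok() built upon Net::Ping + Test::Builder, which cooperate
#  with the plan declared via Test::More.
#
use strict;
use warnings;

use Net::Ping;
use Test::Builder;

use parent 'Exporter';
our @EXPORT = qw( ping_ok );

my $test = Test::Builder->new();

sub ping_ok
{
    my ( $host, $description ) = @_;

    #  A TCP probe needs no special privileges, unlike real
    #  ICMP pings, which require root.
    my $pinger = Net::Ping->new("tcp");
    $test->ok( $pinger->ping($host), $description );
    $pinger->close();
}

1;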

Unfortunately I suspect the new-hotness is to couple the testing with your Chef, Puppet, CFengine, Slaughter, Ansible, etc, policies. That way you have two things:

  • A consistent way to define system-state.
  • A consistent way to test that the damn thing worked.

It's coming to CPAN in the near future anyway, but I can throw it up on GitHub in advance if there is any interest..

Syndicated 2014-03-07 21:46:03 from Steve Kemp's Blog

So I bought some new hardware, for audio purposes.

This week I received a Logitech Squeezebox Radio, which is basically an expensive toy: a portable device that joins your network wirelessly and lets you listen either to "internet radio" or to music streamed from your own PC.

The main goal of this purchase was to allow us to listen to media stored on a local computer in the bedroom, or living-room.

The hardware scans your network looking for a media server, so the first step is to install that on the machine which holds your music.

The media-server has a couple of open ports; one for streaming the media, and one for a user-browsable HTML interface. Interestingly the radio-device shows up in the web-interface, so you can mess around with the currently loaded playlist from your office, while your wife is casually listening to music in the bedroom. (I'm not sure if that's a feature or not yet ;)

Although I didn't find any alternative server implementations, I did find a software client which you can use to play music from the central server - slimp3slave - and again you can push playlists, media, etc., to it.

My impressions are pretty positive; the device was too expensive - certainly I wouldn't buy two - but it is functional. The user-interface is decent, and the software being available and open is a big win.

Downsides? No remote-control for the player, because paying an additional £70 is never going to happen, but otherwise I can't think of anything.

(It's a shame the Squeezebox product line seems to have been discontinued.)

Procmail Alternatives?

Although I did start hacking a C & Lua alternative, it looks like there are enough implementations out there that I don't feel so strongly any more.

I'm working in a different way to most people: rather than sorting mail at delivery time, I'm going to write a trivial daemon that just watches ~/Maildir/.Incoming and moves mail out of there. That means no filtering error can cause mail to be lost at SMTP/delivery time.

I'm going to base my work on Email::Filter since it offers 90% of the primitives I want. The only missing thing is the ability to filter mails via external commands, which has now been reported as a bug/omission.
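
As a sketch of the daemon - with the folder names, the matching, and the poll interval all invented for the example - the whole thing needn't be much more than this:

#!/usr/bin/perl
#
#  A sketch of the watcher: poll ~/Maildir/.Incoming/new and file
#  each message via Email::Filter.  The folders, the match, and the
#  poll interval are all invented for this example.
#
use strict;
use warnings;

use Email::Filter;

my $incoming = $ENV{'HOME'} . "/Maildir/.Incoming/new";

while (1)
{
    foreach my $file ( glob( $incoming . "/*" ) )
    {
        open( my $handle, "<", $file ) or next;
        my $data = do { local $/; <$handle> };
        close($handle);

        my $mail = Email::Filter->new( data => $data );

        #  Don't terminate after delivery; we're a daemon.
        $mail->exit(0);

        #  File list-mail away; everything else hits the main Maildir.
        #  (A trailing "/" means "deliver in maildir format".)
        if ( ( $mail->subject() || "" ) =~ /\[mailing-list\]/i )
        {
            $mail->accept( $ENV{'HOME'} . "/Maildir/.lists/" );
        }
        else
        {
            $mail->accept( $ENV{'HOME'} . "/Maildir/" );
        }

        unlink($file);
    }
    sleep(5);
}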

Syndicated 2014-03-06 15:48:14 from Steve Kemp's Blog

Replacing ugly things would save the world many hours

There are some tools that we use daily, whether we realize it or not, that are unduly ugly. Over time you learn to use them, forget just how hard they were to learn, and take them for granted.

Today I had to guide somebody through using procmail, and I'd forgotten how annoying it is.

In brief I use procmail in three ways, each of which I had to document (rough examples follow the list):

  • Run a command, given a new email, and replace the original email with the output of that command.
  • Run a command, silently. Just for fun.
  • Match a regular expression on a header-field, and file accordingly.
    • Later extended to matching regexps on multiple headers. ("AND" + "OR" )
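
For the record those three styles look roughly like this in .procmailrc terms - the commands and destinations here are placeholders:

#  1.  Replace the message with the output of a command.
#      ("f" = filter, "w" = wait for the command to finish.)
:0 fw
| /usr/local/bin/rewrite-message

#  2.  Run a command against a copy of the message, silently,
#      purely for the side-effect.  ("c" = carbon-copy.)
:0 c
| logger -t procmail "mail arrived"

#  3.  Match a regular expression against a header-field, and
#      file accordingly.  (A trailing "/" means a Maildir folder.)
:0
* ^Subject:.*\[debian\]
.debian/

#  3b. Conditions on multiple headers AND together; use
#      alternation for OR.
:0
* ^(To|Cc):.*lists\.example\.com
* ^Subject:.*security
.lists.security/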

There are some projects that are too entrenched to ever be replaced ("make", I'm looking at you), but procmail? I reckon there's a chance a replacement would be useful, quickly.

Then again, maybe I'm biased.

Syndicated 2014-03-05 18:56:11 from Steve Kemp's Blog


Anti-social coding

I get all excited when I load up GitHub's front-page and see something like:

"robyn has forked skx/xxx to robyn/xxx"

I wonder what they will do - what changes do they have in mind?

Days pass, and no commits happen.

Anti-social coding: forking the code, I guess in case I delete my repository, but with no intention of making any changes.

Syndicated 2014-03-03 10:11:50 (Updated 2014-03-03 11:17:35) from Steve Kemp's Blog

Some direction, some distraction

It seems that several people replied to the effect that they would pay people to take care of applying security updates, or even configuring ad hoc things such as wikis, graphite, and MySQL.

Not enough people to rely upon, but perhaps there is scope for remote work being done in exchange for folding-money. (Of course some of those who replied are in foreign countries, which makes receiving payment an annoyance - but that's a separate problem.)

Food for thought.

In the meantime I've settled into my use of lighttpd, which I recently migrated to.

One interesting thing is that you can set your own "Server Name" directive:

# Set server name/version
server.tag = "lighttpd/(steve)"

This value is used by mod_dirlisting, so for example if you examine a directory which doesn't contain an index.html file you see the server-name. Cute.

Well cute unless, or until, somebody sets:

# Set server name/version
server.tag = "<script>alert(3)</script>"

That does indeed serve JavaScript to all your visitors. It's not a security problem in itself, as you need to be root on the remote server to set it - and if you're root you could just modify the actual HTML pages being served to include your JavaScript. That said, it's a little icky.

The following patch avoids the issue:

--- mod_dirlisting.c.org	2014-02-26 00:14:43.296373275 +0000
+++ mod_dirlisting.c	2014-02-26 00:16:28.332371547 +0000
@@ -618,7 +618,7 @@
 		} else if (buffer_is_empty(con->conf.server_tag)) {
 			buffer_append_string_len(out, CONST_STR_LEN(PACKAGE_DESC));
 		} else {
-			buffer_append_string_buffer(out, con->conf.server_tag);
+                        buffer_append_string_encoded(out, CONST_BUF_LEN(con->conf.server_tag), ENCODING_HTML);
 		}

 		buffer_append_string_len(out, CONST_STR_LEN(

Syndicated 2014-02-27 13:21:01 from Steve Kemp's Blog

What do you pay for, and what would you pay for?

There are times when I consider launching my own company again, most often when it is late at night and the ineptitude of so many other companies gets me too worked up. Then I sit back, think about the details, and write it off.

I've worked for myself in the past a couple of times, and each time it was both more fun and more difficult than expected. Getting a couple of clients is usually easy, and getting ten more is common, but getting "many" is hard, and getting "lots" is something I've never done - lots of users for free sites, though, along with the associated support burden!

So the thought dies away once I sit down and work out the net profit I'd need to live. My expenses are low, so let us pretend I can easily live on £1000 a month. The "company" has to make more than that, to cover costs, but perhaps not much more.

Pretend you were offering DNS hosting: you'd probably be able to implement that easily on, say, 10 virtual machines, costing a net £150 a month. If clients pay £5 a month for an unlimited number of domains, that means you need (1000 + 150) / 5 = 230 clients. Not impossible, but also not easy.

Pretend instead you're offering backup space, and the numbers get bigger, because disk is expensive. Again getting some users would be easy, but getting lots would be hard, because your competition is Dropbox, SkyDrive, etc., etc.

Once you start thinking of "ideas" they come easily, but the hard part is being realistic about what people would pay for. As always the idea is the easy part; the execution is the hardest part. Realistically, if I were desperate to work for myself at short notice I'd do the obvious thing - I'd buy a pair of ladders and a bucket, and clean windows. Low overheads, reasonable demand, and I'd be both "fit" and "outdoors".

When it comes to paying for online services, off the top of my head I personally pay for maybe two things, both of them niche (although profitable for their providers, I'm sure), and I know many people who live on the internet but pay for nothing.

For example I'm a VIP member of an online modeling community, which in theory allows me a higher chance of persuading interesting people to pose for me.

In practice the turnover on those sites is immense. Lots of cute boys and girls constantly hear "You're so pretty, you should be a model", which is true in perhaps 1% of cases, and the net result is that you have a few hard-working people who do good things day in, day out, and many flighty teenagers who'll pose for two or three people and then never do it again, because they realise it is neither glamorous nor easy money.

Two things I've semi-seriously considered recently were hosted "status pages" and hosted "domain parking", but both have many competitors, and with both I can see that a) some people would pay for them, but b) not very many.

I suspect there is no universal "I'd pay for this" online service which is both competition-free and genuinely trivial to set up, but I'd be curious to see what people are missing, and even more curious to see what people do pay for.

Syndicated 2014-02-25 14:30:33 from Steve Kemp's Blog

Two minor toys ..

Two minor things:

graphite_send

A simple shell-script to submit metrics to a graphite server; it is extensible via local plugins, but covers the obvious metrics by default.

Metrics are submitted via simple calls to netcat.

Trivial, but much more lightweight than collectd and similar.
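
Graphite's plain-text protocol is nothing more than "metric value timestamp" written to TCP port 2003, so each submission boils down to the equivalent of this - the server name and the metric here are placeholders:

#!/usr/bin/perl
#
#  The equivalent of one of those netcat calls: write a single
#  "metric value timestamp" line to Graphite's plain-text port.
#  The server name and the metric are placeholders.
#
use strict;
use warnings;

use IO::Socket::INET;
use Sys::Hostname;

my $sock = IO::Socket::INET->new(
    PeerAddr => "graphite.example.com",
    PeerPort => 2003,
    Proto    => "tcp",
) or die "Failed to connect: $!";

#  Dots are hierarchy-separators in metric names, so flatten them.
my $host = hostname();
$host =~ s/\./_/g;

my $users = `who | wc -l`;
chomp($users);

printf $sock "%s.logged_in_users %d %d\n", $host, $users, time();
close($sock);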

HTML::Emoji

A Perl module for converting HTML like "<p>:smile:</p>" into something graphical.

This was written for my markdown sharing site, but is pretty fun.

The konami-code page demonstrates usage.

(This parses the HTML so it won't transform attributes, ids, or anything that isn't in the "text" part of any HTML input.)

The graphite sending script is perhaps the most useful, but at the same time it feels too small to be a package of its own. I'm tempted to bundle it up into my sysadmin-util collection, but I can't quite decide if it belongs there either.

Syndicated 2014-02-23 23:36:51 from Steve Kemp's Blog

Changing my stack ..

For the past few years I've hosted all my websites in a "special" way:

  • Each website runs under its own UID.
  • Each website runs a local thttpd / webserver.
  • Each server binds to localhost, on a high-port.
    • My recipe is that the port of the webserver for user "foo" is "$(id -u foo)".
  • On the front-end I have a proxy to route connections to the appropriate back-end, based on the Host header.

The webserver I chose initially was thttpd, which gained points because it was small, auditable, and simple to launch. Something like this was my recipe:

#!/bin/sh
exec thttpd -D -C /srv/steve.org.uk/thttpd.conf

Unfortunately thttpd suffers from a few omissions - most notably it supports neither "Keep-Alive" nor "Compression" (i.e. gzip/deflate) - so it would always be slower than I wanted.

On the plus side it was simple to use, supported CGI scripts, and served me well once I'd patched it to support X-Forwarded-For for IPv6 connections.

Recently I set up a server optimization site and was a little disappointed that the site itself scored poorly on Google's page-speed test. So I removed thttpd for that site, replacing it with nginx. The end result was that the site scored 98/100 on Google's page-speed test. Progress. Unfortunately I couldn't do that globally, because nginx doesn't support old-school plain CGI scripts.

So last night I removed both nginx and thttpd, and now every site on my box is hosted using lighttpd.

There weren't too many differences in the setup, though I had to add some rules to enable caching for *.css files, etc., and some of my code needed updating.
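
The caching rules amount to little more than loading mod_expire and matching on suffixes - something like the following, with the lifetime being a matter of taste:

# Cache static assets; the lifetime here is just an example.
server.modules += ( "mod_expire" )

$HTTP["url"] =~ "\.(css|js|png|gif|jpg)$" {
    expire.url = ( "" => "access plus 7 days" )
}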

Beyond that, today I've set up a dedicated docker host - which allows me to easily spin up containers. Currently I've got graphite monitoring for my random hosts, and a wordpress guest for plugin development/testing.

Now to go back to reading Off to be the wizard .. - not as good as Rick Cook's wizardry series (which got less good as time went on, but started off strongly), but still entertaining.

Syndicated 2014-02-22 11:32:59 from Steve Kemp's Blog

My pastebin will now run under docker.

I've updated my markdown-pastebin site, to be a little cleaner, and to avoid spidering issues.

Previously every piece of uploaded text received an incrementing integer to describe it - which meant it was trivially easy for others to see how many pieces of text had been uploaded, and to spider all past uploads (unless the user deleted them).

Now each fresh paste receives a random UUID to describe it, and this means spidering is no longer feasible.
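
Minting such an identifier is a one-liner in Perl; UUID::Tiny here is an assumption for illustration, not necessarily what the site actually uses:

#!/usr/bin/perl
#
#  Mint a random version-4 UUID.  UUID::Tiny is an assumption;
#  any of the CPAN UUID modules would do.
#
use strict;
use warnings;

use UUID::Tiny ':std';

print create_uuid_as_string(UUID_V4), "\n";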

I've also posted the source code to GitHub so folk can report bugs, fork, etc.

That source code now includes a Dockerfile which allows you to quickly and easily build your own container running this wonderful service, and launch it without worrying about trashing your server ;)

Anyway other than the user-interface overhaul it is still as functional, or not, as it used to be!

Syndicated 2014-02-17 18:59:33 from Steve Kemp's Blog
