Older blog entries for Stevey (starting at number 563)

Bye-bye AOL

Today I took that final step:

touch /srv/_global_/blacklisted/domains/aol.com

I remember, even a couple of years ago, I had friends who would mail me from their @aol.com email addresses. These days people have moved on.

The two single biggest mail-providers I see in terms of spam are:

  • @yahoo.com - 832 in the past nine days.
  • @aol.com - 242 in the past nine days.

I'd like to drop @yahoo.com but I have some (misguided) friends who continue to use it to mail me. I might start dropping non-friend mails from that domain, but that's a bigger job.

Yes, this is a dull entry. Sorry. My existing bathroom has been ripped out and turned into this as a stepping stone into its new incarnation. I'm trapped in my office. Dust almost everywhere. Noise everywhere else.

ObQuote: "There can be no understanding between the hand and the brain unless the heart acts as mediator. " - Metropolis

Syndicated 2012-04-09 13:38:01 from Steve Kemp's Blog

So I want a backup solution

I look after a lot of systems, and most of them want identical and simple backups taking of their filesystems. Currently I use backup2l which works but suffers from a couple of minor issues.

In short I want to take a full filesystem backup (i.e. Backup "/"). I wish to only exclude a few directories and mounted filesystems.

So my configuration looks like this:

# List of directories to make backups of.
# All paths MUST be absolute and start with a '/'!
SRCLIST=(  / /boot -xdev )

# The following expression specifies the files not to be archived.
SKIPCOND=( -path '/var/backups/localhost' -o -path '/var/run/' -o \
    -path '/tmp'  -o -path '/var/tmp' \
    -o -path '/dev' -o -path '/spam' \
    -o -path '/var/spool/' )

The only surprising thing here is that I abuse the internals of backup2l because I know that it uses "find" to build up a list of files - so I sneakily add in "-xdev" to the first argument. This means I don't accidentally back up any mounted gluster filesystem, mounted MySQL binary/log mounts, etc.

backup2l then goes and does its jobs. It allows me to define things to run before and after the backup runs via code like this:

# This user-defined bash function is executed before a backup is made
PRE_BACKUP () {
   if [ -d /etc/backup2l/pre.d/ ]; then
      run-parts /etc/backup2l/pre.d/
   fi
}

So what is my gripe? Well I get a daily email, per-system, which shows lots of detail - but the key thing. The important thing. The thing I care about more than anything else, the actual "success" or "fail" result is only discoverable by reading the mail.

If the backup fails, due to out of disk, I won't know unless I read the middle of the mail.

If the pre/post-steps fail I won't know unless I examine the output.

As I said to a colleague today in my view the success or failure of the backup is the combination of each of three distinct steps:

  • pre-backup jobs.
  • backup itself
  • post-backup jobs.

If any of the three fail I want to know. If they succeed then ideally I don't want a mail at all - but if I get one it should have:

Subject: Backup Success - $(hostname) - $(date)
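
The "any of the three fail" rule could be sketched as a small wrapper like the one below. This is only an illustration: run_stage and run_backup are hypothetical names, not part of backup2l, and the three stage commands are placeholders for whatever a given host needs.

```shell
#!/bin/bash
# Hypothetical wrapper, not part of backup2l: run the stages in order,
# stop at the first failure, and report one overall result.

# run_stage NAME COMMAND... : run one stage and say which one failed.
run_stage() {
    local name=$1
    shift
    if ! "$@"; then
        echo "FAILED: $name" >&2
        return 1
    fi
}

# run_backup PRE BACKUP POST : each argument is a command or function name.
# Prints the Subject line and returns non-zero if any stage failed.
run_backup() {
    local status="Success"
    run_stage "pre-backup jobs"  "$1" &&
    run_stage "backup"           "$2" &&
    run_stage "post-backup jobs" "$3" || status="Failure"

    echo "Subject: Backup $status - $(hostname) - $(date)"
    [ "$status" = "Success" ]
}
```

Because run_backup returns non-zero on failure, a cron job could send mail only in that case and stay silent otherwise.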

So I've looked around at programs such as backup-ninja, backup-manager and they seem similar. It is a shame as I mostly like backup2l, but in short I want to do the same thing on about 50-250 hosts:

  • Dump mysql, optionally.
  • Dump postgresql, optionally.
  • Dump the filesystem. Incrementals are great, but full copies are probably tolerable.
  • Rsync those local filesystem backups to a remote location.

In my case it is usually the rsync-step that fails. Which is horrific if you don't notice (quota exceeded. connection reset by peer. etc). The local backups are good enough for 95% of recovery times - but if the hardware is fried having the backups be available, albeit slowly, is required.

Using GNU Tar incrementally is trivial. If it weren't such a messy program I'd probably be inclined to hack on backup2l - but in 2012 I can't believe I need to.
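
For reference, a minimal sketch of the incremental approach, assuming GNU tar and its --listed-incremental option. The function name, the snapshot filename, and the archive naming scheme are made up for illustration.

```shell
#!/bin/bash
# incremental_backup SRC_DIR DEST_DIR
# Writes a full (level-0) archive on the first run and incrementals on
# later runs. The snapshot file records what tar has already seen.
# Note: archive names have one-second resolution, so two runs within
# the same second would overwrite each other.
incremental_backup() {
    local src=$1 dest=$2
    local snapshot="$dest/backup.snar"
    local archive="$dest/backup-$(date +%Y%m%d-%H%M%S).tar.gz"

    mkdir -p "$dest" || return 1
    tar --create --gzip \
        --listed-incremental="$snapshot" \
        --file="$archive" \
        --directory="$src" . || return 1
    echo "$archive"
}
```

Deleting the snapshot file forces the next run to be a full backup again.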

(Yes, backuppc rocks. So does duplicity. So does amanda. But they're not appropriate here. Sadly.)

ObQuote: "Oh, I get it. I see now. You've been training for two years to take me out, and now here I am. Whew! " - Blade II - An example of a rare breed, a sequel that doesn't suck. No pun intended.

Syndicated 2012-04-05 21:12:14 from Steve Kemp's Blog

So I have a new camera. Again.

Until recently I've had a Canon EOS 1000D, my starter-camera, and a Canon EOS 40D which is my real-camera.

The 40D is older, but it probably counts as a "semi-pro" body, albeit an old mid-range one. From an image size point of view there isn't too much to tell them apart - both produce 10MP images. But from a hardware and ease of use sense the 40D has several key features which made it a compelling upgrade:

  • Dual controls. So we can use one wheel for shutter speed, and one for aperture size.
  • Better feeling body, which is slightly larger and more solid.
  • Top display for instantly obvious settings.

Anyway both these cameras have been my friends for the past year or two, although I did buy a toy camera for those times when I didn't want to carry the DSLR around.

I've made sure I only bought "posh" lenses, including the fabulous and horrifically expensive 70-200 f/2.8 MK 2 lens (just short of £2000) and I'd been wanting to use those on a full-frame camera.

Now it is upgrade time once more and I've just bought the EOS 5D MK II - a full-frame camera which means I don't have to worry about crop-factors any more.

So far I've only had it a couple of days but I'm in love. The output images are 21MP so I get far fewer to a (CF) card. But the detail is sublime.

Future portraits and photos of people will be wonderful - although I hope they already are to a large degree!

This upgrade was a hard choice. The 5D is a full-frame, but a little slow. (Faster than my 40D by a hair's breadth.) The alternative would have been a 7D which is fast, and wonderful, but still uses a cropped sensor. Given that I have fast lenses and don't do sports (often) the 5D seemed like the sanest approach.

For my reference - my Canon serial numbers:

EOS 1000D 1780312242
EOS 40D 1230734041
EOS 5D MK II 4131916951

ObQuote: "Courage is only required when facing that which you fear. " -Stargate: The Ark Of Truth

Syndicated 2012-03-25 21:16:58 from Steve Kemp's Blog

My code makes it into GNU Screen, and now you can use it. Possibly.

Via Axel Beckert I learned today that GNU Screen is 25 years old, and although development is slow it has not ceased.

Back in 2008 I started to post about some annoyances with GNU Screen. At the time I posted a simple patch to implement the unbindall primitive. I posted some other patches and fixed a couple of bugs, but although there was some positive feedback initially, over time that ceased completely. Regrettably I didn't have the feeling there was the need to maintain a fork properly, so I quietly sighed, cried, and ceased.

In 2009 my code was moved upstream into the GNU Screen repository (+documentation update).

We're now in 2012. It looks like there might be a stable release of GNU Screen in the near future, which makes my code live "for real", but in the meantime the recent snapshot upload to Debian Experimental makes it available to the brave.

2008 - 2012. Four years to make my change visible to end-users. If I didn't use screen every day, and still have my own local version, I'd have forgotten about that entirely.

Still I guess this makes today a happy day!


ObQuote: "Thanks. For a while there I thought you were keeping it a secret. " - Escape To Victory

Syndicated 2012-03-21 12:24:53 from Steve Kemp's Blog

Happy birthday to me

Recently I accidentally flooded Planet Debian with my blog feed. This was an accident caused by some of my older blog entries not having valid "Date:" headers. (I use chronicle which parses simple text files to build a blog, and if there is no Date: header present in entries it uses the CTIME of the file(s).)

So why did my CTIMEs get lost? Short version: I had a drive failure and a PSU failure which led to me rebuilding a few things and cloning a fresh copy of my blog to ~/hg/blog/.

My host is now once again OK, but during the pain the on-board sound started to die. Horribly crackly and sounding bad. I figure the PSU might have caused some collateral damage, but so far that's the only sign I see.

I disabled the on-board sound and ordered a cheap USB sound device which now provides me with perfect sound under the Squeeze release of Debian GNU/Linux.

In the past I've ranted about GNU/Linux sound. So I think it is only fair to say this time things worked perfectly - I plugged in the device, it was visible in the output of dmesg, and /proc/asound/cards and suddenly everything just worked. Playing music (mpd + sonata) worked immediately, and when I decided to try playing a movie with xine just for fun sound was mixed appropriately - such that I could hear both "song" + "movie" at the same time. Woo.

(I'm not sure if I should try using pulse-audio, or similar black magic. Right now I've just got ALSA running.)

Anyway as part of the re-deployment of my desktop I generated and pass-phrased a new SSH key, and then deployed that with my slaughter tool. My various websites all run under their own UID on my remote host, and a reverse-proxy redirects connections. So for example I have a Unix "s-stolen" user for the site stolen-souls.com, a "s-tasteful" user for the site tasteful.xxx, etc. (Right now I cannot remember why I gave each "webserver user" an "s-" prefix, but it made sense at the time!)

Anyway once I'd fixed up SSH keys I went on a spree of tidying up and built a bunch of meta-packages to make it a little more straightforward to re-deploy hosts in the future. I'm quite pleased with the way those turned out to be useful.

Finally I decided to do something radical. I installed the bluetile window manager, which allows you to toggle between "tiling" and "normal" modes. This is my first foray into tiling window managers, but it seems to be going well. I've got the hang of resizing via the keyboard and tweaked a couple of virtual desktops so I can work well both at home and on my work machine. (I suspect I will eventually migrate to awesome, or similar, this is very much a deliberate "ease myself into it" step.)

ObQuote: "Being Swedish, the walk from the bathroom to her room didn't need to be a modest one. " - Cashback.

Syndicated 2012-03-10 12:54:21 from Steve Kemp's Blog

5 Mar 2012 (updated 7 Mar 2012 at 14:06 UTC) »

Today I migrated from 32-bit to 64-bit, in-place

This evening I sat down and migrated my personal virtual machine from a 32-bit installation of Debian GNU/Linux to a 64-bit installation.

I've been meaning to make this change for a good few months, but it took until this evening for me to decide it was as good a time as any.

Mostly the process is painless:

  • Ensure you have a 64-bit kernel, with support for 32-bit binaries too.
  • Install the 32-bit compatibility libraries, such that your old binaries work.
  • Overwrite your binaries and libraries in-place so you have a 64-bit base system.
  • Patch it up afterwards.

I overwrote a lot of the libraries and binaries on the system such that I had a working 64-bit apt-get, dpkg, sash, etc, and associated libraries. Then once I had that I could use those tools to pull the rest of the system up to date.

One thing I hadn't counted on is that I needed to have a 64-bit version of bzip such that "apt-get update" didn't complain about errors. I suspect I could have fixed that by re-configuring my system to disable compression. Still it was easily solved.

Along the way I also shot myself in the foot by having a local caching DNS resolver, listening on, which broke. With no DNS I couldn't use apt-get - but once the problem was identified it was trivial to fix.

Anyway all seems OK now. My websites are up, email is flowing and I guess anything else can wait until the morning.

ObQuote: "Somebody's coming up. Somebody serious." - Leon

Syndicated 2012-03-05 01:12:44 (Updated 2012-03-07 14:06:59) from Steve Kemp's Blog

23 Feb 2012 (updated 7 Mar 2012 at 01:08 UTC) »

Symbiosis is wonderful


Symbiosis is the collective name given to a group of Debian GNU/Linux packages which implement simple virtual hosting. It is developed by my employers Bytemark.

Symbiosis is basically a collection of configuration snippets, code, and libraries which works to offer virtual hosting in a reliable consistent and easy to understand fashion.

You implement hosting for a new domain by merely creating a directory tree. So for example you might configure the hosting for the domain example.com by running:

mkdir -p /srv/example.com/public/htdocs
echo "hello, world" >> /srv/example.com/public/htdocs/index.html

mkdir -p /srv/example.com/mailboxes/webmaster
echo "super-secret" > /srv/example.com/mailboxes/webmaster/password

mkdir -p /srv/example.com/config
echo "3l33t" > /srv/example.com/config/ftp-password

There you are, now http://www.example.com/ and http://example.com/ will work, and you may login to check mail with the email address webmaster@example.com via POP3, IMAP, IMAPS, or POP3S. Finally you can FTP with username example.com and be dropped into the public directory.

The mail handling is very flexible, and the webhosting supports wonderful things.

I don't generally talk about work-stuff explicitly, but we've just made a major new release of the Symbiosis system such that it works upon Squeeze and has lots of IPv6 support out of the box. (Email, DNS, HTTP, Firewalling, FTP etc.)

All in all it is simple, well-documented, and open-source with a reasonably large user-base. More external testers, users, and developers would be a wonderful thing..

Mutt Mailboxes & Idle Hooks?

Mutt is wonderful but I'm starting to get annoyed by its lack of auto-mailbox discovery.

Assuming you use procmail you might deliver mail to ~/Maildir/.foo/ and mutt won't notice that if the directory is created once it starts.

(This is because generally mailboxes are defined via "mailboxes =one =two ..", even if you use a shell snippet it won't get updated unless you re-read configuration, or re-exec mutt).
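
The kind of shell snippet meant here might look like the sketch below. It assumes the Maildir++ layout from the example (~/Maildir/.foo/) and that mutt's "folder" points at ~/Maildir, so "=.foo" resolves correctly. The function name list_mailboxes is made up.

```shell
#!/bin/bash
# list_mailboxes MAILDIR
# Prints a mutt "mailboxes" line naming every sub-folder of MAILDIR
# that looks like a Maildir (i.e. contains a "cur" directory).
list_mailboxes() {
    local maildir=$1 dir line="mailboxes"
    for dir in "$maildir"/.[!.]*/ "$maildir"/*/; do
        [ -d "${dir}cur" ] || continue   # skip non-Maildir directories
        dir=${dir%/}                     # strip the trailing slash
        line="$line \"=${dir##*/}\""     # keep only the folder name
    done
    echo "$line"
}
```

The output is only as fresh as the last time mutt read it, which is exactly the limitation described above.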

I wish it were possible to use inotify/dnotify/something magic such that everything beneath ~/Maildir would just work.

(Re-reading mailboxes manually is one solution but it is .. nasty?)

I'm thinking that of all the possible solutions one of the most potentially interesting would be to define a new hook: "idle-hook command .."

That way "command" would be executed every time the client is idle. (This is a distinct state unrelated to IMAP IDLE times.)

Note: There are already "mail_check" & "timeout" options. Even running a defined command immediately following the code for mail_check would be reasonable.

Reverse Proxy

I continue to use, love, and enjoy my node.js-based reverse HTTP proxy, and pub discussions seemed to suggest it is a great idea (due to flexibility) but it will never take off because people don't trust node.

I'm almost tempted to re-code it in Lua & C. But I can't help but think that would be a waste of time which would not increase adoption - after all most people use "simple" reverse proxies, and they are well served by Apache, nginx, or even varnish.

Still no rush I suppose.

In more personal news after living in this flat for 7 years, or so, I'm getting a new bathroom designed and deployed. Good times.

In the meantime I've been steadily watching Stargate SG-1 having recently purchased a box-set of series 1-10. I've just started series six this evening, and I'm enjoying it a lot.

ObQuote: "You have been recruited by the Star League to defend the frontier against Xur and the Ko-Dan armada. " - The Last Starfighter (1984). First film I ever saw at a cinema as a child.

Syndicated 2012-02-23 19:24:21 (Updated 2012-03-07 01:08:53) from Steve Kemp's Blog

Some domains just don't learn

For the past few years the anti-spam system I run has been based on a simplified version of something I previously ran commercially.

Although the code is similar in intent there were both explicit feature removals, and simplifications made.

Last month I re-implemented domain-blacklisting - because a single company keeps ignoring requests to remove me.

So LinkedIn.com if you're reading this:

  • I've never had an account on your servers.
  • I find your junk mail annoying.
  • I suspect I'll join your site/service when hell freezes over.

I've also implemented TLD-blacklisting which has been useful.

TLD-blacklisting in my world is not about blocking mail from foo@bar.ph (whether in the envelope sender, or the from: header), instead it is about matching the reverse DNS of the connecting client.

If I receive a connection and the reverse DNS of the connecting IP address matches, say, /\.sa$/i then I default to denying it.

My real list is longer, and handled via files:

steve@steve:~$ ls /srv/_global_/blacklisted/tld/ -C
ar  br  cn  eg  hr  in  kr  lv  mn  np  ph  ro  sg  tg  ua  ve  zw
aw  cc  cy  gm  hu  is  kz  ma  my  nu  pk  rs  sk  th  ug  vn
be  ch  cz  gr  id  it  lk  md  mz  nz  pl  ru  su  tr  uy  ws
bg  cl  ec  hk  il  ke  lt  mk  no  om  pt  sa  sy  tw  uz  za
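
A rough sketch of how a file-per-TLD lookup like this could work is below. It is an illustration only: the real check lives inside the SMTP server, and the function name and validation rules here are assumptions.

```shell
#!/bin/bash
# tld_blacklisted RDNS_NAME [BLACKLIST_DIR]
# Succeeds (exit 0) if the final label of RDNS_NAME has a matching file
# in BLACKLIST_DIR, mirroring a case-insensitive /\.tld$/ match.
tld_blacklisted() {
    local name=$1 dir=${2:-/srv/_global_/blacklisted/tld}
    local tld

    name=${name%.}                      # drop a trailing root dot, if any
    case "$name" in
        *.*) ;;                         # has at least one dot
        *)   return 1 ;;                # no TLD to match against
    esac
    tld=$(printf '%s' "${name##*.}" | tr 'A-Z' 'a-z')
    case "$tld" in
        ""|*[!a-z0-9-]*) return 1 ;;    # refuse anything that isn't a plain label
    esac
    [ -e "$dir/$tld" ]
}
```

With this layout, blacklisting another TLD is just a "touch" of a new file, with no configuration to reload.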

On average I'm rejecting about 2500 messages a day at SMTP-time, and 30 messages, or so, hit my SPAM folder after being accepted for delivery and then filtered with CRM114. (They are largely from @hotmail and @yahoo, along with random compromised machines. The number of times I see a single mail from a host with RDNS mysql.example.org is staggering.)

(Still looking forward to the development of Haraka, a node.js version of qpsmtpd.)

ObQuote: "Mr. Mystery Guest? Are you still there? " - Die Hard

Syndicated 2012-02-05 13:24:44 from Steve Kemp's Blog

So mega-upload is gone

So the site http://megaupload.com/ has been taken offline, amidst allegations of knowingly engaging in piracy.

There are probably a lot of legitimate users who have lost access to their uploaded files, even if they were offsite backups you can imagine a user owning a website which now has a million dead-links.

This reminds me of a conversation I overheard on Jon Dowland's blog - the summary is that he'd written a (useful) tool to extract attachments from Maildir folders and was wondering how to store and access those attachments. The upshot seemed to be magical URLs of the form:

  • https://file.example.com/sha1/509c2fe2eba509e93987c3024a74d74583c274bd

The comments covered an alternative which was hash:///sha1/xxxxxxxxxxxxxxxx, which then becomes close to the magnet: URI scheme.

I've not yet thought things through, but I can't help thinking that with the redundancy already present in the internet we should be looking at non-server-specific links. Yes there are times right now when you might want to address a specific file on a specific server - but otherwise? Wouldn't it be nice if you could just access a file from "anywhere" which happened to have the right contents?

Already my nonporn-but-definitely-adult-site makes its images available as /img/$md5sum.jpg - and similarly the storage at the back-end of my random image upload site uses SHA1 hashes to store the actual files.
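
Storing files under their content hash is simple enough to sketch. The function name and the "sha1/" sub-directory below are assumptions that mirror the example URL above. This is not the code my sites actually run.

```shell
#!/bin/bash
# store_by_hash FILE STORE_DIR
# Copies FILE to STORE_DIR/sha1/<sha1-of-contents> and prints that path.
# Identical contents always map to the same path, whatever the filename.
store_by_hash() {
    local file=$1 store=$2 hash
    hash=$(sha1sum < "$file" | cut -d' ' -f1)
    [ -n "$hash" ] || return 1          # unreadable file or no sha1sum
    mkdir -p "$store/sha1" || return 1
    cp -- "$file" "$store/sha1/$hash" || return 1
    echo "$store/sha1/$hash"
}
```

Any host holding the same bytes would produce the same name, which is what makes a server-independent link possible in principle.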

To make this more complete what we need is something that crawls the internet to find files by hash; then add support in browsers. Obviously this must be async and could introduce timing issues, but fundamentally it seems like a reasonable approach to the problem of a single host going offline.

(Consider what happens if imgur.com disappears. All those links would die, yet 99% of the images would still be available somewhere.)

I'm tempted to suggest a microformat but I need to consider the matter. Right now I'm going to immediately update my current image hosts to use, at the very least:

 <a href="/foo" rel="sha1:xxxxx md5sum:xxxx">
  <img src="foo.jpg" alt="img name">
 </a>

The unfortunate thing is you cannot have a 'rel="xx"' attribute for an image. So you either have to encode it in the parent link, or add it to the alt attribute which is suboptimal.

ObQuote: "Now, they tell me I paid my debt to society." - Ocean's Eleven (2001)

Syndicated 2012-01-21 12:42:37 from Steve Kemp's Blog

Some misc. updates


Today I made available a 3.2.0 kernel for my KVM guest which has a bastardised version of the PID hiding patch configured:

So now on my guest, as myself, I can only see this:

steve@steve:~$ ls -l /proc/ | egrep ' [0-9]+$'
dr-xr-xr-x  7 steve users          0 Jan 13 17:22 15150
dr-xr-xr-x  7 steve users          0 Jan 13 17:29 15739
dr-xr-xr-x  7 steve users          0 Jan 13 17:29 15740
lrwxrwxrwx  1 root  root          64 Jan 13 17:20 self -> 15739

Running as root I see the full tree:

steve:~#  ls -l /proc/ | egrep ' [0-9]+$'
total 0
dr-xr-xr-x  7 root        root                 0 Jan 13 17:20 1
dr-xr-xr-x  7 root        root                 0 Jan 13 17:20 1052
dr-xr-xr-x  7 root        root                 0 Jan 13 17:20 1086
dr-xr-xr-x  7 root        root                 0 Jan 13 17:20 1101
dr-xr-xr-x  7 root        root                 0 Jan 13 17:20 1104
dr-xr-xr-x  7 root        root                 0 Jan 13 17:21 1331
dr-xr-xr-x  7 pdnsd       proxy                0 Jan 13 17:21 14409
dr-xr-xr-x  7 root        root                 0 Jan 13 17:21 14519

This (obviously) affects output from top etc too. It is a neat feature which I think is worth having, but time will tell..


A long time ago I put together an Apache module which allowed the evaluation of security rules against incoming HTTP requests. mod_ifier was largely ignored by the world. But this week it did receive a little attention.

The recent rash of Hash Collision attacks inspired a fork with parameter filtering. Neat.

Otherwise nothing too much to report - though I guess I didn't actually share the link to the RESTful file store I mentioned previously. Should you care you can find it here:

ObQuote: "I saw a man, he danced with his wife" - Chicago, Frank Sinatra

Syndicated 2012-01-13 17:33:46 from Steve Kemp's Blog
