Older blog entries for Stevey (starting at number 717)

The selfish programmer

Once upon a time I wrote a piece of software for scheduling the classes available to a college.

There was a bug in the scheduler: Students who happened to be named 'Steve Kemp' had a significantly higher chance (>=80% IIRC) of being placed in lessons where the class makeup was more than 50% female.

This bug was never fixed. Which was nice, because I spent several hours both implementing and disguising this feature.

I was a bad coder when I was a teenager.

These days I'm still a bad coder, but in different ways.

Syndicated 2014-07-25 13:16:54 from Steve Kemp's Blog

An alternative to devilspie/devilspie2

Recently I was updating my dotfiles, because I wanted to ensure that media-players were "always on top" when launched, as this suits the way I work.

For many years I've used devilspie to script the placement of new windows, and once I googled a recipe I managed to achieve my aim.

However during the course of my googling I discovered that devilspie is unmaintained, and has been replaced by something using Lua - something I like.

I'm surprised I hadn't realized that the project was dead; although I've always hated the configuration syntax, it's something I've used on a constant basis since I found it.

Unfortunately the replacement, despite using Lua and despite being functional, just didn't seem to gel with me. So I figured "How hard could it be?"

In the past I've written software which iterated over all (visible) windows, and obviously I'm no stranger to writing Lua bindings.

However I did run into a snag. My initial implementation did two things:

  • Find all windows.
  • For each window invoke a lua script-file.

This worked. This worked well. This worked too well.

The problem I ran into was that if I wrote something like "Move window 'emacs' to desktop 2" that action would be applied over and over again. So if I launched emacs and then manually moved the window to desktop 3, it would jump back!

In short I needed to add a "stop()" function, which would cause further actions against a given window to cease. (By keeping a linked list of windows-to-ignore, and avoiding processing them.)
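
To make that concrete, here's a minimal sketch in kpie's Lua style - the primitives are those shown in the dump later in this post, and the "Emacs" class is just an example:

-- Naive rule: the script runs for every window, on every pass,
-- so the move is re-applied and a manually-moved window jumps back:
if ( window_class() == "Emacs" ) then
    workspace(2)
end

-- With stop() the rule fires once, then the window is ignored:
if ( window_class() == "Emacs" ) then
    workspace(2)
    stop()
end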

The code did work, but it felt wrong to have an ever-growing linked-list of processed windows. So I figured I'd look at the alternative - the original devilspie used libwnck to operate. That library allows you to nominate a callback to be executed every time a new window is created.

If you apply your magic only on a window-create event - well, you don't need to bother caching prior windows.

So in conclusion:

I think my code is better than devilspie2 because it is smaller, simpler, and does things more neatly - for example, instead of a function to get geometry and another to set it, I use one: "xy()" returns the position of a window, but xy(3,3) sets it.
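
As a quick sketch (I'm assuming here that the getter returns the two coordinates as a pair - the dump below shows the setter form for real):

-- one primitive for both directions
local x, y = xy()   -- query the current position (return-style assumed)
xy( x + 10, y )     -- move the window ten pixels to the right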

kpie also allows you to run as a one-off job, and using the simple primitives I wrote a script to dump your windows and their size/placement, which looks like this:

shelob ~/git/kpie $ ./kpie --single ./samples/dump.lua
-- Screen width : 1920
-- Screen height: 1080
..
if ( ( window_title() == "Buddy List" ) and
     ( window_class() == "Pidgin" ) and
     ( window_application() == "Pidgin" ) ) then
     xy(1536,24 )
     size(384,1032 )
     workspace(2)
end
if ( ( window_title() == "feeds" ) and
     ( window_class() == "Pidgin" ) and
     ( window_application() == "Pidgin" ) ) then
     xy(1,24 )
     size(1536,1032 )
     workspace(2)
end
..

As you can see that has dumped all my windows, along with their current state. This allows a simple starting-point - Configure your windows the way you want them, then dump them to a script file. Re-run that script file and your windows will be set back the way they were! (Obviously there might be tweaks required.)

I used that starting-point to define a simple recipe for configuring pidgin, which is more flexible than what I ever had with pidgin, and suits my tastes.

Bug-reports welcome.

Syndicated 2014-07-21 14:30:14 from Steve Kemp's Blog

Did you know xine will download and execute scripts?

Today I was poking around the source of Xine, the well-known media player. During the course of this poking I spotted that Xine has skin support - something I've been blissfully ignorant of for many years.

How do these skins work? You bring up the skin-browser - by default by pressing "Ctrl-d" - and it will show you previews of the skins available, and allow you to install them.

How does Xine know what skins are available? It downloads the contents of:

NOTE: This is an insecure URL.

The downloaded file is a simple XML thing, containing references to both preview-images and download locations.

For example the theme "Sunset" has the following details:

  • Download link: http://xine.sourceforge.net/skins/Sunset.tar.gz
  • Preview link: http://xine.sourceforge.net/skins/Sunset.png

If you choose to install the skin, the Sunset.tar.gz file is downloaded via HTTP, extracted, and the shell-script doinst.sh is executed, if present.

So if you control DNS on your LAN, you can execute arbitrary commands by persuading a victim to download your "corporate xine theme".

Probably a low-risk attack, but still a surprise.

Syndicated 2014-07-19 20:48:25 from Steve Kemp's Blog

So what can I do for Debian?

So I recently announced my intention to rejoin the Debian project, having been a member between 2002 & 2011 (inclusive).

In the past I resigned mostly due to lack of time, and what has changed is that these days I have more free time - primarily because my wife works in accident & emergency and has "funny shifts". This means we spend many days and evenings together, then she might work 8pm-8am for three nights in a row, which then becomes Steve-time, and can involve lots of time browsing reddit, coding obsessively, and watching bad TV (currently watching "Lost Girl". Shades of Buffy/Blood Ties/similar. Not bad, but not great.)

My NM-progress can be tracked here, and once accepted I have a plan for my activities:

  • I will minimally audit every single package running upon any of my personal systems.
  • I will audit as many of the ITP-packages I can manage.
  • I may, or may not, actually package software.

I believe this will be useful, even though there will be limits - I've no patience for PHP and will just ignore it, along with its ecosystem, for example.

As progress today I reported #754899 / CVE-2014-4978 against Rawstudio, and discussed some issues with ITP: tiptop (the program seems semi-expected to be installed setuid(0), but if it is then it will allow arbitrary files to be truncated/overwritten via "tiptop -W /path/to/file").

(ObRandom: still waiting for a CVE identifier for #749846/TS-2867..)

And now sleep.

Syndicated 2014-07-16 20:49:02 from Steve Kemp's Blog

A brief twitter experiment

So I've recently posted a few links on Twitter, and I see followers clicking them. But I also see random hits.

Tonight I posted a link to http://transient.email/, a domain I use for "anonymous" emailing, specifically to see which bots hit the URL.

Within two minutes I had 15 visitors, the first few of which were:

IP              | User-Agent                                                   | Request
199.16.156.124  | Twitterbot/1.0                                               | GET /robots.txt
199.16.156.126  | Twitterbot/1.0                                               | GET /robots.txt
54.246.137.243  | python-requests/1.2.3 CPython/2.7.2+ Linux/3.0.0-16-virtual  | HEAD /
74.112.131.243  | Mozilla/5.0 ()                                               | GET /
50.18.102.132   | Google-HTTP-Java-Client/1.17.0-rc (gzip)                     | HEAD /
50.18.102.132   | Google-HTTP-Java-Client/1.17.0-rc (gzip)                     | HEAD /
199.16.156.125  | Twitterbot/1.0                                               | GET /robots.txt
185.20.4.143    | Mozilla/5.0 (compatible; TweetmemeBot/3.0; +http://tweetmeme.com/) | GET /
23.227.176.34   | MetaURI API/2.0 +metauri.com                                 | GET /
74.6.254.127    | Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp) | GET /robots.txt

So what jumps out? The Twitterbot makes several requests for /robots.txt, but never actually fetches the page itself - which is interesting, because the supplied /robots.txt file does indeed prohibit fetching it.

A surprise was that both Google and Yahoo seem to follow Twitter links in almost real-time. Though the Yahoo! spider parsed and honoured /robots.txt, the Google spider seemed to make only HEAD requests - never actually looking for the content or the robots file.

In addition to this a bunch of hosts from the Amazon EC2 space made requests, which was perhaps not a surprise. Some automated processing, and classification, no doubt.

Anyway beer. It's been a rough weekend.

Syndicated 2014-07-13 19:08:04 from Steve Kemp's Blog

A partial perl-implementation of Redis

So recently I got into trouble running Redis on a host, because the data no longer fits into RAM.

As an interim measure I fixed this by bumping the RAM allocated to the guest, but a real solution was needed. I figure there are three real alternatives:

  • Migrate to Postgres, MySQL, or similar.
  • Use an alternative Redis implementation.
  • Do something creative.

Looking around I found a couple of Redis-alternatives, but I was curious to see how hard it would be to hack something useful myself, as a creative solution.

This evening I spotted Protocol::Redis, which is a Perl module for decoding/encoding data to/from a Redis server.
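
The wire protocol is simple enough that modules like this stay small: a command is just an array of bulk strings. As an illustration, here's a minimal sketch of the encoding side (in Lua; Protocol::Redis itself is Perl, and handles decoding too):

-- Encode a Redis command in RESP form, e.g.
--   encode("SET", "foo", "bar")
-- yields "*3\r\n$3\r\nSET\r\n$3\r\nfoo\r\n$3\r\nbar\r\n"
local function encode(...)
    local args = { ... }
    local out  = { "*" .. #args .. "\r\n" }
    for _, arg in ipairs(args) do
        arg = tostring(arg)
        out[#out + 1] = "$" .. #arg .. "\r\n" .. arg .. "\r\n"
    end
    return table.concat(out)
end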

Thinking "Aha!" I wired this module up to AnyEvent::Socket. The end result was predis - a Perl implementation of Redis.

It's a limited implementation which stores data in an SQLite database - each command mapping to simple SQL, as sketched after the list - and currently has support for:

  • get/set
  • incr/decr
  • del/ping/info
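
Each of those commands maps onto a trivial SQL statement; a sketch of the idea (the schema here is assumed for illustration - predis itself is Perl):

-- One key/value table, e.g.:
--   CREATE TABLE store ( key TEXT PRIMARY KEY, val TEXT );
local sql_for = {
    SET = "INSERT OR REPLACE INTO store (key, val) VALUES (?, ?)",
    GET = "SELECT val FROM store WHERE key = ?",
    DEL = "DELETE FROM store WHERE key = ?",
    -- INCR/DECR are a read-modify-write against the same row
}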

It isn't hugely fast, but it is fast enough, and it should be possible to use alternative backends in the future.

I suspect I'll not add sets/hashes, but it could be done if somebody was keen.

Syndicated 2014-07-11 20:36:53 from Steve Kemp's Blog

Blogspam moved, redis alternatives being examined

As my previous post suggested I'd been running a service for a few years, using Redis as a key-value store.

Redis is lovely. If your dataset will fit in RAM. Otherwise it dies hard.

Inspired by Memcached, which is a simple key=value store, Redis allows for more operations: sets, hashes, etc.

As it transpires I mostly set keys to values, so it crossed my mind last night that an alternative to rewriting the service to use a non-RAM-constrained store might be to juggle Redis out and replace it with something else.

If it were possible to have a Redis-compatible API which secretly stored the data in leveldb, sqlite, or even Berkeley DB, then that would solve my problem of RAM-constraints, and also be useful.

Looking around there are a few projects in this area - the nds fork of redis, ssdb, etc.

I was hoping to find a Perl Redis::Server module, but sadly nothing exists. I should look at the various node.js stub-servers which exist as they might be easy to hack too.

Anyway the short version is that this might be a way forward; the real solution might be to use SQLite or Postgres, but that would take a few days' work. For the moment the service has been moved to a donated guest, and has 2GB of RAM instead of the paltry 512MB it was running on previously.

Happily the server is installed/maintained by my slaughter tool, so reinstalling took about ten minutes - the only hard part was migrating the Redis contents, and that's trivial thanks to the integrated "slaveof" support. (I should write that up regardless though.)

Syndicated 2014-07-10 08:45:13 from Steve Kemp's Blog

What do you do when your free service is too popular?

Once upon a time I set up a centralized service for spam-testing blog/forum-comments in real time; that service is BlogSpam.net.

This was created because the Debian Administration site was getting hammered with bogus comments, as was my personal blog.

Today the unfortunate thing happened: the virtual machine this service was running on ran out of RAM and died - the Redis store that holds all the state has now exceeded the paltry 512MB allocated to the guest, so the OOM-killer killed it.

So I'm at an impasse: either I recode it to use MySQL, or something similar that allows the backing store to exceed the RAM-size, or I shut the thing down.

There seems to be virtually no likelihood that somebody would sponsor a host to run the service, because people just don't pay for this kind of service.

I've temporarily given the guest 1GB of RAM, but that comes at a cost: I've had to shut down my "builder" host, which is used to build Debian packages via pbuilder.

Offering an API, for free, which has become increasingly popular and yet equally gets almost zero feedback or "thanks" is a bit of a double-edged sword. Because it has so many users it provides a better service - but equally costs more to run in terms of time, effort, and attention.

(And I just realized over the weekend that my Flattr account is full of money (~50 euro) that I can't withdraw - since I deleted my paypal account last year. Ooops.)

Meh.

Happy news? I'm avoiding the issue of free services indefinitely with the git-based DNS product, which was covering costs and now is .. doing better.

Syndicated 2014-07-08 09:56:59 from Steve Kemp's Blog

And so it begins ...

1. This weekend I will apply to rejoin the Debian project, as a developer.

2. In the meantime I've begun releasing some of the code which powers the git-based DNS hosting site/service.

3. This is the end of my list.

4. I lied. This is the end of my list. Powers of two, baby.

Syndicated 2014-07-03 21:00:04 from Steve Kemp's Blog

Slowly releasing bits of code

As previously mentioned I've been working on git-based DNS hosting for a while now.

The site was launched for real last Sunday, and since that time I've received enough paying customers to cover all the costs, which is nice.

Today the site was "relaunched", by which I mean almost nothing has changed, except the site looks completely different - since the templates/pages/content is all wrapped up in Bootstrap now, rather than my ropy home-made table-based layout.

This week coming I'll be slowly making some of the implementation available as free-software, having made a start by publishing CGI::Application::Plugin::AB - a module designed for very simple A/B testing.

I don't think there will be too much interest in most of the code, but one piece I'm reasonably happy with is my webhook-receiver.

Webhooks are at the core of how my service is implemented:

  • You create a repository to host your DNS records.
  • You configure a webhook to be invoked when pushing to that repository.
  • The webhook will then receive updates, and magically update your DNS.

Because the webhook must respond quickly - otherwise github/bitbucket/whatever will believe you've timed out and report an error - you can't do much work on the back-end.

Instead I've written a component that listens for incoming HTTP POSTs, parses the body to determine which repository it came from, and then enqueues the data for later processing.

A different process constantly polls the job-queue (which in my case is Redis, but could be beanstalkd or similar - hell, even MySQL if you're a masochist) and actually does the necessary magic.
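
The shape of that pairing is simple: the receiver enqueues and returns immediately, while the worker blocks on the queue. A sketch (in Lua for illustration, with a hypothetical redis client handle and process() function - my real implementation is Perl):

-- receiver side: store the raw POST body and return at once
local function enqueue(redis, body)
    redis:lpush("jobs", body)
end

-- worker side: block until a job arrives, then do the real work
local function work(redis, process)
    while true do
        local job = redis:brpop("jobs", 0)  -- returns { queue-name, payload }
        process(job[2])
    end
end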

Most of the webhook processor is trivial, but handling different services (github, bitbucket, etc) while pretending they're all the same is hard. So my little toy-service will be released next week and might be useful to others.

ObRandom: New pricing unveiled for users of a single zone - which is a case I never imagined I'd need to cover. Yay for A/B testing :)

Syndicated 2014-06-29 13:37:41 from Steve Kemp's Blog
