Older blog entries for joey (starting at number 556)

what does docker.io run -it debian sh run?

When you type docker.io run -it debian sh, it goes off and gets "debian" and runs it. But what is in this "debian" image? How was it built?

The docker hub does not really say. All it tells us is this is a "(Semi) Official Debian base image" and that its sources.list uses http.debian.net for geolocation.

There's a link to https://github.com/dotcloud/stackbrew/blob/master/library/debian which in turn uses a very strange git repository, owned by Debian maintainer Tianon Gravi, that contains compressed tarballs of Debian: http://github.com/tianon/docker-brew-debian "Git is not a fan of what we're doing here."

The "source", such as it is, that is used to build this image consists of:

FROM scratch
ADD rootfs.tar.xz /
CMD ["/bin/bash"]


mkimage.sh -t tianon/debian:wheezy -d . debootstrap --variant=minbase --components=main --include=inetutils-ping,iproute wheezy http://http.debian.net/debian

I don't know where mkimage.sh is. And anyway, I have no reason to trust that this image is built the way it claims to be built. So, the question remains: What is in this image?

To find out, I did a debootstap --variant=minbase stable and diffed the entire docker debian image against it. The diff was 6738 lines, from which I found the following interesting differences.

added packages

The image has iputils-ping and netbase and iproute added. These are not in a minbase debootstrap, but are in a regular debootstrap. It's rather weird that the docker image is based on a minbase debootstrap, since this means they have to add back important stuff like this on an ad-hoc basis.

If the expectation is that an experienced Unix person who found it missing would say "What on earth is going on, where is 'foo'?", it must be an 'important' package. -- Debian Policy

apt hooks

DPkg::Post-Invoke { "rm -f /var/cache/apt/archives/*.deb /var/cache/apt/archives/partial/*.deb /var/cache/apt/*.bin || true"; };
APT::Update::Post-Invoke { "rm -f /var/cache/apt/archives/*.deb /var/cache/apt/archives/partial/*.deb /var/cache/apt/*.bin || true"; };

Dir::Cache::pkgcache "";
Dir::Cache::srcpkgcache "";

Acquire::Languages "none";

These are some strange modifications to apt's config. The intent is clearly to avoid wasting disk space, even at the expense of making apt slower (by disabling caches) and losing translations.

I am curious if apt might ever invoke the DPkg::Post-Invoke twice in an upgrade in which it runs dpkg twice. I'm also curious whether deleting /var/cache/apt/archives/lock could cause a problem.


dpkg is configured to use unsafe-io.


Linux viper 3.12.20-gentoo #1 SMP Sun May 18 12:36:24 MDT 2014 x86_64

Yes, that's "gentoo". Presumably this tells us something about the build host.


/usr/sbin/policy-rc.d contains "exit 101", which prevents daemons from being automatically started after they are installed. This may or may not be desirable, depending on what you're doing with docker.

It notably also prevents restarting running daemons in this container if they're upgraded for eg, a security fix. It would almost certianly be better if this script allowed restarting running daemons.


/sbin/initctl is diverted and replaced with /bin/true. This is a workaround for a bug in sysvinit; when upgraded inside a docker container it hangs while trying to run initctl.

missing devices

Several devices are missing, including /dev/full, /dev/null, /dev/zero, /dev/[u]random, /dev/shm, and /dev/loopN. Docker probably fixes some of these up when running the image, but it certianly works ok with an image with all these missing devices present, so I don't know the rationalle for omitting them.

(See also this bug)

some gpg thing is different

Binary files pure-debootstrap/etc/apt/trustdb.gpg and from-docker/etc/apt/trustdb.gpg differ

Oh well, that can't be important.. Or can it? I did not check.


I would hardly consider this to be an "(Semi) Official Debian image". Some of the changes are quite dubious. The build environment is not Debian. There is no guarantee you'll get the same image I examined. Diffing thousands of lines of filesystem changes is not particularly fun or reliable way to spot accidental or malicious changes.

I'd recommend only trusting docker images you build yourself. I have some docker images published somewhere that are built with 100% straight debootstrap with no modifications (and even an armel image that can be used on an x86 system thanks to qemu). But I'm not going to link to them, because again, you should only trust docker images you built yourself. To help increase your mistrust of me, I present this IRC snippet:

I'll bet I could publish an image that just did a killall5 as root on startup and get plenty of people to nuke their container hosts

Here are some ideas for things Debian could do to improve this:

  • Make a package that can build docker images of Debian, in a fully reproducible fashion. Ie, same versions of debs in, same byte stream out.
  • If it makes sense for the docker image to not contain all the packages in a standard debootstrap, or to contain other packages, write down the rationalle for this, and make a --variant=docker.
  • Make a package that provides appropriate tweaks for Debian in a container. This might include a policy-rc.d that allows restarting daemons on upgrade if they're already running in the container, and otherwise prevents running daemons.
  • Make a low-disk-space package that eg, prevents apt from caching debs.
  • Provide some way to verify, through gpg signatures, that docker has pulled an actual trusted image and not some https-MITMed thing.

PS, if this wasn't enough fun, just consider the tweaks made to the "Debian" images on all the VPS hosts out there.

Syndicated 2014-06-19 15:58:26 from see shy jo

how I wrote init by accident

I wrote my own init. I didn't mean to, and in the end, it took 2 lines of code. Here's how.

Propellor has the nice feature of supporting provisioning of Docker containers. Since Docker normally runs just one command inside the container, I made the command that docker runs be propellor, which runs inside the container and takes care of provisioning it according to its configuration.

For example, here's a real live configuration of a container:

        -- Exhibit: kite's 90's website.
        , standardContainer "ancient-kitenet" Stable "amd64"
                & Docker.publish "1994:80"
                & Apt.serviceInstalledRunning "apache2"
                & Git.cloned "root" "git://kitenet-net.branchable.com/" "/var/www"
                        (Just "remotes/origin/old-kitenet.net")

When propellor is run inside this container, it takes care of installing apache, and since the property states apache should be running, it also starts the daemon if necessary.

At boot, docker remembers the command that was used to start the container last time, and runs it again. This time, apache is already installed, so propellor simply starts the daemon.

This was surprising, but it was just what I wanted too! The only missing bit to make this otherwise entirely free implementation of init work properly was two lines of code:

                void $ async $ job reapzombies
        reapzombies = void $ getAnyProcessStatus True False

Propellor-as-init also starts up a simple equalivilant of rsh on a named pipe (for communication between the propellor inside and outside the container), and also runs a root login shell (so the user can attach to the container and administer it). Also, running a compiled program from the host system inside a container, which might use a different distribution or architecture was an interesting challenge (solved using the method described in completely linux distribution-independent packaging). So it wasn't entirely trivial, but as far as init goes, it's probably one of the simpler implementations out there.

I know that there are various other solutions on the space of an init for Docker -- personally I'd rather the host's systemd integrated with it so I could see the status of the container's daemons in systemctl status. If that does happen, perhaps I'll eventually be able to remove 2 lines of code from propellor.

Syndicated 2014-05-14 15:05:49 from see shy jo

who needs whiteboards when you have strange seed pods from the jungle


Discussing git-annex routing with Vince and Fernao. Might not look like much, but we seem to be close to cracking the most interesting problem with git-annex routing. I need to translate and read Vince's thesis and build some simulations..

(Seed pod, cup, camera = fixed node; mini brick = usb drive; leaves = data.)

Syndicated 2014-05-01 17:11:37 from see shy jo

radio brazil

Live radio broadcast going on in the tent while I teach Mill to use ikiwiki.

Syndicated 2014-04-30 15:06:01 from see shy jo

the real Brazil

(click to enlarge)

"This is the real Brazil" -- Fernao

After 3 days of travel, including 22 hours driving from Brasilia to the coast of Bahia, I can't say much more. A few things not pictured above..

The two hours of car-swallowing potholes in an arrow-straight highway in a flat plain, playing chicken against oncoming trucks.

Night swim in the river in Correntina, followed by a whole grilled fish and ice cold guarana. Bliss.

The first sight of ocean waves and the drum circle last night.

Syndicated 2014-04-29 18:03:28 from see shy jo

propellor-driven DNS and backups

Took a while to get here, but Propellor 0.4.0 can deploy DNS servers and I just had it deploy mine. Including generating DNS zone files.

Configuration is dead simple, as far as DNS goes:

     & Dns.secondary hosts "joeyh.name"
                & Dns.primary hosts "example.com"
                        ( Dns.mkSOA "ns1.example.com" 100
                                [ NS (AbsDomain "ns1.example.com")
                                , NS (AbsDomain "ns2.example.com")
                        ) []

The awesome thing is that propellor fills in all the other information in the zone file by looking at the properties of the hosts it knows about.

 , host "blue.example.com"
        & ipv4 ""
        & ipv6 "fe80::26fd:52ff:feea:2294"

        & alias "example.com"
        & alias "www.example.com"
        & alias "example.museum"
        & Docker.docked hosts "webserver"
            `requres` backedup "/var/www"

When it sees this host, Propellor adds its IP addresses to the example.com DNS zone file, for both its main hostname ("blue.example.com"), and also its relevant aliases. (The .museum alias would go into a different zone file.)

Multiple hosts can define the same alias, and then you automaticlly get round-robin DNS.

The web server part of of the blue.example.com config can be cut and pasted to another host in order to move its web server to the other host, including updating the DNS. That's really all there is to is, just cut, paste, and commit!

I'm quite happy with how that worked out. And curious if Puppet etc have anything similar.

One tricky part of this was how to ensure that the serial number automtically updates when changes are made. The way this is handled is Propellor starts with a base serial number (100 in the example above), and then it adds to it the number of commits in its git repository. The zone file is only updated when something in it besides the serial number needs to change.

The result is nice small serial numbers that don't risk overflowing the (so 90's) 32 bit limit, and will be consistent even if the configuration had Propellor setting up multiple independent master DNS servers for the same domain.

Another recent feature in Propellor is that it can use Obnam to back up a directory. With the awesome feature that if the backed up directory is empty/missing, Propellor will automcatically restore it from the backup.

Here's how the backedup property used in the example above might be implemented:

backedup :: FilePath -> Property
backedup dir = Obnam.backup dir daily
    [ "--repository=sftp://rsync.example.com/~/webserver.obnam"
    ] Obnam.OnlyClient
    `requires` Ssh.keyImported SshRsa "root"
    `requires` Ssh.knownHost hosts "rsync.example.com" "root"
    `requires` Gpg.keyImported "1B169BE1" "root"

Notice that the Ssh.knownHost makes root trust the ssh host key belonging to rsync.example.com. So Propellor needs to be told what that host key is, like so:

 , host "rsync.example.com"
        & ipv4 ""
        & sshPubKey "ssh-rsa blahblahblah"

Which of course ties back into the DNS and gets this hostname set in it. But also, the ssh public key is available for this host and visible to the DNS zone file generator, and that could also be set in the DNS, in a SSHFP record. I haven't gotten around to implementing that, but hope at some point to make Propellor support DNSSEC, and then this will all combine even more nicely.

By the way, Propellor is now up to 3 thousand lines of code (not including Utility library). In 20 days, as a 10% time side project.

Syndicated 2014-04-19 07:08:45 from see shy jo

propellor introspection for DNS

In just released Propellor 0.3.0, I've improved improved Propellor's config file DSL significantly. Now properties can set attributes of a host, that can be looked up by its other properties, using a Reader monad.

This saves needing to repeat yourself:

hosts = [ host "orca.kitenet.net"
        & stdSourcesList Unstable
        & Hostname.sane -- uses hostname from above

And it simplifies docker setup, with no longer a need to differentiate between properties that configure docker vs properties of the container:

 -- A generic webserver in a Docker container.
    , Docker.container "webserver" "joeyh/debian-unstable"
        & Docker.publish "80:80"
        & Docker.volume "/var/www:/var/www"
        & Apt.serviceInstalledRunning "apache2"

But the really useful thing is, it allows automating DNS zone file creation, using attributes of hosts that are set and used alongside their other properties:

hosts =
    [ host "clam.kitenet.net"
        & ipv4 ""

        & cname "openid.kitenet.net"
        & Docker.docked hosts "openid-provider"

        & cname "ancient.kitenet.net"
        & Docker.docked hosts "ancient-kitenet"
    , host "diatom.kitenet.net"
        & Dns.primary "kitenet.net" hosts

Notice that hosts is passed into Dns.primary, inside the definition of hosts! Tying the knot like this is a fun haskell laziness trick. :)

Now I just need to write a little function to look over the hosts and generate a zone file from their hostname, cname, and address attributes:

extractZoneFile :: Domain -> [Host] -> ZoneFile
extractZoneFile = gen . map hostAttr
  where gen = -- TODO

The eventual plan is that the cname property won't be defined as a property of the host, but of the container running inside it. Then I'll be able to cut-n-paste move docker containers between hosts, or duplicate the same container onto several hosts to deal with load, and propellor will provision them, and update the zone file appropriately.

Also, Chris Webber had suggested that Propellor be able to separate values from properties, so that eg, a web wizard could configure the values easily. I think this gets it much of the way there. All that's left to do is two easy functions:

overrideAttrsFromJSON :: Host -> JSON -> Host

exportJSONAttrs :: Host -> JSON

With these, propellor's configuration could be adjusted at run time using JSON from a file or other source. For example, here's a containerized webserver that publishes a directory from the external host, as configured by JSON that it exports:

demo :: Host
demo = Docker.container "webserver" "joeyh/debian-unstable"
    & Docker.publish "80:80"
    & dir_to_publish "/home/mywebsite" -- dummy default
    & Docker.volume (getAttr dir_to_publish ++":/var/www")
    & Apt.serviceInstalledRunning "apache2"

main = do
    json <- readJSON "my.json"
    let demo' = overrideAttrsFromJSON demo
    writeJSON "my.json" (exportJSONAttrs demo')
    defaultMain [demo']

Syndicated 2014-04-11 05:05:54 from see shy jo

Kite: a server's tale

My server, Kite, is finishing its 20th year online.

It started as kite.resnet.cornell.edu, a 486 under the desk in my dorm room. Early on, it bounced around the DNS -- kite.ithaca.ny.us, kite.ml.org, kite.preferred.com -- before landing on kite.kitenet.net. The hardware has changed too, from a succession of desktop machines, it eventually turned into a 2u rack-mount server in the CCCP co-op. And then it went virtual, and international, spending a brief time in Amsterdam, before relocating to England and the kvm-hosting co-op.

Through all this change, and no few reinstalls from scratch, it's had a single distinct personality. This is a multi-user unix system, of the old school, carefully (and not-so-carefully) configured and administered to perform a grab-bag of functions. Whatever the users need.

I read the olduse.net hacknews newsgroup, and I see, in their descriptions of their server in 1984, the prototype of Kite and all its ilk.

It's consistently had a small group of users, a small subset of my family and friends. Not quite big enough to really turn into a community, and we wall and talk less than we once did.

Exhibit: Kite as it appeared in the 90's

[Intentionally partially broken, being able to read the cgi source code is half the fun.]

Kite was an early server on the WWW, and garnered mention in books and print articles. Not because it did anything important, but because there were few enough interesting web sites that it slightly stood out.

Many times over these 20 years I've wondered what will be the end of Kite's story. It seemed like I would either keep running it indefinitely, or perhaps lose interest. (Or funding -- it's eaten a lot of cash over the years, especially before the current days of $5/month VPS hosting.) But I failed to anticipate what seems to really be happening to it. Just as I didn't fathom, when kite was perched under my desk, that it would one day be some virtual abstract machine in a unknown computer in anther country.

Now it seems that what will happen to Kite is that most of the important parts of it will split off into a constellation of specialized servers. The website, including the user sites, has mostly moved to branchable.com. The DNS server, git server and other crucial stuff is moving to various VPS instances and containers. (The exhibit above is just one more automatically deployed, soulless container..) A large part of Kite has always been about me playing with bleeding-edge stuff and installing random new toys; that has moved to a throwaway personal server at cloudatcost.com which might be gone tomorrow (or might keep running for free for years).

What it seems will be left is a shell box, with IMAP access to a mail server, and a web server for legacy /~user/ sites, and a few tools that my users need (including that pine program some of them are still stuck on.)

Will it be worth calling that Kite?

[ Kite users: This transition needs to be done by December when the current host is scheduled to be retired. ]

Syndicated 2014-04-10 15:17:38 from see shy jo

propellor type-safe reversions

Propellor ensures that a list of properties about a system are satisfied. But requirements change, and so you might want to revert a property that had been set up before.

For example, I had a system with a webserver container:

  Docker.docked container hostname "webserver"

I don't want a web server there any more. Rather than having a separate property to stop it, wouldn't it be nice to be able to say:

  revert (Docker.docked container hostname "webserver")

I've now gotten this working. The really fun part is, some properies support reversion, but other properties certianly do not. Maybe the code to revert them is not worth writing, or maybe the property does something that cannot be reverted.

For example, Docker.garbageCollected is a property that makes sure there are no unused docker images wasting disk space. It can't be reverted. Nor can my personal standardSystem Unstable property, which amoung other things upgrades the system to unstable and sets up my home directory..

I found a way to make Propellor statically check if a property can be reverted at compile time. So revert Docker.garbageCollected will fail to type check!

The tricky part about implementing this is that the user configures Propellor with a list of properties. But now there are two distinct types of properties, revertable ones and non-revertable ones. And Haskell does not support heterogeneous lists..

My solution to this is a typeclass and some syntactic sugar operators. To build a list of properties, with individual elements that might be revertable, and others not:

        & standardSystem Unstable
        & revert (Docker.docked container hostname "webserver")
        & Docker.docked container hostname "amd64-git-annex-builder"
        & Docker.garbageCollected

Syndicated 2014-04-02 17:09:02 from see shy jo

adding docker support to propellor

Propellor development is churning away! (And leaving no few puns in its wake..)

Now it supports secure handling of private data like passwords (only the host that owns it can see it), and fully end-to-end secured deployment via gpg signed and verified commits.

And, I've just gotten support for Docker to build. Probably not quite work, but it should only be a few bugs away at this point.

Here's how to deploy a dockerized webserver with propellor:

host hostname@"clam.kitenet.net" = Just
    [ Docker.configured
    , File.dirExists "/var/www"
    , Docker.hasContainer hostname "webserver" container

container _ "webserver" = Just $ Docker.containerFromImage "joeyh/debian-unstable"
        [ Docker.publish "80:80"
        , Docker.volume "/var/www:/var/www"
        , Docker.inside
            [ serviceRunning "apache2"
                `requires` Apt.installed ["apache2"]

Docker containers are set up using Properties too, just like regular hosts, but their Properties are run inside the container.

That means that, if I change the web server port above, Propellor will notice the container config is out of date, and stop the container, commit an image based on it, and quickly use that to bring up a new container with the new configuration.

If I change the web server to say, lighttpd, Propellor will run inside the container, and notice that it needs to install lighttpd to satisfy the new property, and so will update the container without needing to take it down.

Adding all this behavior took only 253 lines of code, and none of it impacts the core of Propellor at all; it's all in Propellor.Property.Docker. (Well, I did need another hundred lines to write a daemon that runs inside the container and reads commands to run over a named pipe... Docker makes running ad-hoc commands inside a container a PITA.)

So, I think that this vindicates the approach of making the configuration of Propellor be a list of Properties, which can be constructed by abitrarily interesting Haskell code. I didn't design Propellor to support containers, but it was easy to find a way to express them as shown above.

Compare that with how Puppet supports Docker: http://docs.docker.io/en/latest/use/puppet/

docker::run { 'helloworld':
  image        => 'ubuntu',
  command      => '/bin/sh -c "while true; do echo hello world; sleep 1; done"',
  ports        => ['4444', '4555'],

All puppet manages is running the image and a simple static command inside it. All the complexities that puppet provides for configuring servers cannot easily be brought to bear inside the container, and a large reason for that is, I think, that its configuration file is just not expressive enough.

Syndicated 2014-04-01 08:22:41 from see shy jo

547 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!