27 May 2015 zanee   » (Journeyer)

Haven't posted in a little while because I've been busy being a father and husband, handling contracts and other things. Recently I've been doing work with AWS at a company that isn't directly free-software related. Lately I've been working on an automated image build pipeline, where you can kick off and manage images in a consistent way, as well as exposing an endpoint that can then be used by other internal applications. Pretty straight-forward, non-exciting stuff.

Today I came across this tech paper over at Google on automated image building with Jenkins and Kubernetes. The process described there is overly convoluted, and creating an automated image pipeline doesn't need to be.

I'll provide a short synopsis of a better way to approach this problem and hopefully follow up with something a bit more concrete when I get the chance to get my blog back up:

Realistically, the problem with images at the end of the day is three-fold. One, it takes a very long time to provision a standard image, so you're dealing with time. Two, the image that has been provisioned eventually becomes stale, meaning the software on it needs security patches, bugfixes, etc. Three, once you have more than two or three images, you need a lifecycle for retiring, promoting, validating and testing them.

So your build process has to revolve around the lifecycle of whatever you need an image for. The best way to achieve this is to completely decouple the build process itself, and the best way to do that is with a message broker. So you have a message broker, in front of it you build a web client that is primarily used for publishing what you'd like your image to be, and finally you have consumer processes sitting in the background ready to chew on the workflow of building an image.
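As a rough illustration, here's a minimal sketch of the publishing side using RabbitMQ via pika. The queue name, message fields and connection details are all assumptions for the example, not anything from the actual pipeline:

```python
import json
import pika

# hypothetical build request; the fields are illustrative only
build_request = {
    "image_name": "base-webserver",
    "playbook": "playbooks/webserver.yml",
    "cloud": "aws",
}

# connect to the broker (host and queue names are assumptions)
connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="image-builds", durable=True)

# publish the request; a consumer picks it up and does the actual build
channel.basic_publish(
    exchange="",
    routing_key="image-builds",
    body=json.dumps(build_request),
    properties=pika.BasicProperties(delivery_mode=2),  # persist across broker restarts
)
connection.close()
```

The web client's only job is to drop a message like this on the queue; everything slow or failure-prone happens in the consumers.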

There is obviously a lot more to it than this (what's in an image? how do we manage these images? retire them? archive them? etc.) and I'll hopefully get some time to expound on all of that the way I did above. However, the most anyone should have to care about are the steps involved in provisioning, meaning "this is what I want installed on my image" or "this is what I want my image to look like". In the Google example that would be whatever the chef-solo steps involve; in my specific case I'm using Ansible (because it's better than Chef; yeahhhhh wanna fight?!).

Then you don't want to poll GitHub because, well.. why? Even if you wanted a new image whenever there was a change in your repo, polling is an inefficient way to drive a build pipeline. What happens if you publish a very trivial change, do you do a full rebuild just because of it? No, you don't want to do that, so just use git webhooks, as sketched below. I'm not sure, but it looks like HashiCorp's Atlas has a similar approach. Anyway, a webhook that publishes a simple message to a broker and lets a consumer process do the work is a better approach, especially because things will definitely fail in an image building pipeline, often enough that you simply need a way to handle that gracefully. All this, combined with the fact that no one wants to sit around watching the build output of software installing, makes for not a fun time.
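To make the webhook half concrete, here's a rough sketch of a receiver that feeds the broker, assuming Flask and pika. The route, queue name and branch filter are all assumptions, and the payload fields come from GitHub's push event:

```python
import json
import pika
from flask import Flask, request

app = Flask(__name__)

# hypothetical endpoint that a GitHub push webhook would POST to
@app.route("/hooks/github", methods=["POST"])
def github_push():
    payload = request.get_json()

    # only kick off a build for pushes to master (branch choice is an assumption)
    if payload.get("ref") != "refs/heads/master":
        return "", 204

    build_request = {
        "repo": payload["repository"]["full_name"],
        "commit": payload["after"],
    }

    # hand the work off to the broker instead of building inline
    connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="image-builds", durable=True)
    channel.basic_publish(
        exchange="",
        routing_key="image-builds",
        body=json.dumps(build_request),
        properties=pika.BasicProperties(delivery_mode=2),
    )
    connection.close()
    return "", 202

if __name__ == "__main__":
    app.run(port=8080)
```

The receiver returns as soon as the message is queued, so a slow or failed build never blocks the webhook.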

So yeah, let me get my shit together and post a simpler approach using Packer, RabbitMQ, some Python publisher/consumer code, Ansible and GitHub webhooks (if you're using GitHub). I'll do it with AWS and GCE. I can't link to the repo because it's private, unfortunately, BUT the method itself can be disclosed.
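Until then, here's a rough sketch of the consumer side, assuming each build request maps to a Packer template with an Ansible provisioner. The template path, variable names and queue name are made up for illustration:

```python
import json
import subprocess
import pika

def on_message(channel, method, properties, body):
    request = json.loads(body)

    # shell out to packer; the template and variable names are hypothetical
    returncode = subprocess.call([
        "packer", "build",
        "-var", "image_name=%s" % request.get("image_name", "unnamed"),
        "templates/base.json",
    ])

    if returncode == 0:
        # build succeeded, drop the message
        channel.basic_ack(delivery_tag=method.delivery_tag)
    else:
        # build failed; reject without requeue so a human (or a dead-letter
        # queue) can deal with it instead of the worker retrying forever
        channel.basic_nack(delivery_tag=method.delivery_tag, requeue=False)

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="image-builds", durable=True)
channel.basic_qos(prefetch_count=1)  # one build at a time per worker
channel.basic_consume(queue="image-builds", on_message_callback=on_message)
channel.start_consuming()
```

Because the consumer acks or nacks per message, failed builds stay visible in the broker instead of disappearing into a Jenkins console somewhere.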
