Older blog entries for robertc (starting at number 144)

Using UEC instead of EC2


So, we wanted to move a Hudson CI server at Canonical from using chroots to VMs (for better isolation and security), and there is this great product Ubuntu Enterprise Cloud (UEC – basically Eucalyptus). To do this I needed to make some changes to the Hudson EC2 plugin – and that's where the fun starts. While I focus on getting Hudson up and running with UEC in this post, folk generally interested in the differences between UEC and EC2, or in getting a single-machine UEC instance up for testing, should also find this useful.

Firstly, getting a test UEC instance installed was a little tricky – I only had one machine to deploy it on, and this is an unusual configuration. Nicely though, it all worked, once a few initial bugs and misconfiguration items got fixed up. I wrote up the crux of the outcome on the Ubuntu community help wiki. See ‘1 Physical system’. The particular trap to watch out for seems to be that this configuration is not well tested, so the installation scripts have a hard time getting it right. I haven’t tried to make it play nice with Network Manager in the loop, but I’m pretty sure that that can be done via interface aliasing or something similar.

Secondly I needed to find out what was different between EC2 and UEC (Note that I was running on Karmic (Ubuntu 9.10) – so things could be different in Lucid). I couldn’t find a simple description of this, so this list may be incomplete:

  1. UEC runs an old version of the EC2 API. This is because it hasn’t implemented everything in the new API versions yet.
  2. UEC defaults to port 8773, not port 80 (for both the EC2 and S3 APIs).
  3. The EC2 and S3 APIs are rooted differently: at AWS they are at /, for UEC they are at /services/Eucalyptus and /services/Walrus.
  4. UEC doesn’t supply an SSL API port as far as I can tell.
  5. DescribeImages has something wonky with it.
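The port and path differences map naturally onto the EC2_URL and S3_URL environment variables that the eucarc file from UEC uses. A minimal sketch, assuming a UEC front end at 192.168.1.1 (substitute your own cloud controller's address):

```shell
# Hypothetical UEC cloud controller address -- adjust to your install.
UEC_HOST=192.168.1.1

# Unlike AWS (port 80, APIs rooted at /), UEC serves both APIs on
# port 8773, under distinct base paths:
export EC2_URL="http://$UEC_HOST:8773/services/Eucalyptus"
export S3_URL="http://$UEC_HOST:8773/services/Walrus"

echo "$EC2_URL"
echo "$S3_URL"
```

Anything pointed at these endpoints (euca2ools, or the Hudson plugin configuration below) needs both the port and the base path to be right, or connections will simply fail.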

So the next step then is to modify the Hudson EC2 plugin to support these differences. Fortunately it is in Java, and the Java community has already updated the various libraries (jets3t and typica) to support UEC – I just needed to write a UI for the differences and pass the info down the various code paths. Kohsuke has let me land this now even though it has an average UI (in rev 27366), and I’m going to make the UI better now by consolidating all the little aspects into a couple of URLs. Folk comfortable with building their own .hpi can get this now by svn updating and rebuilding the ec2 plugin. We’ve also filed another bug asking for a single API call to establish the endpoints, so that it’s even easier for users to set this up.

Finally, and this isn’t a UEC difference, I needed to modify the Hudson EC2 plugin to work with the ubuntu user rather than root, as Ubuntu AMIs ship with root disabled (as all Ubuntu installs do). I chose to have Hudson re-enable root, rather than making everything work without root, because the current code paths assume they can scp things as root, so this was less disruptive.
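The usual way to re-enable root on an Ubuntu image is to copy the ubuntu user's authorized_keys over root's. A hedged sketch of that idea, written as a function over arbitrary home directories so it can be exercised outside a VM – the plugin's actual commands may well differ:

```shell
# Hypothetical helper: copy one account's ssh keys into another
# account's home, restoring direct ssh access to it. On a real
# instance this would be run via sudo as:
#   enable_root_ssh /home/ubuntu /root
enable_root_ssh() {
    user_home="$1"
    root_home="$2"
    mkdir -p "$root_home/.ssh"
    cp "$user_home/.ssh/authorized_keys" "$root_home/.ssh/authorized_keys"
    chmod 600 "$root_home/.ssh/authorized_keys"
}
```

After this, scp and ssh as root work again, which is all the existing plugin code paths need.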

With all that done, it’s now possible to configure a Hudson instance to run tests on UEC nodes. Here’s how:

  1. Install UEC and make sure you can run up instances using euca-run-instances, ssh into them and that networking works for you. Make sure you have installed at least one image (EMI aka AMI) to run tests on. I used the vanilla in-store UEC Karmic images.
  2. Install Hudson and the EC2 plugin (you’ll need to build your own until a new release (1.6) is made).
  3. Go to /configure and near the bottom click on ‘Add a new cloud’ and choose Amazon EC2.
  4. Look in ~/.euca/eucarc, or in the zip file that the UEC admin web page lets you download, to get at your credentials. Fill in the Access Key and Secret Access Key fields accordingly. You can put in the private key (UEC holds onto the public half) that you want to use, or (once the connection is fully set up) use the ‘Generate Key’ button to have a dedicated Hudson key created. I like to use one that I can ssh into to look at a live node – YMMV. (Or you could add a user and as many keys as you want in the init script – more on that in a second).
  5. Click on Advanced, this will give you a bunch of details like ‘EC2 Endpoint hostname’. Fill these out.
  6. Sensible values for a default UEC install are: 8773 for both ports, /services/Eucalyptus and /services/Walrus for the base URLs, and SSL turned off. (Note that the online help tells you this as well).
  7. Set an instance cap, unless you truly have unlimited machines. E.g. 5, to run at most 5 VMs at a time.
  8. Click on ‘Test Connection’ – it should pretty much instantly say ‘Success’.
  9. That’s the Cloud itself configured; now we configure the VMs that Hudson is willing to start. Click on ‘Add’ right above the ‘List of AMIs to be launched as slaves’ text.
  10. Fill out the AMI with your EMI – e.g. emi-E027107D is the Ubuntu 9.10 image I used.
  11. For remote FS root, just put /hudson or something, unless you have a preseeded area (e.g. with a shared bzr repo or something) inside your image.
  12. For description, describe the intent of the image – e.g. ‘DB test environment’.
  13. For the labels, put one or more tags that you will use to tell test jobs they should run on this instance. They can be the same as labels on physical machines – the cloud will act as an overflow buffer. If no physical machines exist, a VM will be spawned when needed. For testing I put ‘euca’.
  14. The init script is a little more complex. You need to configure Java so that Hudson itself can run:
    # sun-java6-jre lives in multiverse, so enable it first:
    cat >> /etc/apt/sources.list << EOF
    deb http://archive.ubuntu.com/ubuntu/ karmic multiverse
    deb http://archive.ubuntu.com/ubuntu/ karmic-updates multiverse
    deb http://archive.ubuntu.com/ubuntu/ karmic-security multiverse
    EOF
    export http_proxy=http://192.168.1.1:8080/
    export DEBIAN_FRONTEND=noninteractive
    apt-get update
    # Pre-accept the Sun Java licence so the install stays non-interactive:
    echo "buildd shared/accepted-sun-dlj-v1-1 boolean true" | debconf-set-selections
    apt-get install -y -f sun-java6-jre
    

    Note that I have included my local HTTP proxy there – just remove that line if you don’t have one.

  15. Click on Advanced, to get at the less-common options.
  16. For remote user, put ‘ubuntu’, and for root command prefix put ‘sudo’.
  17. For number of executors, you are essentially choosing the number of CPUs that the instance will request. E.g. putting 20 will ask for an extra-large high-CPU model machine when it deploys. This will then show up as 20 workers on the same machine.
  18. Click save :)
  19. Now, when you add a job, a new option will appear in the job configuration – ‘tie this job to a node’. Select one of the label(s) you put in for the AMI, and running the job will cause that instance to start up if it’s not already available.

Note that Hudson will try to fetch Java from S3 if you don’t install it, but that won’t work right for a few reasons – I’ll be filing an issue in the Hudson tracker about it, as that’s a bit of unusual structure in the existing code that I’m happier leaving well enough alone :) .

Syndicated 2010-02-11 01:41:53 from Code happens

Is a code of silence evil?


Looking at using google apps for my home email, as I want to be able to have my home machines totally turned off from time to time.

Found this interesting gem in the sign up agreement (which I have not yet agreed to :P ):

11. PR. Customer agrees not to issue any public announcement regarding the existence or content of this Agreement without Google’s prior written approval. Google may (i) include Customer’s Brand Features in presentations, marketing materials, and customer lists (which includes, without limitation, customer lists posted on Google’s web sites and screen shots of Customer’s implementation of the Service) and (ii) issue a public announcement regarding the existence or content of this Agreement. Upon Customer’s request, Google will furnish Customer with a sample of such usage or announcement.

This is rather asymmetrical: If I agree to the sign up page, I cannot say ‘I am using google apps’, but google can say ‘Robert is using google apps’. While I can appreciate not wanting to be dissed on if something goes wrong, this is very much not open! A couple of implications: Everyone seeking support for google apps in the apps forums is probably in violation of the sign up agreement; we can assume that anyone having a terrible experience has been squelched under this agreement.

Le sigh.

Syndicated 2010-02-09 08:47:47 from Code happens

Adding new languages to Ubuntu


Scott recently noted that we don’t have Klingon available in Ubuntu. Klingon is available in ISO 639, so adding it should be straightforward.

Last time I blogged about this three packages needed changing, as well as Launchpad needing a translation team for the language. The situation is a little better now: only two packages need changing as gdm now dynamically looks for languages based on installed locales.

libx11 still needs changing – a minimal diff would be:

=== modified file 'nls/compose.dir.pre'
--- libx11-1.2.1/nls/compose.dir.pre
+++ libx11-1.2.1/nls/compose.dir.pre
@@ -406,0 +406,1 @@
+en_US.UTF-8/Compose:        tlh_GB.UTF-8
=== modified file 'nls/locale.alias.pre'
--- libx11-1.2.1/nls/locale.alias.pre
+++ libx11-1.2.1/nls/locale.alias.pre
@@ -1083,0 +1083,1 @@
+tlh_GB.utf8:                    tlh_GB.UTF-8
=== modified file 'nls/locale.dir.pre'
--- libx11-1.2.1/nls/locale.dir.pre
+++ libx11-1.2.1/nls/locale.dir.pre
@@ -429,0 +429,1 @@
+en_US.UTF-8/XLC_LOCALE:            tlh_GB.UTF-8
 

Secondly, langpack-locales has to change, for two reasons. First, a locale definition has to be added (a locale defines a place – a language plus local conventions like days of the week, phone number formatting and so on). Second, the language needs to be added to the SUPPORTED list in that package, so that language packs are generated from Launchpad translations.
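Assuming the SUPPORTED list uses the usual glibc format of locale name followed by codeset, the Klingon entry would look something like this (the tlh_GB name follows the libx11 diff above; treat this as an illustration rather than the exact patch):

```
tlh_GB.UTF-8 UTF-8
```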

Now, gdm autodetects installed locales, but it turns out that only ‘complete’ locales were being shown, and that on Ubuntu the check was not looking at the language pack directories, but rather at

/usr/share/locale

which langpack-built packages do not install translations into. So it could be a bit random whether a language showed up in gdm. Martin Pitt has kindly turned on the ‘with-incomplete-locales’ configure flag for gdm, and this will permit less completely translated locales to show up (when their langpack is installed – without the langpack nothing will show up).

Syndicated 2010-02-05 23:42:42 from Code happens

LCA 2010 Friday


Tridge on ‘Patent defence for open source projects’. Watch it! Some key elements:

  • A prior art defence is very very hard – ‘non-infringement’ is a much better defence, because you only need to show you don’t do the independent claims.
  • Reading a patent doesn’t really harm us because triple damages is no less fatal than single damages :) Reading patents to avoid them is a good idea.
  • Dealing with patents is very technical. It needs training (and the talk has that training)
  • Patents are hard to read.
  • Claims are often interpreted much more specifically than engineers expect.
  • The best prior art is our own source code, with VCS date records – and the exact date often matters because of the priority date.
  • Invalidation: dead patents are vampires… and when they come back they are harder to kill again. Read the file wrapper – the audit log containing all correspondence between the patent office and the applicant.
  • Patents are not code: judges can vary the meaning.
  • Claim charts are what you use to talk to patent lawyers.
  • Build workarounds *and publish them*. Encourage others to adopt them.

Syndicated 2010-01-24 02:18:35 from Code happens

LCA 2010 Friday keynote/lightning talks


Nathan Torkington on 3 lightning keynotes:

1) Lessons learnt!

‘Technology solves problems’… no it doesn’t, it’s all about the meatsacks!

‘If you live a good life you’ll never have to care about marketing’… steer the meatsacks

‘English is an imperative language for controlling meatsacks.’… Tell the smart meatsacks what you want (english is declarative).

2) Open source in New Zealand:

A bit of a satire :) ‘Sheep calculator’, tattoos as circuit diagrams. The reserve bank apparently has a *working* water-economy simulator. Shades of Terry Pratchett!

3) Predictions – more satire about folk that make predictions – financial analysts, science journalists.

After that, it was lightning talk time. I’ve just grabbed some highlights.

Selena Deckelmann talked about going to Ondo in Nigeria and un-rigging an election:

  1. Run for political office.
  2. Lose – but polls had suggested the reverse result
  3. Don’t give up – protest filed May 14 2007
  4. Use technology – fingerprint scanning – 84814 duplicate fingerprints, 360 exactly the same fingerprints
  5. Patience – 2 years and the courts reversed the election

http://flossmanuals.net – nice friendly manuals in many languages, written at book sprints.

Kate Olliver presented on making an origami penguin.

Mark Osbourne presented ‘Open Source School’ – a school in New Zealand that has gone completely open source, even though the NZ school system pays Microsoft 10 million a year for a country-wide license.

Syndicated 2010-01-21 21:04:59 from Code happens

LCA 2010 Thursday


Jeremy Allison on ‘The elephant in the room – free software and Microsoft’. While he works at Google, this talk was ‘off the leash’ – not about Google :) . As usual – grab the video :) We should care about Microsoft because Microsoft’s business model depends on a monopoly [the desktop]. Microsoft are very interested in ‘Open Source’ – Apache, MIT, BSD licensed software – but the GPL is intolerable. Jeremy models Microsoft as a collection of warring tribes that hate each other… e.g. Word vs Excel.

The first attack was on protocols – make the protocols more complex and sophisticated. MS have done this on Kerberos, DCE/RPC, HTTP, and higher up the stack via MSIE rendering modes, ActiveX plugins, Silverlight…  The EU case was brought about this in the ‘Workgroup Server Market’. MS were fined 1 Billion Euros and forced to document their proprietary protocols.

OOXML showed up rampant corruption in the ISO standards process – but it got through, even though it was a battle against nearly everyone! On the good side it resulted in an investigation into MS dominance in file formats -> MS implemented ODF and have had to document their old formats.

MS have an ongoing battle in the world wide web – IE / Firefox, ajax applications/ silverlight.

All of these things are long-term failures for MS… so what next?… Patents :( . Patents are GPL-incompatible, but fine with BSD/MIT. The TomTom case is the first direct attack using MS’s patent portfolio. This undermines all the outreach work done by the MS open source team – which Jeremy tells us are true believers in open source, trying to change MS from the inside. Look for MS pushing RAND-patented standards: such things lock us out.

Netbooks are identified as a key point for MS to fight on – lose that and the desktop position is massively weakened.

We should:

  • Keep creating free software and content *under a copyleft license*.
  • Keep pressure on Governments and organisations to adopt open standards and investigate monopolies.
  • Lobby against software patents.
  • Search for prior art on relevant patents and destroy them.
  • Working for a corporation is a moral choice: respectfully call out MS employees.

Jonathan Oxer spoke about the Google Lunar X-Prize and the lunarnumbat.org project – it needs contributors: software and hardware hackers, arduino/beagleboard/[M]JPEG2000 geeks, code testers and reviewers, web coding, documentation, math heads & RF hackers. Sounds like fun… now to find time!

Paul McKenney did another RCU talk – and as always it was interesting… Optimisation Gone Bad (RCU in Linux 1993-2008). Linux 2.6 -rt patch made RCU much much much more complex with atomic operations, memory barriers, frequent cache misses, and since then it was slowly being whittled back, but there is now a new simpler RCU based around the concept of doing the accounting during context switches & tracking running tasks.

Syndicated 2010-01-20 23:28:41 from Code happens

LCA 2010 Thursday Keynote – Glyn Moody


Glyn Moody – Hackers at the end of the world. Rebel Code is now 10 years old… 50+ interviews over a year – and could be considered an archaeology now :) I probably haven’t done the keynote justice – it was excellent but high density – you should watch it online ;)

Glyn talks about open access – various examples like the Public Library of Science (and how the scientific magazine business made 30%-40% profit margins). The Human Genome Project & the ‘Bermuda Principles’: public submission of annotated sequences. In 2000 Celera were going to patent the entire human genome. Jim Kent spent 3 weeks writing a program to join together the sequenced fragments, running it on a cluster of 100 800MHz Pentium PCs. The result was put into the public domain just before Celera completed their processing – and by that action Celera were prevented from patenting *us*.

Openness as a concept is increasing within the scientific community – open access to results, open data, open science (the full process). An interesting aspect to it is ‘open notebook science’ – daily writeups, not peer reviewed: ‘release early, release often’ for science.

Amazingly, Project Gutenberg started in 1971!

Glyn ties together the scientific culture (all science is open to some degree) and artistic culture (artists share and build on / reference each other’s work) by talking about a lag between the free software and free content worlds. In 1999 Larry Lessig set up ‘Copyright’s Commons’, built around an idea of ‘counter-copyright’ – copyleft for non-code. This didn’t really fly, and Creative Commons was set up 2 years later.

Wikipedia and newer sharing concepts like twitter/facebook etc are discussed. But… what about the real world: transparency and governments, or companies? They are opening up.

However, data release != control release. And there are challenges we all need to face:

  • The Global Financial Crisis: “my gain is your loss”. A very opaque system.
  • The Global Environmental Crisis: “my gain is our loss”.

Glyn argues we need a different approach to economic governance: the commons. 2009 Nobel laureate for Economic Sciences – Elinor Ostrom – work on commons and their management via user associations… which is what we do in open source!

Awesome!

Syndicated 2010-01-20 21:18:11 from Code happens

LCA 2010 Wednesday


Pandora-build. I was there for support – I’ve contributed patches. Pandora is a set of additional glue and layers to improve autotools and make it easier to work with things like gettext and gnulib, turn on better build flags and so forth. If you’re using autotools it’s well worth watching this talk – or hop on #drizzle and chat to mtaylor :)

The open source database survey talk from Selena was really interesting – a useful way of categorising databases, and a list of which DBs turned up in which category. E.g. high availability, community development model etc. Key takeaway: there is no one true DB.

I gave my subunit talk in the early afternoon, reasonably well received I think, though I wish I had been less sick last week: I would have loved to have made the talk more polished.

Ceph seems to be coming along gangbusters. I really think it would be great to use for our bzr hosting backend. 0.19 will stabilise the disk format! However we might not be willing to risk btrfs yet :(

Next up, the worst inventions ever.. catch it live if you can!

Syndicated 2010-01-20 03:30:01 from Code happens

LCA2010 Wednesday Keynote


Another must-grab-the-video talk: Mako’s keynote. Antifeatures – principles and pragmatism do come together. The principled side – RMS & the FSF – it is important to control one’s technology because it’s important to control one’s life. The pragmatic side – quality, no vendor lock-in etc. It’s a false dichotomy: freedom imparts pragmatic benefits even though it doesn’t intrinsically impart quality or good design: 95% of projects have at most 5 contributors, the median number of contributors is 1, and such small collaborations are no different from closed source ones.

Definition of antifeatures – functionality built to make a product do something one does not want it to do. Great example of phone books: spammers pay for access to the lists, and thus we have to pay *not to be listed*, even though it is actually harder to collect and print our numbers in the first place. Mako makes a lovely analogy to the mafia there. Similarly with Sony charging 50 dollars not to install trialware on Windows laptops in the past.

Cameras: Canon cameras disabled RAW saving…. CHDK, an open source addon for the camera outputs RAW again. Panasonic are locking down their cameras to reject third party batteries.

The tivo is an example of how focusing on licensing can miss the big picture: free stack, but still locked into a subscription to get an ongoing revenue stream.

Dongles! Mako claimed there wasn’t a facebook appreciation group for dongles… there is.

Github: customers pay for the billing model – lots of code exists there just to figure out how many projects are in a repo, so that they can charge on that basis.

DRM is the ‘mother of all antifeatures’ – 10K people writing DRM code that no users want!

Syndicated 2010-01-19 21:02:44 from Code happens

LCA 2010 Tuesday


Gabriella Coleman’s keynote was really good; grab it from the videos once they come online.

WETA run Ubuntu for their render farm: 3700 machines, 35000 cores, 7kw per ‘cold’ rack and 22kw per ‘hot’ rack. (Hot racks are rendering, cold racks are storage). Wow. Another talk well worth watching if you are at all interested in the issues related to running large numbers of very active machines in a small space.

And a classic thing from the samba4 talk at the start of the afternoon: MS AD domain controllers do no validation of updates from other domain controllers: classic crunchy surface security. (Discovered by samba4 borking AD while testing r/w replica mode).

Blu-ray on Linux is getting there; however one sad thing is that the Blu-ray standard has no requirement that vendors make players able to play un-encrypted content – and there are some hints that in fact licences may require them to not play un-encrypted content.

Peter Chubb’s talk on Articulate was excellent for music geeks: MIDI that sounds like music, from LilyPond.

Ben Balbo talked about ‘Roll your own Dropbox’. Ben works at a multimedia agency, but the staff work locally and don’t use the file server… they use instant messenger to send files around! They tried using Subversion… too hard. Dropbox looked good, but at three to seven hundred a month it was too pricey given an existing 1.4TB of spare capacity.

He then considered svn + cron, but deleted directories cause havoc & something automatic was wanted… so git + cron instead. A key thing in doing this was having a work area with absolutely no metadata. Conflicts are dealt with by renaming to filename.conflict.DATESTAMP.HOSTNAME.origextension

It doesn’t trigger off inotify, there’s no status bar widget, it’s only single-user etc at the moment, but it was written to meet the office’s needs so is sufficient. Interestingly he hadn’t looked at e.g. iFolder.
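The cron half of that scheme can be sketched as a small shell function. This is my reconstruction from the talk notes, not Ben's actual script – the directory handling and commit message are assumptions:

```shell
# Hypothetical cron-driven git sync: commit whatever changed in the
# shared work area, merge in other machines' commits, then publish.
sync_shared_dir() {
    dir="$1"
    (
        cd "$dir" || exit 1
        git add -A                                # stage adds, edits and deletes
        git commit -q -m "auto-sync $(hostname) $(date -u +%Y-%m-%dT%H:%M:%SZ)" \
            || true                               # nothing to commit is fine
        git pull --rebase -q 2>/dev/null || true  # pick up other machines' work
        git push -q 2>/dev/null || true           # publish ours
    )
}
# crontab entry (illustrative): */5 * * * * /usr/local/bin/sync-shared
```

Running it from cron on each machine gives eventually-consistent folders; the conflict-rename scheme described above would slot in around the pull step.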

Syndicated 2010-01-19 04:39:05 from Code happens
