Older blog entries for lmb (starting at number 84)

  • This week I spent most days in Nuremberg to catch up with colleagues and friends, have lunch, have dinner, and attend a few meetings. It's always good to meet in person from time to time.
  • After having been reminded, at length, that
    1. I always had a misunderstanding of how one piece of code worked (and so couldn't possibly be observing what I've been seeing, and of course never having been told that it was by design), not having written it myself and not understanding it as well as the author, and
    2. how I'm very certainly confused about how another bit is designed (so the issue I was seeing for sure must be due to my testing and not with the code), not having written nor designed it myself etcetera,
    the foot is inserted into the mouth twice within the day. I'd be a better man than I am if I weren't slightly smug now.
  • What else? More heartbeat testing.
  • I needed to optimize my Xen network setup, but it confused me quite a bit. The network-route and vif-route stuff just didn't seem to work for me. With this howto, I found an easy-to-follow explanation of how to achieve the network setup I wanted. Thanks, Arjen!
  • Quite a bit of mail.
17 Oct 2007 (updated 17 Oct 2007 at 12:44 UTC) »
  • (Originally written in German, because the referenced article is in German.)
  • Krisen-Kommunikation per E-Mail ("Crisis communication via e-mail") is an article well worth reading. I do not agree with all of its points, and the hiding of personal responsibility (not: blame) in particular bothers me, but questioning and reflecting on the communication strategies of our most important medium can yield valuable insights.
  • I would very highly recommend Hostage at the Table as a starting point for further reading on the topic of communication in times of crisis; it is not e-mail centric, but contains immensely good advice.
17 Oct 2007 (updated 17 Oct 2007 at 10:27 UTC) »
  • It appears my joy about the excellent test results I've been seeing with CTS on our current release was premature. It held up very well up to 5 nodes (which, for an HA fail-over cluster, is quite high), but at 7 nodes, it eventually explodes. Bother. The bug seems to be in the part of the code maintained by Alan ...
  • Still no replies to my feedback regarding the release plan from Alan, either.
  • Andrew tells me that the work on the openAIS port is progressing nicely, which makes me very happy. I look forward to the cluster infrastructure we run on top of being actively maintained.
  • The testing of the upcoming update for heartbeat, with the intention of shipping it as soon as possible as a maintenance release for SLES10, is coming along rather nicely: the bugs which have surfaced so far have mostly been the test harness looking for log messages which have since changed.
  • The upstream discussions with the project lead on the release cycle are a bit tough, because we seem to get only one reply every few days from him.
  • The new proposal by Alan also calls for timely feedback on assigned bugs. Hrm.

Today, I would simply like to remind everyone how easy it is to get daily builds of the Linux HA - heartbeat project. I provide them using the openSUSE Build Service for easy download: Try them here!

It goes without saying that you should not do that on your production cluster. However, it is a good idea to give these versions the occasional spin on your staging or test cluster.

If you do not yet have a test or staging cluster, you really should set one up. Even though the Linux HA project and your distributor (say, Novell for SUSE LINUX Enterprise Server 10) are rather careful about release testing, you should make extra sure before rolling something out into your High Availability setup. High availability is more than two thirds process: the software and hardware are important and required, but they will be void at root's command ...

Nowadays, nobody has an excuse for not having a test or lab cluster: with virtualization (Xen, VMware, QEMU, lguest ...), one of those idle machines everybody has around becomes an instant cluster.

What's in it for you:

  1. Casually testing daily builds: a few hours.
  2. Reporting a found issue: a few minutes.
  3. Ensuring the coming release is perfect on your cluster: priceless.

Alan Robertson is sharing some high-quality advice in his new Managing Computers with Automation blog.

  • Supposedly the Linux HA project is going to get a release plan and release team going forward, for more timely release cycles. The proposal was posted for discussion 3 days ago. I liked it, but had some rather fundamental questions about it. No feedback yet from Alan. Doh.
  • Some more words of wisdom: Why the Linux HA v2 configuration sucks

    First, we know about that. Novell has contributed substantial, freely available documentation on the web, and we will work to continue and enhance it, with the goal of incorporating it into the upstream project. And of course, if you think the public Linux-HA.org website sucks^Wleaves a lot to be desired, it is built automatically from this wiki here, so we look forward to your cleanups and contributions.

    However, the main gripe is that Linux HA v2 is complicated to configure. And that gripe is well founded. You see, originally, we had planned that there would be good command-line and GUI tools which would assist the admin. But, due to reorganizations in one of the larger companies contributing to the project, the team working on those was effectively pulled about one to one-and-a-half years into the project, just when they were starting to get really effective.

    The XML-based "configuration" format is, in fact, a declarative language fed to the Policy Engine (a solver). It describes objects (resources and nodes), their dependencies, and the current state of the cluster. If it feels to users as if they are programming, it is because they are. It is not more complex than a scripting language, so any admin should be able to learn it quickly, but it is not something picked up in an hour. We will continue to enhance it going forward of course, but it was never meant to be easy: we had expected smart front-ends, so the back-end could focus on its task.

    Andrew has actually started an Eclipse-based GUI (called Pacemaker). I am sure that he would appreciate help, because I keep him busy on the back-end ;-) We would also welcome someone to work on a consistent CLI shell.

    So, if you think the Linux HA v2 software is hard to use, we are looking forward to receiving patches from you!
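
The point above, that the configuration is really a declarative language fed to a solver, can be illustrated with a toy sketch. This is not the Policy Engine and it does not use the Linux HA XML schema; the resource names and the `start_order` helper are hypothetical, chosen only to show how declared objects and dependencies are turned into an ordering by a solver rather than scripted step by step.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical declarations: each resource maps to the set of resources
# it depends on. This only mimics the declarative idea; it is NOT the
# real Linux HA configuration format.
declared = {
    "drbd": set(),
    "ip_address": set(),
    "filesystem": {"drbd"},
    "webserver": {"filesystem", "ip_address"},
}


def start_order(resources):
    """Return one valid start order that respects every declared dependency."""
    return list(TopologicalSorter(resources).static_order())


order = start_order(declared)
print(order)
```

The admin states what depends on what, and the solver works out the sequence. A real cluster solver additionally has to weigh node placement and the current cluster state, which this sketch leaves out entirely.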
