30 Apr 2008 JoeNotCharles   » (Journeyer)

“Troubleshooting icecream” is not a helpful search


I’ve been a happy distcc user for many years - every time I got a new computer, I tended to have the old one still sitting around with nothing much to do except be part of a compile farm. Really, what else are you going to do with the old thing? But last time I upgraded, it was because the old computer completely died, so for a year or so I’ve been laptop-only and distcc-free.

However! My new job sent me a company laptop, and the combination of two computers in the house plus big projects to compile from scratch means it’s time to get distcc back in shape. Except the new laptop came with SuSE (which I’m not a fan of, so far), and SuSE’s package manager only contained some new thing called Icecream.

Icecream has an unfortunate naming scheme for at least three reasons: if I need help, do I Google for “icecream” or “icecc”, the actual binary name? (This is compounded by the fact that it’s so new there’s very little documentation out there to find, and it’s pretty much overwhelmed by hits about actual ice cream. Mmmm.) Ubuntu has a completely different package called icecream - name collision! And finally, they had the gall to name their work scheduler “scheduler”, which tells you nothing about what it’s for - Ubuntu sensibly renamed it to “icecc-scheduler”.

Apart from that, though, this is a pretty nice improvement. The first obvious difference is that there’s no more manually keeping DISTCC_HOSTS in sync - thank God! One server runs the aforementioned “scheduler” daemon, which keeps track of which hosts are available and how loaded they are, and every host that wants to take on jobs runs the “iceccd” daemon, which - in theory - finds the scheduler by multicast. Even if multicast fails (see below) you just have to feed them the one scheduler hostname manually, instead of the entire list.

The other big improvement is that icecc has the ability to package up a toolchain in a chroot, and transfer the whole thing to a compile host. That means you don’t have to worry about keeping your tools exactly in sync on each host any more (although you can avoid overhead by doing so). I haven’t tried it yet, but if it works it will save me a lot of headache, since I have no intention of taking Ubuntu off my personal laptop and I can see things getting out of sync in a hurry. It can even be used for cross-compiles.

And now the bad: like I said, it’s new and there are very few docs. (EDIT: apparently it’s not so new, but it’s still harder to find information for it than for distcc. I guess I should have said it has less mindshare.) The page I linked to is about it, really. So as a public service, here are some troubleshooting tips, in the order I figured them out:

  • The main symptom of icecc not working is that all compiles go to localhost only. There are two main causes: iceccd on other hosts can’t find the scheduler, so they never register, or they register but the scheduler decides not to use them.
  • To check that a host is registering, run icecc with the “-vvv” (very verbose) flag, and tail -f the log file (/var/log/iceccd on SuSE, /var/log/iceccd.log on Ubuntu). If you see “scheduler not yet found.” more than once or twice, this is the problem - you’ll have to pass the scheduler hostname manually with the -s flag. Or just pass the hostname manually from the start, because in my experience the autodetection never works.
  • If you see “connection failed”, it’s found the host but couldn’t connect for some reason. Check your firewall settings.
  • If all the hosts are connected (you’ll be able to see them in the detailed hosts view of iccmon) but the schedular still sends all jobs to localhost, check the scheduler log (/var/log/icecc-scheduler.log on Ubuntu; I can’t remember the name on SuSE, because I renamed mine to match Ubuntu’s and forgot to write down the original). Again, use “-vvv”. You’ll see “login protocol version: ” lines, again confirming the hosts are registered, followed by lots of “handle_local_job” lines, which confirm that the scheduler’s not sending anything out.
  • Make sure all the hosts have the same network name. Ubuntu’s default parameters set the network name to ICECREAM; SuSE’s left it blank. I didn’t trust leaving it blank so I added a -n parameter to the command line on SuSE. I don’t know if it was necessary, because my problem turned out to be the next one:
  • Make sure all hosts use the same protocol version. Turns out the Ubuntu package I installed was a few versions behind the SuSE version, so it used protocol version 27 while SuSE used version 29. (icecc developers: if your protocol isn’t backwards compatible, this would be a WONDERFUL thing to give an error message about!) I ended up compiling from source on both machines to make sure I had the same version - fortunately, it’s small.

Whew. Yes, it was a bit of a pain in the ass to get working - I hope cross-compiling works, or I’ll have to ask myself whether it would’ve been easier to just track down distcc for SuSE in the first place.

Syndicated 2008-04-25 04:20:14 from I Am Not Charles

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!