Heard around the office: "ClearCase is so good, I encourage all our competitors to buy it." (Oops, I guess they did! :-)
I started writing a macrobenchmark/test for distcc. Inspired by GAR and GARNOME, it downloads, configures, and tries to build various large packages, timing both the local and the distributed build. It complements the test suite, which checks correctness on small, interesting cases, by feeding through a large number of valid, diverse cases.
It reveals that a build distributed across 3 machines is typically 2.0 to 2.9 times faster than a local one. For any given project the results are quite reproducible. Presumably the slower projects either have a lot of non-parallelizable or non-distributable work, or something about their Makefiles is not handled well.
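The timing part of such a benchmark can be sketched in a few lines of Python. This is only an illustration, not the actual harness: the function names and the build commands in the comment are assumptions, and the download and configure steps are left out.

```python
import subprocess
import time


def time_command(argv, cwd=None):
    """Run argv to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(argv, cwd=cwd, check=True)  # raises if the build fails
    return time.perf_counter() - start


def compare_builds(local_argv, distributed_argv, clean_argv, cwd=None):
    """Time a local and a distributed build of the same source tree.

    The tree is cleaned before each build so both start from the same state.
    """
    subprocess.run(clean_argv, cwd=cwd, check=True)
    local = time_command(local_argv, cwd)
    subprocess.run(clean_argv, cwd=cwd, check=True)
    distributed = time_command(distributed_argv, cwd)
    return {
        "local": local,
        "distributed": distributed,
        "speedup": local / distributed,
    }


# Hypothetical usage; the directory, the -j value and the compiler are
# assumptions for illustration only:
#
#   compare_builds(["make"],
#                  ["make", "-j10", "CC=distcc gcc"],
#                  ["make", "clean"],
#                  cwd="some-package")
```

Running each package several times and comparing the speedup figures is one simple way to check how reproducible the results are.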
Another way to look at this is that distcc reaches roughly 67% to 97% of the theoretical limit of 3.0x (2.0/3.0 and 2.9/3.0). Parallelization typically incurs some cost, so 90% or better is not bad. I wonder how much of the loss is unavoidable? distcc itself does not use many cycles, but the scheduler that decides where to compile each file is not optimal.
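The fraction of the theoretical limit is just the measured speedup divided by the number of machines. A tiny sketch, assuming the ideal case is a linear speedup over identical machines (the function name is made up for illustration):

```python
def parallel_efficiency(speedup, machines):
    """Fraction of the ideal linear speedup that was actually achieved."""
    if machines <= 0:
        raise ValueError("machines must be positive")
    return speedup / machines


# The two ends of the measured range, against a 3-machine ideal of 3.0x.
for measured in (2.0, 2.9):
    share = parallel_efficiency(measured, 3)
    print(f"{measured:.1f}x on 3 machines is {share:.0%} of the ideal")
```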
Python is excellent for this -- so easy to write very concise and clear tests.
Testing is so fun once you get into the swing of it. There's really a lot of creativity in trying to work out how to exercise a particular aspect, either by improving the program's testability or by writing a harness or driver.
I'm reading an ACM anthology on automated testing. I forget the name. More on this later.
