Speeding up cross-compiling with ccache and distcc on Debian
The conventional way of doing embedded development is to cross-compile everything then copy it onto the target,
but working natively allows you to use "normal" tools and workflows.
We want to issue commands directly to a shell on the development board or phone prototype, and speed up the compilation step by distributing it to a faster machine such as your workstation.
This isn't the usual way to do things, but I like working this way, and here's how to make it work faster.
This article explains how to configure a Debian PC host and a Debian target system so that development done on the target invokes the cross-compiler on the host. The advantage offered by this approach is a speed-up of compile times. Note that this does not speed up other aspects of building, such as source configuration (which can be slow for packages using GNU autotools), linking or installation.
We assume that a full Debian system is available for development on the target: packages can be built natively using gcc and a full toolchain (binutils, ld etc.), and
tools such as automake, autoconf, libtool, version control systems etc. are available.
The setup we work with uses Debian on both the host PC and the target.
The examples use Debian sh4 on the target, with the sh4-linux-gnu-gcc
cross compiler installed on the build host. For other target architectures, replace every instance of sh4-linux-gnu- with the appropriate architecture prefix, e.g. arm-linux-gnueabi-.
In this article, commands executed natively on the target device use the prompt target$ (or target# when run as root), and commands executed on the x86 build host use the prompt host$ (or host# when run as root).
The first step is to ensure you can build software natively on the target. For GCC:
target$ gcc hello.c -o hello
and for autotools projects:
target$ ./configure
target$ make
ccache
Next, install ccache:
target# apt-get install ccache
ccache keeps a cache of compiled object files, so that the same compilation does not need to be repeated. This cache lives outside your source tree, so it persists across invocations of 'make clean'. It compares the pre-processed source files, so a source file is recompiled only if it or any of its included headers has changed. The usual way to use ccache is to set your C compiler to "ccache gcc":
target$ ccache gcc hello.c -o hello
and for autotools projects:
target$ CC="ccache gcc" ./configure
target$ make
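The caching behaviour described above can be illustrated with a toy sketch. This is not how ccache is implemented: cksum stands in for ccache's own hash, and the cache directory and function names are hypothetical. It only shows the idea that the key is derived from the pre-processed source, so identical input is a hit and any change is a miss.

```shell
# Toy illustration only: key each "compilation" by a checksum of its input.
# TOY_CACHE is a hypothetical directory, unrelated to ccache's real cache.
TOY_CACHE=${TOY_CACHE:-/tmp/toy-ccache.$$}
mkdir -p "$TOY_CACHE"

cache_key() {
    # Reads pre-processed source on stdin, prints a key derived from its content.
    cksum | cut -d' ' -f1
}

cached_compile() {
    # $1 = pre-processed source text. Prints "hit" or "miss".
    key=$(printf '%s' "$1" | cache_key)
    if [ -e "$TOY_CACHE/$key" ]; then
        echo hit
    else
        : > "$TOY_CACHE/$key"   # a real cache would store the object file here
        echo miss
    fi
}
```

Calling cached_compile twice with the same text reports a miss and then a hit; changing a single character in the text produces a miss again.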
Debian also sets things up so that if you put /usr/lib/ccache ahead of /usr/bin in your PATH, it will get used for native builds whenever gcc is invoked. That is useful to set up, but not necessary for this setup with distcc.
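For reference, one way to arrange the PATH described above is sketched here. The variable name ccache_links is made up for the example; the directory is the one named in the text.

```shell
# Put Debian's ccache symlink directory first in PATH, unless it already is.
ccache_links=/usr/lib/ccache
case "$PATH" in
    "$ccache_links":*) ;;                  # already at the front, nothing to do
    *) PATH="$ccache_links:$PATH" ;;
esac
export PATH
```

On a Debian system with ccache installed, "command -v gcc" should then report a path under /usr/lib/ccache rather than /usr/bin.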
An aside about compiler naming
Before we move on to cross compiling, it's important to realize that the native compiler is also available with its full architecture prefix:
target$ ls -l /usr/bin/sh4-linux-gnu-gcc
lrwxrwxrwx 1 root root 7 Mar 17 01:45 /usr/bin/sh4-linux-gnu-gcc -> gcc-4.4
The binary called sh4-linux-gnu-gcc does the same thing on both the host and target: you can simply think of it as a program that takes in a C file and produces an sh4 binary:
            +-------------------+
C source -> | sh4-linux-gnu-gcc | -> sh4 binary
            +-------------------+
The distinction between "native" and "cross-" compiling is then just a matter of what machine you are running this compiler program on. If you run sh4-linux-gnu-gcc
on an x86 machine, you are cross-compiling, but if you run sh4-linux-gnu-gcc on an sh4 machine then you are just compiling. Of course the compiler binaries are
different; the point is that a shell script which calls the compiler by its full name would work without modification on either machine.
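As a small sketch of such a script, the helper below always names the compiler by its full prefix. TARGET_PREFIX and compile_cmd are hypothetical names chosen for this example; the function only prints the command, so it can be inspected on either machine before anything is run.

```shell
#!/bin/sh
# Hypothetical build helper: always refers to the compiler by its full name.
TARGET_PREFIX=${TARGET_PREFIX:-sh4-linux-gnu-}

compile_cmd() {
    # $1 = C source file; prints the command that compiles it to an object file.
    printf '%sgcc -c %s -o %s\n' "$TARGET_PREFIX" "$1" "${1%.c}.o"
}

compile_cmd hello.c
```

To actually run the compile, use eval "$(compile_cmd hello.c)". For another architecture, set TARGET_PREFIX (for example to arm-linux-gnueabi-) before calling the function.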
distcc
distcc allows you to use a compiler running on a different, faster machine. This involves running a server (distccd) there, and it is far easier to set up than it would seem.
First, ensure that we can cross-compile on the build host:
host$ sh4-linux-gnu-gcc hello.c -o hello
host$ file hello
hello: ELF 32-bit LSB executable, Renesas SH, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped
Next, we install distcc on the build host:
host# apt-get install distcc
To activate the server and tell it what clients to allow, edit /etc/default/distcc:
STARTDISTCC="true"
ALLOWEDNETS="127.0.0.1 10.0.0.0/16"
and restart it:
host# /etc/init.d/distcc restart
You can check that it is running:
host# netstat -pant | grep distcc
tcp 0 0 10.0.0.1:3632 0.0.0.0:* LISTEN 16142/distccd
So that we can ensure that compilation is running on the host, watch this log file in a separate window:
host# tail -f /var/log/distccd.log
Then, on the client (i.e. the target system), we also install distcc:
target# apt-get install distcc
We do not need to modify the distcc configuration on the target, as it will not be running the server; Debian's defaults are fine.
However, we do need to set an environment variable to specify which machine or machines to compile on:
target$ export DISTCC_HOSTS='host'
You run distcc in a similar manner to ccache, by simply setting your C compiler. Note that we are only distributing compilation, not
linking, so we just run the compilation step:
target$ distcc sh4-linux-gnu-gcc -c hello.c
This should turn up in the host's distcc logs:
host# tail -f /var/log/distccd.log
distccd[16390] (dcc_job_summary) client: 10.0.1.103:45983 COMPILE_OK exit:0 sig:0 core:0 ret:0 time:46ms sh4-linux-gnu-gcc hello.c
And back on the target, we have the hello.o file which was generated by the sh4-linux-gnu-gcc cross-compiler on the build host:
target$ ls -l *.o
total 16
-rw-r--r-- 1 conrad conrad 884 Jun 11 07:28 hello.o
target$ file hello.o
hello.o: ELF 32-bit LSB relocatable, Renesas SH, version 1 MathCoPro/FPU/MAU Required (SYSV), not stripped
The C file was transferred over the network to the host, where distccd invoked the cross-compiler and then sent the results back to the target. The end result is the same as
if sh4-linux-gnu-gcc had been run directly on the target, but we avoided using the slower CPU of the target system.
To take full advantage of distcc, you can run distccd on multiple build hosts, and specify all their names in the DISTCC_HOSTS environment variable on the target.
Then use e.g. "make -j 10" to run multiple compiles in parallel, each of which is then farmed out to one of the build hosts.
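A minimal sketch of building such a host list follows. The host names buildhost1 and buildhost2 and the function make_distcc_hosts are placeholders for this example. The /LIMIT suffix is distcc's syntax for capping the number of jobs sent to a host.

```shell
make_distcc_hosts() {
    # $1 = per-host job limit, remaining arguments = build host names.
    limit=$1; shift
    hosts=
    for h in "$@"; do
        hosts="${hosts:+$hosts }$h/$limit"   # space-separated HOST/LIMIT entries
    done
    printf '%s\n' "$hosts"
}

# Two hypothetical build hosts taking five jobs each, to match "make -j 10".
DISTCC_HOSTS=$(make_distcc_hosts 5 buildhost1 buildhost2)
export DISTCC_HOSTS
echo "$DISTCC_HOSTS"
```

Choosing limits that add up to the -j value is only a starting point; the right numbers depend on how many CPUs each build host has.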
Combining ccache and distcc
You can quite simply put these two tools together, by setting CCACHE_PREFIX to "distcc" before calling ccache:
target$ export CCACHE_PREFIX="distcc"
target$ ccache sh4-linux-gnu-gcc -c hello.c
(Thanks to Joel Rosdahl for the correction).
The first time we run this the code is cross-compiled on the build host and sent back to the target, and ccache keeps track of that. The second time we run this, ccache
notices that it already has a stored copy of the output hello.o, and decides to use that rather than calling the compiler. (From ccache's point of view, the compiler is
"distcc sh4-linux-gnu-gcc").
For autotools projects, you can simply do the following before calling ./configure:
target$ export CCACHE_PREFIX="distcc"
target$ export CC="ccache sh4-linux-gnu-gcc"
After this, the ./configure step writes Makefiles that compile with ccache, so the rest of your build (e.g. make -j 10) just works as normal, without any new settings or other changes to your workflow.
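If you switch between target architectures, the two exports can be wrapped in a small shell function. This is only a convenience sketch: the function name use_distcc_ccache is invented here, and it sets exactly the two variables shown above.

```shell
use_distcc_ccache() {
    # $1 = compiler prefix (defaults to the sh4 prefix used in this article).
    prefix=${1:-sh4-linux-gnu-}
    CCACHE_PREFIX="distcc"
    CC="ccache ${prefix}gcc"
    export CCACHE_PREFIX CC
}

use_distcc_ccache
echo "CC=$CC CCACHE_PREFIX=$CCACHE_PREFIX"
```

Run the function in the shell from which you will call ./configure and make, so that both inherit the exported variables.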
For more discussion of combining distcc with ccache, see the distcc(1) man page.
Summary
By combining both ccache and distcc we can:
- avoid redundant compilations, and
- distribute required compilations to a faster build host.
The result is faster build times, which speeds up your development cycle and allows you to work more efficiently on the target system itself.
Syndicated 2010-06-15 00:00:00 (Updated 2010-06-17 03:56:52) from Conrad Parker