Debunking ccache myths redux

Since my original post from two years ago has not yet reached all the users, nor some of the developers, I would like to reiterate that you should not enable ccache unconditionally.

It seems that our own (Gentoo’s) documentation still claims that using ccache makes builds “5 to 10 times faster”. I’ll call this statement what it is: bullshit. Rebuilding the very same package might see that kind of speedup, but the normal emerge process of a standard Gentoo user will not. If anything, using ccache will slow your build down, add further failure cases, and make errors harder to identify.

Now, since the approach I took last time might not have been clear enough, let me try a different one and describe the steps ccache goes through when you call it:

  • it has to parse the command line to make sure you’re calling it for a single compile; it won’t do any good if you’re using it to link, or to build multiple source files at once (you can, especially if you use -fwhole-program, but that’s for another day to write about), so in those cases the command is passed through to the compiler itself;
  • once it knows that it’s doing a single compile, it changes the call to the compiler so that it simply preprocesses the file, and stores the result in a temporary area;
  • now it’s time to hash the data with MD4 (the parent of MD5), which, as the man page suggests, is a strong hash; there are good reasons for it to be strong, but it also means that hashing the content takes some time; we’re not talking about the source files themselves, which are usually very small and thus quick to hash, but rather about the preprocessed file, which includes all the headers used… as a quick example, on my system just including eight common header files produces a 120KB output (with -O2 and _FORTIFY_SOURCE… it goes down to 93KB if -O0 is used); to that, add the extra information that ccache has to save (check the man page for those);
  • now it has to search the filesystem, within its cache directory, for a file with the same MD4 hash; if there is one, it gets copied (or, experimentally, hardlinked, but let’s not go there for now); otherwise the preprocessed file is compiled and the result is copied into the cache instead; in either case, the object file gets copied from one place to the other.
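
To make the steps above concrete, here is a minimal sketch of the lookup logic in Python. It is not ccache’s actual code: cached_compile, compile_fn and extra_info are hypothetical names of mine, the preprocessor output is simply passed in as bytes, and SHA-256 stands in for MD4 because Python’s hashlib does not guarantee that MD4 is available.

```python
import hashlib
import shutil
from pathlib import Path


def cached_compile(preprocessed: bytes, extra_info: bytes, cache_dir: Path,
                   output: Path, compile_fn) -> bool:
    """Return True on a cache hit, False on a miss.

    `preprocessed` stands in for the preprocessor output, `extra_info` for
    whatever else goes into the key (compiler, flags, ...), and `compile_fn`
    is a hypothetical callable turning preprocessed source into object code.
    """
    # Hashing step: the whole preprocessed file is hashed, not just the
    # small source file. SHA-256 is an assumption standing in for MD4.
    digest = hashlib.sha256(preprocessed + b"\0" + extra_info).hexdigest()

    # Lookup step: is there a cached object under this hash?
    cached = cache_dir / digest
    hit = cached.exists()

    if not hit:
        # Miss: compile for real and store the result in the cache.
        cached.write_bytes(compile_fn(preprocessed))

    # Hit or miss, the object file is copied out of the cache.
    shutil.copyfile(cached, output)
    return hit
```

Note that the hashing and the final copy sit outside the if: they are paid on every call, which is the point the next paragraph builds on. Changing anything that ends up in extra_info, or in any header pulled in by the preprocessor, changes the digest and turns the call into a miss.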

Now, we can identify three main time-consuming operations: preprocessing, hashing and copying. All of them are executed whether the lookup is a hit or a miss; on a miss you add the actual build on top of that. How do they fare in terms of the resources they use? Hashing, just like compiling, is a CPU-intensive operation; preprocessing is mixed (you have to read the header files from around the disk); copying is I/O-intensive. Given that nowadays most systems have multiple CPUs and find themselves slowing down on I/O (the tinderbox taught me that the hard way), copying files around is going to slow down the build quite a bit. Even more so when the hit-to-miss ratio is high. The tinderbox, when rebuilding the same failing packages over and over again (before I started masking the packages that failed at any given time), had a 40% hit-to-miss ratio and was slowed down by using ccache.
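
The trade-off can be put into a back-of-the-envelope model. This is a sketch under my own simplifying assumptions: the function names are made up, every file costs the same, hit_ratio is the fraction of lookups that hit (hits over hits plus misses), and the timings you feed it are yours to measure; none come from the tinderbox.

```python
def time_without_ccache(t_preprocess: float, t_compile: float) -> float:
    """Per-file cost of a plain compile: preprocess, then compile."""
    return t_preprocess + t_compile


def time_with_ccache(t_preprocess: float, t_hash: float, t_copy: float,
                     t_compile: float, hit_ratio: float) -> float:
    """Expected per-file cost with ccache in front of the compiler.

    Preprocessing, hashing and copying are always paid; the compile of the
    preprocessed file is only paid on a miss.
    """
    return t_preprocess + t_hash + t_copy + (1.0 - hit_ratio) * t_compile


def break_even_hit_ratio(t_hash: float, t_copy: float,
                         t_compile: float) -> float:
    """Hit ratio below which ccache is a net loss in this model."""
    return (t_hash + t_copy) / t_compile
```

With purely illustrative numbers, say hashing plus copying costing 0.3 seconds per file against a 0.5 second compile, the break-even sits at a 60% hit ratio, and anything below that makes the build slower than not using ccache at all. The more I/O-bound the machine, the larger t_copy gets and the higher that threshold climbs.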

Now, as I already wrote, there is no reason to expect that the exact same code is going to be rebuilt so often on a normal Gentoo system… even if minor updates to the same package were to share most of the source code (without touching the internal header files), for ccache to work you’d have to leave untouched the compiler, the flags, and all the headers of all the dependent libraries… and this often includes the system header files from linux-headers. And even if all these conditions were to hold true, the object files rebuilt in between would have to add up to less than the cache size, or the objects would have expired. If you think that 2GB is a lot, think again, especially if you were to use -ggdb.

Okay, now there are some cases where you might care about ccache because you are rebuilding the same package over and over; that includes patch-testing and live ebuilds. In these cases you should not simply set FEATURES=ccache, but you can instead make use of the per-package environment files. You then have two options: you can do what Portage does (setting PATH so that the ccache wrappers are found before the compilers themselves) or you can simply re-set the CC variable, such as export CC="ccache gcc". Just set it in /etc/portage/env/$CATEGORY/$PN and you’re done.

Now it would be nice if our terrific Documentation team, instead of deciding once again (the last time was with respect to alsa-drivers) that they know better what the developers should support, would understand that stating in the handbook that ccache somehow magically makes normal updates “5 to 10 times faster” is foolish and should be avoided. Unfortunately, the answer to my request hasn’t been what logic would lead you to expect.