Bigger, better tinderbox

Well, not in the hardware sense, not yet at least, even though it’d be wicked to have an even faster box here (with some due control of course, but I’ll get back to that later). I’ll probably get some more memory and AMD Istanbul CPUs when I have some cash surplus, which might not be soon.

Thanks to Zac, and his irreplaceable help, Portage gained a few new features that made my life as “the tinderbox runner” much easier: collision detection is now saved in the build/merge log, so I can grep for collisions as well as for failures; the die hook is now smarter, working even when the failure comes from the Python side of Portage (like collisions), and it’s accompanied by a success hook. The two hooks are what I’m using to post the whole progress of the tinderbox to identi.ca (so you can follow that account if you feel like being “spammed” by the proceedings of the tinderbox; the tags allow a quick look at how things are going on average).

But it’s not just that; if you remember my search for a run control for the tinderbox, I’ve implemented one of the features I talked about in that post without any fancy, complex application: when a merge fails, the die hook masks the failed package (the exact revision), and this has some very useful domino effects. The first is that the same exact package version can only ever fail once in the same tinderbox run (I cannot count the times my tinderbox wasted time rebuilding stuff like mplayer, asterisk or boost and failing, as they are dependencies of other packages), and that’s what I was planning for; what I got instead is even more interesting.

While the tinderbox already runs in “keep going mode” (which means that a failed, optional build will not cause the whole request to be dropped, and applies mostly to package updates), by masking specific, failing revisions of some packages, it also happens to force downgrades, or stop updates, of the involved packages, which means that more code is getting tested (and sometimes it gets luckier as older versions build where newer don’t). Of course the masking does not happen when the failure is in the tests, as those are quite messed up and warrant a post by themselves.
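To make the masking idea concrete, here is a minimal sketch in Python. It is not the actual hook (Portage hooks are normally shell snippets, and the real one is not shown in this post); the function names, the mask file path, and the way category, package version and phase are passed in are all assumptions made for illustration. The only Portage convention it relies on is that a mask line of the form =category/package-version-revision matches exactly one revision.

```python
from pathlib import Path


def mask_atom(category: str, pf: str) -> str:
    """Return a mask atom that pins one exact version and revision."""
    return f"={category}/{pf}"


def record_failure(mask_file: Path, category: str, pf: str, phase: str) -> bool:
    """Append a mask for a failed package, unless the failure was in tests.

    Hypothetical helper: returns True only when a new mask line is written.
    """
    if phase == "test":
        # Test failures are deliberately not masked, as described above.
        return False
    atom = mask_atom(category, pf)
    existing = mask_file.read_text().splitlines() if mask_file.exists() else []
    if atom in existing:
        # Already masked earlier in this run; nothing to do.
        return False
    with mask_file.open("a") as fh:
        fh.write(atom + "\n")
    return True
```

Because only the exact revision is masked, the resolver is still free to pick an older or different version of the same package on the next request, which is where the forced downgrades come from.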

Unfortunately I’m now wondering how taxing the whole tinderbox process is getting: in the tree there are just shy of 14 thousand packages. Of these, some will merge in about three minutes (this is back-to-back from the call to emerge to the end of the process; I found nothing going faster than that), and some rare ones, like Berkeley DB 4.8, will take over a day to complete their tests (db-4.8 took 25 hours, no kidding). Accepting an average of half an hour per package, this brings us to 7 thousand hours, or roughly 300 days, almost a year. Given that the tinderbox is currently set to re-merge the same package on a 10-week schedule, this definitely gets problematic. I sincerely hope the average is more like 10 minutes, even though that would still take about 14 weeks per pass, which means an infinite rebuild. I’ll probably have to find the real average by looking through the total emerge log, and at the same time I’ll probably have to reduce the rebuild frequency.
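The estimate above is easy to redo once a real average is known. The following Python sketch pairs start and completion lines in an emerge log to get per-package durations, then turns an average into days per full pass. The regular expressions assume log lines shaped like "TIMESTAMP:  >>> emerge (N of M) category/package-version to /" and "TIMESTAMP:  ::: completed emerge (N of M) category/package-version to /"; that layout, and the function names, are assumptions to check against your own log.

```python
import re

# Assumed line layout of the emerge log; adjust if your log differs.
START = re.compile(r"^(\d+):\s+>>> emerge \(\d+ of \d+\) (\S+) to ")
DONE = re.compile(r"^(\d+):\s+::: completed emerge \(\d+ of \d+\) (\S+) to ")


def merge_durations(lines):
    """Yield (package, seconds) for each start/completed pair in the log."""
    started = {}
    for line in lines:
        m = START.match(line)
        if m:
            started[m.group(2)] = int(m.group(1))
            continue
        m = DONE.match(line)
        if m and m.group(2) in started:
            yield m.group(2), int(m.group(1)) - started.pop(m.group(2))


def full_pass_days(packages, avg_minutes):
    """Days needed to merge every package once, back to back."""
    return packages * avg_minutes / 60 / 24
```

With 14,000 packages, full_pass_days(14000, 30) comes to roughly 292 days and full_pass_days(14000, 10) to roughly 97 days, so even the optimistic average does not fit in a 70-day (10-week) schedule. Failed merges never log a completion line, so they are simply left out of the average in this sketch.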

Again, the main problem turns out to be parallel make: I’d expect the load of the system to be pretty high, while in practice it always stays below 3. Quite a few build systems, including the one for Haskell extensions, Python’s setuptools, and similar, do not seem to support parallel builds (in the case of setuptools, it seems that it calls make directly, thus bypassing Gentoo’s emake wrapper), and quite a few packages force serial make (-j1) anyway.

And a note here: you cannot be sure that calling emake will give you parallel make; besides the already-discussed “jobserver unavailable” problem, there is the .NOTPARALLEL directive, which instructs GNU make not to build in parallel even though the user asked for -j14. I guess this is going to be one further thing to look for when I start on the idea of distributed static analysis.
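As a rough idea of what such a check could look like, here is a small Python sketch that walks an unpacked source tree and lists the makefiles declaring .NOTPARALLEL. The list of file names and the function name are assumptions chosen for illustration, and a plain text match like this will miss directives that only appear after variable expansion or in generated makefiles.

```python
import os
import re

# Matches the special target at the start of a line, e.g. ".NOTPARALLEL:".
NOTPARALLEL = re.compile(r"^\.NOTPARALLEL\s*:", re.MULTILINE)

# Assumed set of file names worth scanning; extend as needed.
MAKEFILE_NAMES = ("Makefile", "makefile", "GNUmakefile", "Makefile.in", "Makefile.am")


def find_notparallel(root):
    """Return sorted paths of makefiles under root that declare .NOTPARALLEL."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name in MAKEFILE_NAMES or name.endswith(".mk"):
                path = os.path.join(dirpath, name)
                with open(path, errors="replace") as fh:
                    if NOTPARALLEL.search(fh.read()):
                        hits.append(path)
    return sorted(hits)
```

Running this over the work directory of each package after the unpack phase would give a list of candidates to inspect by hand; commented-out directives are ignored because the pattern is anchored at the start of the line.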

4 thoughts on “Bigger, better tinderbox”

  1. For the .NOTPARALLEL problem, what about patching GNU make to print a diagnostic on stderr when that directive blocks parallelism, so you can find such packages automatically as they are hit? This should be easy to do, and could even be extended to allow an environment variable to force GNU make to ignore the directive, similar to how you are already using the make-as-function trick to parallelize ebuilds that are normally not parallel.

  2. Uhm, isn’t a fast machine to run tests some type of infrastructure for which Gentoo as an organization might specifically call for donations?

  3. @Radtoo: I’m not a programmer, nor would I ever speak for Diego, but I do have a few comments which might help clarify some points.

     Diego has described his hardware {Yamato} in some prior posts and it’s rather high end for a single user. Roughly speaking, if I recall his description, it’s between 3 and 4 times more capable than what I run. I have an AMD Phenom II 9600 (4 cores) with 8 GB of RAM. I’m not set up for RAID, nor do I use any flavor of SCSI {both used in Yamato}. Yamato is really quite the beast.

     While Diego has touched upon the .NOTPARALLEL and -j1 issue, the problem has a few more parts than that.

     Over the last few months, for various reasons, I’ve run an “emerge -e @world” several times. I’ve also run “top -d10” on a second console in order to monitor progress. These are the pertinent observations.

     1) The parallelism of the emerge process is only available during the actual compiling of code. The unpacking, ‘make’ tests, moving of binaries, stripping, rebuilding of ld and any other setup requirements are all done linearly. This means that there is an absolute minimum of time spent per package where only a single processor is used. You can see this on any multiprocessor system simply by watching what’s taking place and watching what ‘top’ reports regarding user CPU utilisation. On a 4 core system like mine with only “emerge” running from the CLI, “top” reports no more than ~25% user CPU utilisation for all non-compile parts of an ebuild.

     2) I was surprised and dismayed at the number of packages which do not make use of parallelism during the compile phase. Packages which do make use of -j greater than one will show percentages roughly equal to even breaks by number of cores in use, i.e. ~25%, ~50%, ~75% to a maximum of ~97% user CPU utilisation.

     Assuming that my system represents a fairly common mix of packages, my observations suggest that between a third and half of all common packages do not make use of parallelism for the compile stages. This estimate may be high because for small packages, parallelism simply may not have a chance to kick in, or it may be too short to see in my reporting window.

     Since I’m not a programmer, I can’t confirm that for you by parsing the relevant ebuilds. On the other hand, “top” makes it pretty clear when more than one core is in use. For many packages, the biggest user of wall clock time isn’t the compiling phase, it’s the “make” test phase and other linear portions of an ebuild. Consider the call to rebuild the kde database for every single kde based package and the equivalent call for every gnome based package. All these add to the total time and they are all linear in execution.

     The reality is that for smaller packages that fit within available memory, the linear portions of a given ebuild are not substantially faster on my 4 core Phenom II 9600 than they were on my original single core Athlon 3000.

     When parallelism kicks in, then yes, I see a real drop in wall clock time. But that doesn’t happen as often as I would have hoped when one is emerging the world.

  4. @Radtoo Gentoo’s infrastructure is currently dedicated to other tasks, and I’d sincerely rather see them working on providing a stable files archive than on tinderboxing.

     @dufeu Actually, Yamato is not using SCSI, and it only uses RAID-1 for the data to guard against disk failures, not for performance reasons (it would be software RAID anyway, so it wouldn’t really help performance). But you’re right about the linearity problems that ensue.

     As for cheating .NOTPARALLEL, the make cheat is there to find where emake could be used. I’m not going to cheat emake -j1, but I am assuming that most make calls are mistaken and should rather be emake (and this is indeed true: a lot of calls to non-parallel make install work flawlessly with parallel make).
