I have already explained why a tinderbox is not enough, but I think I should revisit the topic and write again about why Patrick, Ryan and I can’t find all the issues, even if we combine our efforts.
The first problem is the sheer number of package combinations: the different USE flags enabled, the different arches, the different merge orders, the different packages installed (or the way they are installed). All these differences combine to produce far too many combinations to test in a lifetime. Of course we can probably find the most outstanding bugs quite quickly in a first pass with default USE flags; thanks to EAPI 2 and USE dependencies we can also keep a decently clean record of what needs to be enabled. On the other hand, it would be interesting to try disabling all the optional support (except what is strictly needed) and see how the ebuilds behave.
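To put a rough number on that explosion, here is a minimal Python sketch. It assumes every USE flag is an independent on/off switch, which ignores USE dependencies and arch differences, and the flag names and the "100 packages with 5 flags each" figure are made up for illustration rather than taken from the real tree.

```python
from itertools import product


def use_flag_combinations(flags):
    """Yield every on/off assignment for the given USE flags."""
    for states in product((False, True), repeat=len(flags)):
        yield dict(zip(flags, states))


def count_combinations(flag_counts):
    """Count configurations when package i has flag_counts[i] independent flags."""
    total = 1
    for n in flag_counts:
        total *= 2 ** n
    return total


# Hypothetical package with three optional features.
flags = ["ssl", "ipv6", "doc"]
combos = list(use_flag_combinations(flags))
print(len(combos))  # 8

# Hypothetical tree: 100 packages with 5 flags each.
# The result has far more digits than anyone could ever build and test.
print(count_combinations([5] * 100))
```

Even this toy model leaves out merge order and architectures, each of which multiplies the total again.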
Then there is the problem of architectures. In the past the architecture with the most keywords was x86, but I’m not really sure this is still true nowadays, given the growth of amd64-based systems. I know I don’t usually keyword for ~x86 all the stuff I add to Portage, since I don’t run x86 anywhere. And while most such packages can probably be compiled and used on x86, some can’t, and there are issues that don’t apply to x86 at all but only to other systems.
There are problems with packages that provide kernel modules, because they tend to break badly from one kernel release to the next. (I only help maintain one such package nowadays, iscsitarget, and I’m usually good enough to get it working properly a day or two after a new kernel is released, which means this weekend I’ll probably be doing another patch.) I also had to blacklist a few packages that are only available for 2.4 kernels (why do we have a kernel 2.4 profile but none for 2.6? And why don’t we mask them at the profile level? No idea).
What about alternative packages? Collisions between packages create a bit of a problem when they are solved by blockers instead of allowing side-by-side installation (and sometimes you have no other choice than to have one block the other; see the two PAM providers). And there are still lots of packages that fail to merge because of non-blocked collisions, sometimes simply because of too-generic names; in my case they are then re-added to the tinderbox build queue, since they never end up installed.
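The kind of collision described above can be illustrated with a small Python sketch. This is not how Portage or the tinderbox detects collisions; it only assumes we already have, for each package, the set of paths it wants to install. The package names and paths are hypothetical.

```python
from collections import defaultdict


def find_collisions(package_files):
    """Return paths claimed by more than one package, with their owners."""
    owners = defaultdict(set)
    for package, paths in package_files.items():
        for path in paths:
            owners[path].add(package)
    return {
        path: sorted(pkgs)
        for path, pkgs in owners.items()
        if len(pkgs) > 1
    }


# Hypothetical packages; both install a too-generic /usr/bin/convert.
package_files = {
    "app-misc/foo": {"/usr/bin/foo", "/usr/bin/convert"},
    "media-gfx/bar": {"/usr/bin/bar", "/usr/bin/convert"},
}

collisions = find_collisions(package_files)
print(collisions)  # {'/usr/bin/convert': ['app-misc/foo', 'media-gfx/bar']}
```

Without a blocker between the two packages, whichever one is merged second is the one that fails.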
Then there is the problem of overlays: my tinderbox can only check packages that are in the main tree (and not masked); all the packages in overlays get ignored because they are, obviously, not added to the tinderbox. And the sheer number of overlays makes it practically impossible to deal with all of them. Let’s not even start thinking about the combinations created by adding different overlays, and by the order in which they are added (which is one more argument against splitting our tree into multiple overlays!).
Are you not convinced yet? Well, I really wish I had more convincing numbers, but I don’t. I only know that the amount of work my tinderbox effort involves (and mine is likely the least sophisticated one) is probably just a minuscule part of the effort needed to have real quality control in Gentoo. And even though I can file, test, apply and close bugs, I cannot solve all the issues, because there are way too many variables in play.
Anyway, I’m taking a break now because my head is tremendously tired, and I’ve been filing bugs, working and scanning documents all day long. I could use some play time instead now…