One of the issues that I’m trying to tackle with my tinderbox is that we have a varying degree of control among different ebuilds. This is one thing that I think is a major problem in Gentoo: while a lot of users are brought to us by the idea of being able to choose the flags used to build the software, lately we have been backing away from that. Not only are packages starting to feature custom-cflags USE flags (or custom-cxxflags for the Qt packages), but we also strip, filter and randomly mangle flags.
Now, of course there are quite a few compiler flags that we don’t want users to enable, but as Mark has been repeating over and over and over, if a flag breaks a package that it is not supposed to break, then we should be tackling the issue at the compiler level and fixing it there. And on the other hand, I wouldn’t care if users using silly flags get broken software. As for the idea that upstream will not support our users… well, they shouldn’t, to begin with; problems should first filter through us, if only we had enough people to work on the issues.
But even skipping over the flags there are other issues: USE flags, debug information, installation paths, slotting, alternative software and so on. As David said in a previous post, there is no way we can test all situations beforehand, even if that would make things much easier for our users. While binary distributions have a limited set of configurations that can be tested somewhat easily, there is a practically infinite amount of variation in Gentoo systems, which makes it much more difficult to identify all the issues beforehand (and this is even without factoring in the Gentoo/Alt project, with Gentoo/FreeBSD and the prefix support!).
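To put a rough number on that variation, here is a minimal sketch in Python. It assumes every USE flag is an independent on/off switch, which is not quite true in practice (some flags depend on or exclude each other), and the package names and flag counts are made up for illustration; they are not taken from the real tree.

```python
from itertools import product


def count_use_configurations(flags_per_package):
    """Count the distinct USE configurations for a set of packages.

    flags_per_package maps a package name to its number of USE flags.
    Each flag is treated as an independent boolean, so a package with
    n flags has 2**n configurations, and packages multiply together.
    """
    total = 1
    for flag_count in flags_per_package.values():
        total *= 2 ** flag_count
    return total


def enumerate_configurations(flags):
    """Yield every on/off assignment for a list of USE flag names."""
    for values in product((False, True), repeat=len(flags)):
        yield dict(zip(flags, values))


if __name__ == "__main__":
    # Hypothetical packages and flag counts, for illustration only.
    example = {"app-misc/foo": 8, "dev-libs/bar": 5, "media-libs/baz": 12}
    print(count_use_configurations(example))

    # Enumerating is only feasible for a handful of flags.
    for config in enumerate_configurations(["debug", "test"]):
        print(config)
```

Even three packages with a couple dozen flags between them already give tens of millions of combinations, and that is before compiler flags, slots or alternative implementations enter the picture.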
I can repeat in every post that the key to proper software is testing, but this is not going to work when there are so many packages failing tests, with bugs open, and nobody looking at them. I am guilty of this too: there are quite a few packages that I maintain for which I don’t run all the tests properly, and I have never finished the uif2iso testsuite which I started working on almost six months ago! We should really start refusing to mark packages stable when they fail their tests, and bump the priority of test failures for packages that are stable already.
Of course, it might well be that upstream doesn’t test enough pieces already, and that something in the environment will break their software; shit happens, we can track it down, and upstream can add further tests to make sure it does not happen again! I’m sure that lots of developers do like this idea. And reading Eric’s interview, I guess that Red Hat and Fedora are working on making more use of automated tests. Why shouldn’t we?
Okay, this is one more post I have written instead of sleeping, again; at least I have been watching Bill Maher… love that talk show!