Gentoo’s Quality Status

So I have complained about the fact that we had a libpng 1.4 upgrade fallout that we could have for the most part avoided by using --as-needed (which is now finally available by default in the profiles!) and starting much earlier to remove the .la files that I have been writing for so long about.

I also had to quickly post about Python breakage — and all be glad that I noticed by pure chance, and Brian was around to find a decent fix to trickle down to users as fast as possible.

And today we’re back with a broken, stable Python caused by the blind backport of over 200k of patch from the Python upstream repository.

And in all this, the QA team only seem to have myself complaining aloud about the problem. While I started writing a personal QA manifesto, so that I could officially requests new election on the public gentoo-qa mailing list, I’m currently finding it hard to complete; Mark refused to call the elections himself, and noone else in the QA team seems to think that we’re being too “delicate”.

But again, all the fuck-ups with Gentoo and Python could have been seen from a mile away; we’ve had a number of eclass changes, some of which broke compatibility, packages trying to force users into using the 3.x version of Python that even upstream considers not yet ready for production use, and an absolute rejection of working together with others. Should we have been surprised that sooner or later shit would hit the fan?

So we’re now looking for a new Python team to pick up the situation and fix the problem, which will require more (precious) Tinderbox time to make sure we can pull this without risking breaking more stuff in the middle of it. And as someone already said to me, whoever is going to pick-up Python again, will have at their hand the need to replace the “new” Python packaging framework with something more similar to what the Ruby team has been doing with Ruby NG — which could have actually been done once and for all before this..

Now, thankfully there are positive situations; one is the --as-needed entering the defaults, if not yet as strongly as I’d liked it to be; another is Alexis and Samuli asking me specifically for OCAML and XFCE tinderbox runs to identify problems beforehand, and now Brian with the new Python revision.

Markos also is trying to stir up awareness about the lack of respect for the LDFLAGS variable; since the profile sets --as-needed in the variable, you end up ignoring that if you ignore the variable. (My method of using GCC_SPECS is actually sidestepping that problem entirely.) I’ve opened a bunch of bugs on the subject today as I added the test on the tinderbox; it’s going to be tricky, because at least the Ruby packages (most of them at least) respect the flags set on the Ruby implementation build, rather than on their own, as it’s saved in a configuration file. This is a long-standing problem and not limited to Ruby, actually. I’ve been manually getting around the problem on some extensions such as eventmachine but it’s tricky to solve in a general way.

And this is without adding further problems as that pointed out by Kay and Eray that I could have found before if I had more time to work on my linking collision script — it is designed to just find those error cases, but it needs lot of boring manual lookup to identify the issues.

Now, I’d like to be able to do more about this, but as you can guess, it already eats up enough of my time that I have even trouble fitting in enough work to cover the costs of running it (Yamato is not really cheap to work on, it’s power-hungry, has crunched a couple of hard-disks already, and needs a constant flow of network data to work clearly, and this is without adding the time I pour into it to keep it working as intended). Given these points, I’m actually going to make a request if somebody can get either one of two particular pieces of hardware to me: either another 16GB of Crucial CT2KIT51272AB667 memory (it’s Registered ECC memory) or a Gigabyte i-RAM (or anything equivalent; I’ve been pointed at the ACard’s ANS-9010 as an alternative) device with 8 or 16GB (or more, but that much is good already) of memory. Either option would allow me to build on a RAM-based device which would thus reduce the build time, and make it possible to run many many more tests.