Even though I’ve heard Patrick complaining about this many times, I never would have been able to assess how much of the tree goes untested if I hadn’t started my own little tinderbox. Now, I’m probably hitting more problems than Patrick because I’m running ~arch (or mostly ~arch) with --as-needed enabled, but it still shows that there is a huge amount of stuff that needs to be fixed, or dropped.
Up to now I’ve been using GCC 4.1, and still hit build failures with it; now I’ve switched to GCC 4.3, even though the tracker already showed a bad situation; and of course there are packages that don’t have bugs opened just yet, because nobody has built them recently.
Still, supporting the new compilers is honestly not my main concern; there are packages that won’t build with GCC 4.3 just yet, like VirtualBox, just as there are packages that still don’t compile with GCC 4.0. What concerns me is that there is stuff that hasn’t been tested at all. For instance, sys-fs/diskdev_cmds, which was marked ~amd64, was totally broken, with fsck.hfs causing a segmentation fault as soon as it was executed (the version that is now available works; the old one has been taken out of the keyworded tree).
Since even upstream sometimes fails, one should also take the packages’ tests into consideration, possibly ensuring that their failures are meaningful, and not just “ah, this would never work anyway”. If you check dev-ruby/ruby-prof, the test suite is executed, but a specific test, which is known to fail, is taken out of it first. This is actually pretty important because it saves me from using RESTRICT to disable the whole test suite, and executing the remaining tests helped me when new code was added to support rdtsc on AMD64, which was broken. The broken code never entered the tree, by the way.
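The idea of dropping one known-bad test while still running the rest can be sketched in plain shell. This is only an illustration, not what the dev-ruby/ruby-prof ebuild actually does: the function name `run_tests_except`, the `test_*` file naming and the runner argument are all assumptions made up for the example.

```bash
# Hypothetical helper: run every test_* file in a directory with the given
# runner, except the files named as known failures.
# Usage: run_tests_except <test-dir> <runner> [known-failure ...]
run_tests_except() {
    local test_dir runner f base skip skipped status
    test_dir=$1; runner=$2; shift 2
    status=0
    for f in "${test_dir}"/test_*; do
        [ -e "${f}" ] || continue       # the glob matched nothing
        base=${f##*/}
        skipped=0
        for skip in "$@"; do
            [ "${base}" = "${skip}" ] && skipped=1
        done
        if [ "${skipped}" -eq 1 ]; then
            echo "SKIP ${base} (known failure)"
            continue
        fi
        # ${runner} is left unquoted on purpose so it may carry arguments.
        if ${runner} "${f}"; then
            echo "PASS ${base}"
        else
            echo "FAIL ${base}"
            status=1
        fi
    done
    return "${status}"
}
```

The point of the design is that a failure in any test that is not explicitly excluded still makes the whole run fail, so the remaining tests keep their value.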
Unfortunately, doing a run with FEATURES=test enabled is probably going to waste my time, since I expect a good part of the tree to fail with that; with some luck, if Zac implements a softfail for tests for me, I’ll be able to do such a run in the next few months. I wonder if the run will be faster this time: I’ve moved my chroots partition to ext4(dev) rather than XFS, and it seems to be seriously faster. I guess once 2.6.28 is released I’ll move the rest of my XFS filesystems to ext4 (not my home directory yet though, which is ext3, nor the multimedia stuff, which is going to remain HFS+ for now).
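To make the “softfail” wish concrete, here is a minimal sketch of the behaviour I have in mind: a failing test run gets recorded, but the build carries on. It is not Portage code and not Zac’s implementation; `soft_test` and the log file argument are hypothetical names for illustration only.

```bash
# Hypothetical sketch: run a test command, log a failure, never abort.
# Usage: soft_test <log-file> <command> [args ...]
soft_test() {
    local log status
    log=$1; shift
    status=0
    "$@" || status=$?                   # remember the exit code, do not bail out
    if [ "${status}" -ne 0 ]; then
        echo "test failure (exit ${status}): $*" >> "${log}"
    fi
    return 0                            # the caller always continues
}
```

After a full run, the log would list exactly which test suites failed, so they can be looked at one by one instead of stopping the whole tinderbox at the first failure.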
My build run also has some extra bashrc tests, besides the ones I’ve already written about, that is, the checks for misplaced documentation and man pages. One checks for functions from the most common libraries (libz, libbz2, libexpat, libavcodec, libpng, libjpeg), to identify possibly bundled-in copies of those; another checks for the functions that are considered insecure and dangerous by libc itself (those for which the linker will emit a warning). It is interesting to see their output, although a bit scary.
Hopefully, the data that I gather and submit to Bugzilla for these builds will allow us to have a better, more stable, and more secure Portage tree as time goes by. And hopefully ext4 won’t fry my drives.
Meh, yes you have a point but we can’t/don’t/won’t do anything if people don’t file bugs. So, here we are, you or me add packages that we would like in the tree, then we leave the project and no one really cares for the package ever again. At that point, someone like you steps along and says, “hey, this is broke…bummer, let me fix it” then all is well again. So…meh.
Agreed, but sometimes I wish we (me included) would be much stricter on what we actually add to the tree, or add a keyword to. Like checking the test suite, like contacting upstream about the various issues.
I’ve been doing this much more lately than I’ve done in the past, I admit. For instance libarchive 2.5.903a has not been in the tree because of a bsdcpio failure (and I’ve just noticed 2.5.904a and am going to test it now). As I said, it saved my ass on ruby-prof, since I would have otherwise added a very broken version to the tree, if I was still using the gem (as you may have noticed, my Ruby ebuilds are switching _away_ from gems).
So in general, true, we cannot do much for packages nobody uses; on the other hand, if we were stricter to begin with maybe they wouldn’t rot too soon. Like testing them with the latest version of GCC available rather than the stable one, even if they are not supposed to go stable just yet.
Can you share your bashrc files somewhere, please? And btw thanks for your work, it’s good to see some devs who are interested in QA.
…and for the case that everything works at time of initial commit then regresses as the toolchain evolves and no one knows/cares? For that we need automation and/or human involvement (read: bug reports).
True, but I’m sure that hfsplusutils and diskdev_cmds never worked fine on amd64; there is no way they would have, since they expect integer types of a specific bit size, yet the 32-bit type was mapped to “unsigned long”, which is obviously wrong.
I don’t want to fixate just on those two packages, but I’m sure there are more packages like that in the tree. Hey, I’m sure I added some of those one way or another!
The problem is a problem of framework: not everybody runs FEATURES=test builds, and even when they do, they obviously cannot cover all the architectures. So if a regression is added on a different architecture, it’s going to go unnoticed; if I had tested ruby-prof on x86, I would never have seen the problem with the new release, for instance.
And I agree, we need automation and human involvement, and also a new, better software structure around us. The softfail of tests, so that one can actually look at them, is one of the changes I would make to the whole software framework. To that I’d finally add a packages.features file, rather than using bashrc hacks, so that one can enable the test feature just on a subset of packages; for instance I’m sure I don’t want to merge *my* packages with FEATURES=test turned off, but I want to merge almost all the rest of them with it off, on my workstation.
Olivier, the bashrc is actually quite simple:
<typo:code>
post_src_install() {
    # Directories that should never show up in the installed image.
    for invalid_dir in /usr/man /usr/info /usr/X11R6 /usr/doc /usr/locale; do
        if [[ -d "${D}"${invalid_dir} ]]; then
            ewarn "Flameeyes QA Warning! ${invalid_dir} present in image!"
            find "${D}"${invalid_dir} >/dev/stderr
        fi
    done

    # Symbols defined (+) in the image that hint at bundled library copies.
    rm -f "${T}"/flameeyes-scanelf-bundled.log
    for symbol in adler32 BZ2_decompress jpeg_mem_init XML_Parse avcodec_init png_get_libpng_ver; do
        scanelf -qRs +$symbol "${D}" >> "${T}"/flameeyes-scanelf-bundled.log
    done
    if [[ -s "${T}"/flameeyes-scanelf-bundled.log ]]; then
        ewarn "Flameeyes QA Warning! Possibly bundled libraries"
        cat "${T}"/flameeyes-scanelf-bundled.log
    fi

    # Undefined (-) references to functions that libc flags as insecure.
    rm -f "${T}"/flameeyes-scanelf-insecure.log
    for symbol in tmpnam tmpnam_r tempnam gets sigstack getpw getwd mktemp; do
        scanelf -qRs -$symbol "${D}" >> "${T}"/flameeyes-scanelf-insecure.log
    done
    if [[ -s "${T}"/flameeyes-scanelf-insecure.log ]]; then
        ewarn "Flameeyes QA Warning! Insecure functions used"
        cat "${T}"/flameeyes-scanelf-insecure.log
    fi
}
</typo:code>
Even though it runs scanelf a number of times (like Portage does), thanks to the fact that it uses @mmap()@ and I have lots of RAM, it takes very little time after the first run to check all the rest.
I’d use @scanelf@’s @-gs@ options, but it wouldn’t report multiple symbols as it is; that’s why I do multiple runs. On the other hand, once Ruby-Bombe supports reading from memory-mapped files, I’ll probably implement the thing using Ruby-Elf and a @rbel-symgrep@ tool.
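Going back to the wish for enabling the test feature only on a subset of packages: the selection part of such a bashrc hack could look roughly like the sketch below. Everything here is hypothetical (the `wants_tests` name, the list file format, and the path in the usage line); it only shows how a package would be matched against a list, not how Portage would then be told to run the tests.

```bash
# Hypothetical sketch of the selection logic only.
# The list file holds one category/package per line; '#' starts a comment.
# Usage: wants_tests <category/package> <list-file>
wants_tests() {
    local atom list line rest
    atom=$1; list=$2
    [ -r "${list}" ] || return 1
    # read splits on whitespace: the first word lands in $line,
    # anything after it (for instance a trailing comment) in $rest.
    while read -r line rest || [ -n "${line}" ]; do
        line=${line%%#*}                # drop comment-only lines
        [ -n "${line}" ] || continue
        [ "${line}" = "${atom}" ] && return 0
    done < "${list}"
    return 1
}
```

In a bashrc this would be called with Portage’s own variables, something like @wants_tests "${CATEGORY}/${PN}" /etc/portage/test-packages.list@, where the list path is again just an assumption.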
As a long-time Gentoo user, I’m really glad you’re poking around in all these corners. Here is a hearty THANK YOU!
As Eric and others said above (mostly Eric), I want to thank you for starting QA and automated testing of builds. This will (if it grows) probably help Gentoo users get rid of the ever so common “emerge failed” error. Great work!
Thanks guys, but I cannot take all the glory :)
Patrick has been doing this for a long time for stable packages, although I guess most developers didn’t really take all of his bugs into consideration because of the tinderbox hitting corner cases.
And Mark, as QA lead, is also doing an important policy job. I just try to bring the problems to the attention of other developers.
By the way, a *huge* thanks to Zac, who sent me a Portage patch to implement the test softfail. I’ll try it out soon, although for now it’s still building without FEATURES=test.
FWIW I’ve been using ext4 for /usr/portage and /var/tmp for a month or so now without any problems.