While looking at today’s logs I started to wonder whether the current approach is actually sustainable. The main problem is that most of the tinderbox was just created piece by piece as needed to complete a specific task, and it really has never undergone a complete overhaul.
Right before moving to containers it consisted of just a chroot, which caused quite a few issues in the long run, though not extreme ones (some important ones remain, such as the fact that the libvirt testsuite seems to trigger something bad in my kernel, which in turn blocks evolution and gwibber).
Besides the change from a simple chroot to a container (which is very reminiscent of BSD jails, by the way), the rest of the tinderbox has been kept almost identical: there is a script, written by Zac, that produces a list of packages, following these few rules:
- the package is the highest version available for a slot of a package (it lists all slots for all packages);
- the package is not masked, nor are its dependencies (this is important because I do mask stuff locally when I know it fails, for instance libvirt above);
- the ebuild in that particular version is not installed, or it wasn’t re-installed in the past 10 weeks (originally, six weeks, now ten because tests need to be executed as well).
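
As a rough illustration of how these three rules combine, here is a minimal Python sketch. It is not Zac's script: the `Candidate` record, its fields and the `select` function are hypothetical, versions are simplified to tuples, and it assumes that "highest version available" means the highest unmasked version in a slot.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    # Hypothetical record; the real script asks Portage for this data.
    name: str                                # category/package
    slot: str
    version: tuple                           # simplified: compared as a tuple
    masked: bool = False                     # package or one of its deps is masked
    installed_at: Optional[datetime] = None  # last merge of this exact version


def select(candidates, now, max_age=timedelta(weeks=10)):
    """Return the candidates to queue, ordered by package name and slot."""
    best = {}
    for c in candidates:
        # Rule 2: masked packages (or packages with masked deps) are skipped.
        if c.masked:
            continue
        # Rule 1: keep only the highest remaining version of each slot.
        key = (c.name, c.slot)
        if key not in best or c.version > best[key].version:
            best[key] = c

    queue = []
    for key in sorted(best):
        c = best[key]
        # Rule 3: skip versions merged less than max_age ago.
        if c.installed_at is not None and now - c.installed_at < max_age:
            continue
        queue.append(c)
    return queue
```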
This is the main core of the tinderbox, and besides that there is a mostly-simple bashrc file for Portage, which takes care of a few more tests that Portage itself ignores for now:
- check for calls to AC_CANONICAL_TARGET; this is a personal beef against overcanonicalisation in autotools;
- check for bundled common libraries (zlib, libpng, jpeg, ffmpeg, …); this makes it easier to identify possibly bundled libraries, although it has quite a high rate of false positives; having an actual database of the code present in the various packages would be easier;
- check for use of insecure functions; this is just a little extra check for functions like tmpnam() and similar, which ld already warns about;
- single-pass find(1) checks for OSX forkfiles (unfortunately common in the Ruby packages!), for setuid and setgid binaries (to have a list of them) and for invalid directories (like /usr/X11R6) in the packages;
- extra QA trick to identify packages calling make rather than emake (which turns out to be quite useful to identify packages that use make instead of emake -j1 to hide bugs).
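
To make the single-pass idea concrete, here is a small Python sketch of the same kind of scan. It is only an approximation of what the bashrc does with find(1): the `scan_image` function and the `INVALID_DIRS` list are hypothetical, and it assumes that "forkfiles" means the AppleDouble files whose names start with `._`.

```python
import os
import stat

# Assumption: only the directory named in the text is treated as invalid here.
INVALID_DIRS = ("usr/X11R6",)


def scan_image(root):
    """Walk an install image once and collect all three kinds of findings."""
    findings = {"forkfiles": [], "setxid": [], "invalid_dirs": []}
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        for d in dirnames:
            rel = os.path.normpath(os.path.join(rel_dir, d))
            if rel in INVALID_DIRS:
                findings["invalid_dirs"].append("/" + rel)
        for f in filenames:
            path = os.path.join(dirpath, f)
            rel = "/" + os.path.normpath(os.path.join(rel_dir, f))
            # Assumed naming convention for OS X resource-fork files.
            if f.startswith("._"):
                findings["forkfiles"].append(rel)
            # lstat so that symlinks are not followed.
            mode = os.lstat(path).st_mode
            if stat.S_ISREG(mode) and mode & (stat.S_ISUID | stat.S_ISGID):
                findings["setxid"].append(rel)
    return findings
```

Doing everything in one walk matters on large packages, since the image is traversed only once no matter how many checks are attached to it.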
Now, all this produces a few files in the temporary directory, which are then printed so that the actual build log keeps them. Unfortunately this does not really work tremendously well since it requires quite a bit of work to properly extract them.
So I was thinking, what if instead of just running emerge, I run a wrapper around emerge? This would actually help me gather important information, at the cost of spending more time on each merge. At that point, the wrapper could write up a “report card” after each emerge, containing the following information in a suitable format (XML comes to mind):
- status of the emerge (completed or not);
- build log of the emerge (filename, gets copied with the report card itself);
- emerge info for the current merge (likewise) — right now the emerge info I provide is the generic information of the tinderbox, which might differ from the specific instance used by the merge;
- versions of the whole dependency tree: likewise, it’s something that might change between the time the package fails and the time I file the bug;
- CVS revision of the current ebuild: very important to debug possibly fixed and duplicated bugs; this is one of the crucial points to consider before moving to git, where such a revision is not available, as far as I know;
- if the package has failed within functions that we know leave a log with details, those files should be copied together with the report card, and listed (this works for epatch, the functions from autotools.eclass and econf);
- gathering of all the important information as returned by the checks in bashrc and in Portage proper (prestripped files, setXid files, use of make and AC_CANONICAL_TARGET, …).
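
To give an idea of what such a report card could look like, here is a minimal Python sketch using the standard xml.etree.ElementTree module. Nothing here is a settled format: the `build_report_card` function and every element and attribute name are invented for illustration, and the wrapper is assumed to pass in emerge's exit status along with the paths of the files it has already copied.

```python
import xml.etree.ElementTree as ET


def build_report_card(package, exit_status, build_log, emerge_info,
                      dep_versions, ebuild_revision,
                      extra_logs=(), qa_findings=None):
    """Assemble one report card; all tag names are made up for this sketch."""
    card = ET.Element(
        "report-card",
        package=package,
        status="completed" if exit_status == 0 else "failed",
    )
    # Files are referenced by name, since they get copied next to the card.
    ET.SubElement(card, "build-log", file=build_log)
    ET.SubElement(card, "emerge-info", file=emerge_info)
    ET.SubElement(card, "ebuild", revision=ebuild_revision)

    # Versions of the dependency tree at the time of the merge.
    deps = ET.SubElement(card, "dependencies")
    for name, version in sorted(dep_versions.items()):
        ET.SubElement(deps, "package", name=name, version=version)

    # Detailed logs left behind by the function that failed, if any.
    logs = ET.SubElement(card, "failure-logs")
    for path in extra_logs:
        ET.SubElement(logs, "log", file=path)

    # Results of the bashrc and Portage QA checks, grouped by check name.
    qa = ET.SubElement(card, "qa")
    for check, items in sorted((qa_findings or {}).items()):
        node = ET.SubElement(qa, "check", name=check)
        for item in items:
            ET.SubElement(node, "item").text = item
    return card
```

The result can be serialised with `ET.tostring(card, encoding="unicode")` and stored next to the copied logs.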
Once this report card is generated, the information from the workdir is pretty unimportant and can mostly be cleared up; this would allow me to save space on disk, which isn’t exactly free.
More importantly, with this report it’s also possible to check for existing bugs and file new ones almost automatically. That would reduce the time needed to handle the tinderbox reports, and thus the time I have to invest in the reporting side of the job rather than the fixing side (which thankfully is currently handled very well by Samuli and Victor).
More on this idea in the next few days, hopefully with some proof of concept as well.