From scratch: about my tinderbox

I know that quite a few people are still unsure about what a tinderbox is, what it does, and especially why they should care enough for me to bother them every other day about it on my blog (and by extension on Planet Gentoo and the Gentoo.org homepage, which syndicate it). So I’m now trying to write a longer article that will hopefully cover all the important details.

What is “a” tinderbox

The term tinderbox, as reported by Wikipedia, has come into general use starting from the software developed by Mozilla to perform continuous integration on a given software project. This doesn’t really help much, because the term “continuous integration” might still mean nothing to users. So I’ll try to explain that from scratch as well.

Gentoo users, especially those running the unstable branch, are unfortunately used to breakage caused by build failures after a library update or other similar changes to the environment. The same problem is increasingly present during the development of any complex software. It gets even worse there: developers might not see any build failure on their own machine and still introduce “buggy” code that breaks on others, as not all distributions have the same environment, the same libraries, the same filesystem layout, and so on.

And this is while limiting the scope to a single operating system (Linux in this case), without considering the further problems caused by other software platforms, such as FreeBSD (which is still a Unix system), Mac OS X, and Windows. This is one of the biggest problems with multi-platform development, and it makes writing software that works correctly across these systems definitely non-trivial. I could write more about that topic, but I’ll leave that for another time.

So that developers don’t have to manually get their code onto multiple machines, with various combinations of operating systems and versions, to ensure that it still builds properly after every change is merged, the smart approach was to invent the concept of continuous integration. To sum it up, think about what I just said (getting the code onto multiple machines, with different operating systems or versions of operating systems, then building and eventually executing it), but done by software rather than by a human.

Mozilla’s software for continuous integration is called Tinderbox, and the name has stuck in Gentoo to refer to the various attempts we have made at setting up some continuous integration testing. I’m not sure why the name stuck in the Gentoo development team, but so it is, and so I’m keeping it.

Why is it “my” tinderbox

As I just said, Gentoo has made more than one attempt in the past at making continuous integration a major player in its development model. I think at least four people have separately worked on various “tinderboxes” in the past five years, both as proper Gentoo projects and as outside, autonomous efforts. So I cannot call it “the Tinderbox”, but rather “my tinderbox”, to specify which one I mean among those that were and are.

Right now, we have three actively maintained continuous integration efforts called tinderboxes: my own is the latest to join the party; before that we already had Patrick’s, and before that the Gentoo tinderbox, which runs on proper Gentoo infrastructure boxes.

As I already rambled on about, this “duplication” of efforts is not a problem, as the three current systems have different procedures and different targets. In particular, the “infra” Tinderbox is mostly there to ensure that the packages in the system set work for most major architectures, to provide reverse dependency information, and to make available binary packages useful for system recovery in case of a major screw-up. Patrick and I, on the other hand, are using our tinderboxes to identify bugs and report them to developers who might not have known about them otherwise.

What my Tinderbox *does*

I’ve written more or less how it works, although that was really going into the details of what I do to keep it working. As far as users are concerned, those details are mostly unimportant.

What my Tinderbox does, put simply, is build the whole Portage tree; at least that’s the theoretical target. Unfortunately, as many have noted before, it’s a target that no single box in the world is likely to reach in our lifetime, let alone continuously. Continuous integration tools are designed to take away from developers the time-consuming task of trying their code on different platforms (software or hardware, it doesn’t matter); even though I’m limiting their scope to testing Gentoo, the possible field of variations is limitless.

Not only do we support at least two operating systems (Linux and FreeBSD), a number of architectures (over a dozen), and two branches (stable and unstable), but by the very essence of Gentoo, no installed system is ever identical to another: USE flags, CFLAGS, package sets, accepted packages from the unstable branches, overlays… Testing all these variables is impossible, so my tinderbox reduces the scope (a lot), while still trying to be as broadly applicable as possible.

It runs over the unstable x86 branch of the tree (the ~x86 keyword), along with some packages that are masked by default and that I unmask on purpose (mostly toolchain-related). It uses very bland CFLAGS, forced --as-needed, and a common USE flag setup. It runs tests, but it doesn’t let their failure stop the merge. It enables USE flags (for dependencies) as needed to get as many packages installed as possible.

The bulk of the work is done by a single loop that tries to install, one by one, all the packages that haven’t been tested in the past two and a half months. Portage is left to remove conflicting packages as it sees fit. Sometimes packages are rejected for various reasons (they collide with another package, there is a conflict in the needed USE flag settings between different packages, or they require very old software versions that would in turn cause other problems), so instead of having the full Portage tree available for ~x86, I can only get a subset, though the biggest subset I can manage.
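To make that loop a bit more concrete, here is a minimal sketch in Python of how the selection could work. It is not the actual tinderbox script: the `last_tested` mapping, the 75-day cutoff standing in for “two and a half months”, and the `try_install` callback are all assumptions made up for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical cutoff standing in for "two and a half months".
RETEST_AFTER = timedelta(days=75)


def packages_to_test(all_packages, last_tested, now):
    """Return the packages never tested, or last tested before the cutoff.

    `last_tested` maps a package name to the datetime of its last test run;
    packages missing from it have never been tested.
    """
    cutoff = now - RETEST_AFTER
    return [
        pkg for pkg in all_packages
        if pkg not in last_tested or last_tested[pkg] < cutoff
    ]


def run_loop(all_packages, last_tested, try_install, now=None):
    """Try each stale package one by one, recording the outcome.

    `try_install` is a hypothetical callback that returns True on success;
    a failure is recorded rather than aborting the whole run.
    """
    now = now or datetime.now()
    results = {}
    for pkg in packages_to_test(all_packages, last_tested, now):
        results[pkg] = try_install(pkg)
        last_tested[pkg] = now
    return results
```

The point of the sketch is only that a failed install is an outcome to record, not a reason to stop: the loop moves on to the next package either way.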

The results

Of course, to understand why the Tinderbox is an important project, why I’m investing so much time in it, and why I’m looking for other people to invest part of their time (or resources) to help me, it is necessary to look at what the by-products of its execution are.

I’m not making my binpkgs available to the public (as the infra-based tinderbox does), mostly because I’m not allowed to: first, I don’t enable the bindist USE flag, which means I’m allowing build configurations that are non-redistributable because of licenses or trademarks; second, that’s not my target at all, and the packages may very well not be fully functional.

The accomplishments coming from my efforts are, generally speaking, bug reports. Not only do I scan through the failure logs for those ebuilds that fail to build in this environment, but I also scan for failing test procedures and other similar quality issues. But again, this is not the result end users are looking for, as bug reports don’t mean anything to them.
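A rough sketch of this kind of log scanning, in Python, could look like the following. The two patterns and their category names are hypothetical examples chosen for illustration, not the checks the tinderbox actually runs; real build logs would need many more of them, tuned to what the build tools really print.

```python
import re

# Hypothetical patterns: each maps an issue category to a regular
# expression that flags a suspicious line in a build log.
PATTERNS = {
    "build-failure": re.compile(r"^\s*\*\s*ERROR:.*failed"),
    "test-failure": re.compile(r"\b\d+ of \d+ tests? failed", re.IGNORECASE),
}


def scan_log(lines):
    """Return (line_number, category, text) for every matching line."""
    findings = []
    for number, line in enumerate(lines, start=1):
        for category, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((number, category, line.strip()))
    return findings
```

Keeping the line number with each finding makes it easier to quote the relevant part of the log when filing the bug.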

The results that matter to users are, instead, the fixes for these bug reports: you can be confident that documentation is installed in the place where you expect it; you can be confident that if you need to debug a problem, you only need to follow the guide, without the package’s build system getting in your way; you can be confident that a package that reports itself as installed is indeed installed and does work.

Obviously, there are still problems that my Tinderbox effort won’t be finding anytime soon (for instance missing dependencies, which are instead the speciality of Patrick’s Tinderbox), and test procedures are rarely so well thought out, designed and realised that they leave no runtime problems to be found. Still, I think that keeping it running, continuously integrating as many ebuilds as it can, is definitely going to help Gentoo’s general quality and stability.

And there is another thing that users are most likely interested in: testing of the future toolchain. As I said before, besides using the ~x86 visible ebuilds, my Tinderbox is also running some masked packages, which nowadays include gcc-4.4 and autoconf-2.65. Using these versions, not yet available to the general public, allows me to find failure cases related to them before they are released. This serves the double role of assessing the impact that any given change might have on users (by knowing how many packages will break if the package is unmasked), and correcting those problems before they have the chance to waste users’ time.

Helping

Continuous integration can’t be said to be a “boring” task, in the sense that every day you might stumble into a new problem for which you’ll have to devise a solution. On the other hand, it’s a heavy duty, both for the machines performing the job and for the people who have to look through the logs. Even the sole task of filing the bugs is time-consuming, and that’s without counting the problems of dealing with the human factor, which often take much more time.

Furthermore, there are a number of practical issues tied to running this kind of continuous integration: the time needed to build packages limits the number of packages that can be tested in a given time frame, and unfortunately reducing that time often means pushing for higher-end machines (a lot of the work is I/O bound, and not all software can build or execute tests in parallel), which tend to have high price tags and to consume more power (and thus cost more over the long term) than your average machine.

For this reason both Patrick and I tend to ask for material help: we are bearing the costs of these efforts directly.

It has been suggested a number of times that the Tinderboxes be hosted on Gentoo infrastructure hardware, but there have been problems with this before. On the other hand, I think that the way I’m currently proceeding, for my effort at least, could open up the possibility for that to happen. The one big issue that makes me uncomfortable with running the tinderbox remotely is easy access to the logs, for identifying problems and, especially, for attaching them to bug reports. I’m currently thinking about how to close that gap, but it needs another kind of help, in the form of available developers.

And one kind of help that costs users very little, but can help our work a lot, is commenting on (and criticising if needed) our design choices, our code, our methods. Constructive criticism, of course: telling us that our methods suck without giving any suggestion on how to make them better is not going to help anybody, and can actually lower our morale quite a bit.

So please, the greatest favour you can do me is to keep listening to my ramblings about QA, tests, Tinderbox and log analysis; and if you are still unsure how something works in this system, feel free to ask.

13 thoughts on “From scratch: about my tinderbox”

  1. Diego, thanks for explaining to us from scratch what a Tinderbox is. A lot of doubts were solved for me. Also, thanks for being one of those who keep Gentoo working. I’ll try to share some money with you guys as soon as I can. Please tell us if there is a way we can do that. Greetings!

  2. As Cortex said, thank you so much for the explanation. It was excellent! You’re doing good things, and I always look forward to reading your entries in Planet Gentoo because they let me know a little bit more about the internal structure of Gentoo, and your work of detecting and reporting bugs!

  3. i think your QA and tinderbox is very important. i know its not an easy job, i am still figuring out arch testing :P i am a user of gentoo prefix and think tinderbox will be very important as gentoo expands.

  4. Are profiles and tinderboxes synonymous terms? Do tinderboxes exist for all profiles? I should like to see a VPS profile that included hardened support. Best case options to install as the host system. I am not however making money in that field and cannot send to support such. A profile with some nice energy usage for the recent AMD64s with virtualization support would potentially have some small effect on Gentoo’s carbon footprint ;) Anyway, Happy Holidays

  5. “…that I’m bothering them every other day about it…” bothering might be the wrong word :D I’ve been reading your posts for quite a while and they are very interesting. I’d love to help, unfortunately though, I don’t have any money to spare. My programming knowledge is limited too, but maybe I’ll be able to help with that, and I’ve got a few CPU cycles to spare. So if you’ve got any idea of how I can help, mail me. And don’t ever think you’re bothering people, you’re doing a great job, and reading your blog entries is anything but boring :D

  6. Just for the record: having said that most of the problem is I/O bound, have you thought about SSDs? As far as I know, if you can fit your hot data on them you can really make a difference with just one of the decent ones (or even a couple in RAID 0, which would really tax the buses of the usual domestic motherboards!). I don’t really know, though, whether 80G (example of the cheapest Intel one, around 200$) would fit the bill (perhaps keeping /usr/portage separate? Although it would help to know *where* the I/O intensive operations are located; I’m guessing /var/log and /var/tmp/portage…). Just in case it helps ;).

  7. @cortex: on the home page of this site there are all the references for donating; please donate if you can, thank you. @user99: no, *profiles* and *tinderboxes* are not synonymous terms, they are two completely different things. @Jisakiel: SSDs are not a viable solution: Diego actually requires free space on the order of terabytes, so SSDs would make the costs of the tinderbox unsustainable (unless you want to donate four 1TB SSDs to help the tinderbox’s I/O performance).

  8. Actually, 80GB would be enough to keep the @/var/tmp@ tree, the problem is how long they’d last in that environment. Most of the work is done there anyway, although the container itself has its own share of I/O to load the files, the includes, the libraries and so on. While an SSD could likely improve the situation, I don’t have any more SATA ports to spare to connect even one (I need disk space not only for the tinderbox but for work stuff and virtual machines, including G/FBSD ones). Just to show how much I/O is a problem, it takes *less* time to build with ccache disabled than enabled: when it’s enabled it has to access the cache directory to find whether the file is cached or not, and if it isn’t, it has to write it down twice. Given that I added ccache because the same package was being built multiple times when it failed (which has since been solved), I don’t get anywhere near enough hits to make it useful. _And on a different note: combining ccache with memcache would *probably* get some results: the results would be stored in memory rather than on disk, making them much faster to access; on the other hand that would mean you end up using much more memory. It’s only good when you’re building multiple branches or multiple configurations of the same package, or testing one package in particular, which is something developers often do._

  9. I suppose I should be less cryptic so you could understand my intent. I understand they are different things…. I was suggesting that perhaps a ‘goal’ should be a tinderbox for each profile

  10. I use to keep /var/tmp/ on a separate partition…in my experience if OpenOffice will compile the partition or disk is big enough …but it can grow HUGE if you do not clean out /work/pkg-name/* so no small drives

  11. The problem is which kind of tinderbox we’re talking about there; if you mean a tinderbox like mine per profile, it’d be nice already if we were able to get two tinderboxes, one for x86 and one for amd64 (arches, not profiles in this case); the number of profiles they’d involve would already be impossible to maintain, but even two of these aren’t that easy to come by. My reasoning for using x86 for the tinderbox is that it hits problems that I would never hit myself (running amd64). If I had another box similar to Yamato (Enterprise-D?), and didn’t have to worry about the power bill, I’d be happy to set up two parallel tinderboxes to run the two arches. Unfortunately, while I can feign ignorance of the expenses related to running this one tinderbox, running two would definitely get me asking myself whether it’s worth it. As I “wrote earlier”:https://blog.flameeyes.eu/2… the tinderbox cannot feasibly be distributed. At the same time the best choice would be to locate multiple tinderboxes in the same physical network: this way you would be able to share the Portage tree, distfiles, and log analysis between them. So if people wanted multiple tinderboxes to make it possible to catch problems much earlier, money has to be added to the consideration: there are costs involved, which include the hardware (and maintenance), the power costs, and the man-hours spent on handling the tinderbox and its software. It’s not a cheap process. I haven’t even considered quantifying it for now, because I’d feel a bit silly. The price of dedicated servers from “OVH”:http://www.ovh.com/ (which is the one I had at hand to check out now) for a system that is almost identical to mine (as far as CPU and memory are concerned; it has smaller drives but that doesn’t matter) is €150 (VAT included) per month. Let’s assume that you only need two of those to run two tinderboxes for x86 and amd64, running Gentoo on the host as well as an LXC guest, with one of the two handling the storage of the tree and distfiles, and the other running the log analysis. In such a configuration, ignoring the man-hours, you get to pay €300/month, which is €3600 a year. I cannot afford that myself. Maybe I should start an annual fundraiser with that sum as the target: if it’s reached in a given year, then I could get (and keep) the two boxes and set up the tinderbox running; if it’s not, then it’s just dropped. But the first step toward this is to get “the log analysis software”:https://blog.flameeyes.eu/2… as it would decouple the logic of checking the logs from the system where the tinderbox is running (which is what happens right now).

  12. Perhaps it would be good in a future post to outline the hardware that would be required for a tinderbox. I’ll try and look back thru your blogs to see. Just before Xmas could have bought a dual XEON Dell PowerEdge for $150 cash…did not have it to spare. Running box with Windows XP on it, they said.

  13. This blog entry helped me a lot to understand the point of your many previous posts on the topic of Tinderbox. Thanks. If there’s any way I can donate CPU time to your project (on my cheap Gentoo server), let me know.
