Why the tinderbox is a non-distributed effort

In my previous post about a run control for the tinderbox, Pavel suggested that I use Gearman to handle the run control section. As I explained to him earlier this evening, this is probably too much right now: the basic run control I need is pretty simple (I could even keep using xargs, if Portage gave me hooks for merge succeeded/failed), and the fancy stuff, the network interface, is sugar that Gearman wouldn’t help me with as far as I can see.

On the other hand, what I want to talk about is why I don’t think the tinderbox should be a distributed effort, as many people suggest from time to time to reduce the load on my machine. Unfortunately, to work well in a distributed fashion, a task has to be one that can feasibly be split up “divide et impera” style.

The whole point of the tinderbox for me is to verify the interactions between packages; it’s meant to find which packages break when they are used together, among other things, and that kind of check needs all the packages to be present at the same time, which precludes the use of a distributed tinderbox.

If you don’t think this is helpful, I can tell you quite a few interesting things about automagic deps, but since I have already written about them from time to time I’ll skip over them for now.

The kind of effort that can work with the distributed approach is the one taken by Patrick with his cleaning-up tinderboxes: after each merge the dependencies get removed, and a library of binary packages is kept up to date to avoid building them multiple times a day. This obviously makes it possible to test multiple package chains at once on multiple systems, but it also adds some further overhead (as multiple boxes will have to rebuild the same binary packages if you don’t share them around).

On the other hand, I think I have found a use for Gearman (an ebuild for which, mostly contributed by Pavel, is in my overlay; we’re still working on polishing it): I already mused some time ago about checking the packages’ sources, looking for at least those things that can be found easily via scripts (like the over-canonicalisation that I have already documented well). This is a task where divide-et-impera is very likely a working strategy. Extracting and analysing the sources is an I/O-bound task, not a CPU-bound one, so Yamato’s approach there is definitely a losing one.
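To give an idea of the kind of scripted check meant here, this is a minimal sketch in Python. It assumes, purely for illustration, that the check boils down to looking for an `AC_CANONICAL_TARGET` call in a package's autoconf input; the real check may well be more involved, and the `scan_tree` name is made up for this example.

```python
import os
import re

# Assumption for illustration only: the "smell" is an AC_CANONICAL_TARGET
# call at the start of a line in an autoconf input file.
PATTERN = re.compile(r"^\s*AC_CANONICAL_TARGET\b", re.MULTILINE)
CONFIGURE_NAMES = {"configure.ac", "configure.in"}


def scan_tree(root):
    """Return sorted paths (relative to root) of autoconf inputs matching PATTERN."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name not in CONFIGURE_NAMES:
                continue
            path = os.path.join(dirpath, name)
            # Only reads the file; nothing gets configured or built.
            with open(path, encoding="utf-8", errors="replace") as handle:
                if PATTERN.search(handle.read()):
                    hits.append(os.path.relpath(path, root))
    return sorted(hits)
```

Each unpacked source tree can be scanned on its own, without knowing anything about the other packages, which is what makes this kind of check easy to split up.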

For a single box to have enough I/O speed to handle so many packages, you end up resorting to very high-end hardware (disks and controllers), which is very expensive. Way too expensive. On the other hand, with multiple boxes, even cheap or virtual ones (distributed among different real boxes, of course), working independently but sharing a single queue with proper coordination, you can probably beat that performance for less than half the price. Now, for this to work there are many prerequisites, a lot of which I’m afraid I won’t be able to tackle anytime soon.
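The "independent workers, one shared queue" idea can be sketched on a single machine with nothing but the Python standard library. This is not Gearman and does not use its API: threads stand in for the separate boxes, and `run_workers` and `analyse` are hypothetical names chosen for the example.

```python
import queue
import threading


def run_workers(jobs, analyse, worker_count=4):
    """Drain `jobs` with `worker_count` workers; return {job: analyse(job)}."""
    pending = queue.Queue()
    for job in jobs:
        pending.put(job)

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                job = pending.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is done
            outcome = analyse(job)  # sketch only: an exception here ends this worker
            with lock:
                results[job] = outcome

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling something like `run_workers(package_dirs, some_check, worker_count=8)` hands each directory to whichever worker is free next, so a slow package only holds up one worker rather than the whole run. In a real setup the queue would live on a job server and the workers on different boxes, which is the part Gearman would be expected to provide.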

First of all, I need to understand well how Gearman works, since I have only skimmed through it up to now. Then I need to find the hardware; if I can turn my garage into a machine room and connect it to my network, that might be a good place to start (I can easily use low-power old-style machines; I still have a few around that I haven’t found the space to set up lately). I remember some users offering chroots on their boxes before; this might turn out pretty useful: if they can provide virtual machines, or containers, those can also work on the analysis, in a totally distributed fashion.

The third problem is the hardest but also the most interesting: finding more analyses to run on the sources without building anything. Thankfully, I have got the book (Secure Programming with Static Analysis) to help me cope with that task.

Wish me luck, and especially wish me the time to work on this.