TG4: Tinderbox Generation 4

Everybody’s a critic: the first comment I received when I showed other Gentoo developers my previous post about the tinderbox was a question on whether I would be using pkgcore for the new generation tinderbox. If you have understood what my blog post was about, you probably understand why I was not happy about such a question.

I thought the blog post made it very clear that my focus right now is not to change the way the tinderbox runs but the way the reporting pipeline works. This is the same problem as in 2009: generating build logs is easy, sifting through them is not. At first I thought this was hard just for me, but the fact that GSoC attracted multiple people interested in continuous builds, yet not one interested in logmining, showed me this is just a hard problem.

The approach I took last time, with what I’ll start calling TG3 (Tinderbox Generation 3), was to highlight the error/warning messages, provide a list of build logs for which a problem was identified (without caring much about which kind of problem), and just show broken builds or broken tests in the interface. This was easy to build, and to a point easy to use, but it had a lot of drawbacks.

The major drawbacks of that UI are that it relies on manual work to identify open bugs for the package (and thus make sure not to report duplicate bugs), and on my own memory not to report the same issue multiple times if the bug was closed by some child as NEEDINFO.

I don’t have my graphics tablet with me to draw a mockup of what I have in mind yet, but I can throw in some of the things I’ve been thinking of:

  • Being able to tell what problem or problems a particular build is about. It’s easy to tell whether a build log is just a build failure or a test failure, but what if instead it has three or four different warning conditions? Being able to tell which ones have been found and having a single-click bug filing system would be a good start.
  • Keep track of the bugs filed against a package. This is important because sometimes a build log is just a repeat of something already filed; it may also be that the package failed multiple times since you started a reporting run, so it’s better to surface that easily.
  • Related, it should collapse failures per package so as not to repeat the same package multiple times on the page (see the sketch after this list). Say you look at the build failures every day or two: you don’t care if the same package failed 20 times, especially if the logs report the same error. Finding out whether the error messages are the same is tricky, but at least you can collapse the multiple logs into a single log per package, so you don’t need to skip it over and over again.
  • Again related, it should keep track of which logs have been read and which haven’t. It’s going to be tricky if the app is made multi-user, but at least a starting point needs to be there.
  • It should show the three most recent bugs open for the package (and a count of how many other open bugs) so that if the bug was filed by someone else, it does not need to be filed again. Bonus points for showing the few most recently reported closed bugs too.
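To make the wishlist above a bit more concrete, here is a very rough sketch, in Python, of the kind of data the new frontend would have to juggle; none of this exists yet, and the names and fields are made up purely for illustration:

from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class BuildLog:
    package: str          # category/package, without version
    version: str
    timestamp: datetime
    issues: list[str] = field(default_factory=list)   # e.g. ["build failure", "automake warning"]
    read: bool = False    # has a human looked at this log yet?
    url: str = ""         # where the raw log lives


@dataclass
class PackageReport:
    package: str
    logs: list[BuildLog] = field(default_factory=list)
    open_bugs: list[int] = field(default_factory=list)    # bugs already filed, to avoid duplicates
    closed_bugs: list[int] = field(default_factory=list)  # recently closed, e.g. as NEEDINFO


def collapse(logs: list[BuildLog]) -> dict[str, PackageReport]:
    """Group logs by package so the same failure does not show up twenty times."""
    reports: dict[str, PackageReport] = {}
    for log in logs:
        report = reports.setdefault(log.package, PackageReport(package=log.package))
        report.logs.append(log)
    return reports

The collapsing and the read flag are exactly the two things the old interface never tracked, which is why every reporting run meant re-reading the same logs.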

You can tell already that this is a considerably more complex interface than the one I used before. I expect it’ll take some work with JavaScript at the very least, so I may end up doing it with AngularJS and Go mostly because that’s what I need to learn at work as well, don’t get me started. At least I don’t expect I’ll be doing it in Polymer but I won’t exclude that just yet.

Why do I spend this much time thinking and talking (and soon writing) about the UI? Because I think this is the current bottleneck to scaling up the analysis of Gentoo’s quality. Running a tinderbox is getting cheaper — there are plenty of dedicated server offers that are considerably cheaper than what I paid for hosting Excelsior, let alone the initial investment in it. And this is without looking again at the possible costs of running them on GCE or AWS on demand.

Three years ago, my choice of a physical server in my hands was easier to justify than now, with 4-core HT servers with 48GB of RAM starting at €40/month — while I/O is still the limiting factor, with that much RAM it’s well possible to have one tinderbox building fully in tmpfs, and just run a separate server for a second instance, rather than sharing multiple instances.

And even if GCE/AWS instances that are charged for time running are not exactly interesting for continuous build systems, having a cloud image that can be instructed to start running a tinderbox with a fixed set of packages, say all the reverse dependencies of libav, would make it possible to run explicit tests for code that is known to be fragile, while not pausing the main tinderbox.

Finally, there are different ideas of how we should be testing packages: all options enabled, all options disabled, multilib or not, hardened or not, one package at a time, all packages together… they can all share the same exact logmining pipeline, as all it needs is the emerge --info output, and the log itself, which can have markers for known issues to look out for or not. And then you can build the packages however you desire, as long as you can submit them there.

Now my idea is not to just build this for myself and run the analysis for all the people who want to submit build logs, because that would be just about as crazy as it sounds. But I think it would be okay to have a shared instance for Gentoo developers to submit build logs from their own personal instances, if they want to, and then have them look at their own accounts only. It’s not going to be my first target, but I’ll keep it in mind when I start my mocks and implementations, because I think it might prove successful.

The tinderbox is dead, long live the tinderbox

I announced it last November and now it has become reality: the Tinderbox is no more, in hardware as well as in software. Excelsior was taken out of the Hurricane Electric facility in Fremont this past Monday, just before I left for SCALE13x.

Originally the box was hosted by my then-employer, but as of last year, to allow more people to have access to its workings, I had it moved to my own rented cabinet, at a figure of $600/month. Not chump change, but it was okay for a while; unfortunately the cost-sharing option that was supposed to happen did not happen, and about a year later those $7200 do not feel like a good choice, and this is without delving into the whole insulting behavior of a fellow developer.

Right now the server is lying on the floor of an office in the Mountain View campus of my (current) employer. The future of the hardware is uncertain right now, but it’s more likely than not going to be donated to the Gentoo Foundation (minus the HDDs, for obvious opsec reasons). I’m likely going to rent a dedicated server of my own for development and testing; even though it would be less powerful than Excelsior, it would be massively cheaper at €40/month.

The question becomes what we want to do with the idea of a tinderbox — it seemed like, after I announced the demise, people would get together to fix it once and for all, but four months later there is nothing to show for it. After speaking with other developers at SCaLE, and realizing I’m probably the only one with enough domain knowledge of the problems I tackled, I decided that at this point it’s time for me to stop running a tinderbox and instead design one.

I’m going to write a few more blog posts to get into the nitty-gritty details of what I plan on doing, but I would like to provide at least a high-level idea of what I’m going to change drastically in the next iteration.

The first difference will be the target execution environment. When I wrote the tinderbox analysis scripts I designed them to run in a mostly sealed system. Because the tinderbox was running in someone else’s cabinet, within its management network, I decided I would not provide any direct access to either the tinderbox container or the app that would mangle that data. This is why the storage for both the metadata and the logs was on Amazon: pushing the data out was easy and did not require me to give anyone else access to the system.

In the new design this will not be important — not only because it’ll be designed to push the data directly into Bugzilla, but more importantly because I’m not going to run a tinderbox in such an environment. Well, admittedly I’m just not going to run a tinderbox ever again, and will just build the code to do so, but the whole point is that I won’t keep that restriction on to begin with.

And since the data store is now only temporary, I don’t think it’s worth over-optimizing for performance. While I originally considered and dropped the option of storing the logs in PostgreSQL for performance reasons, now this is unlikely to be a problem. Even if the queries took seconds, it’s not like that would be a deal breaker for an app with a single user. Even more importantly, the time taken to create the bug on the Bugzilla side is likely going to overshadow any database inefficiency.

The part that I’ve still got some doubts about is how to push the data from the tinderbox instance to the collector (which may or may not be the webapp that opens the bugs too). Right now the tinderbox does some analysis through bashrc, leaving warnings in the log — the log is then sent to the collector through tar and netcat (chewing gum and saliva, yes, really) to maintain one single piece of metadata: the filename.

I would like to be able to collect some metadata on the tinderbox side (namely, emerge --info, which before was cached manually) and send it down to the collector. But adding this much logic is tricky, as the tinderbox should still operate with most of the operating system busted. My original napkin plan involved having the agent written in Go, using Apache Thrift to communicate to the main app, probably written in Django or similar.
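Whatever language the agent ends up being written in, the work it has to do is tiny. Here is a rough sketch of the gathering step, in Python purely to keep the examples on this page readable; the real thing would more likely be Go, for the reasons below:

import subprocess
from pathlib import Path


def gather(log_path: str) -> dict:
    """Collect the only data the collector needs: emerge --info, the log, and the filename."""
    # The real agent has to keep working with most of the system busted,
    # which is the whole argument for a statically linked binary.
    info = subprocess.run(
        ["emerge", "--info"], capture_output=True, text=True, check=False
    ).stdout
    return {
        "filename": Path(log_path).name,   # the one piece of metadata tar+netcat preserves today
        "emerge_info": info,
        "log": Path(log_path).read_text(errors="replace"),
    }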

The reason why I’m saying that Go would be a good fit is because of one piece of its design I do not like (in the general use case) at all: the static compilation. A Go binary will not break during a system upgrade, because it does not depend on any shared runtime. That is in my opinion a bad idea for a piece of desktop or server software, but it’s a godsend in this particular environment.

But the reason I was considering Thrift was that I didn’t want to look into XML-RPC or JSON-RPC. Then again, Bugzilla supports only those two, and my main concern (the size of the log files) would still be a problem when attaching them to Bugzilla just as much. Since Thrift would require me to package it for Gentoo (it seems nobody has done so yet), while JSON-RPC is already supported in Go, I think it might be a better idea to stick with JSON. Unfortunately Go does not support UTF-7, which would have made escaping binary data much easier.

Now what remains a problem is filing the bug and attaching the log to Bugzilla. If I were to write that part of the app in Python, it would be just a matter of using the pybugz libraries to handle it. But with JSON-RPC it should be fairly easy to implement support for it from scratch (unlike XML-RPC), so maybe it’s worth just doing the whole thing in Go, reducing the proliferation of languages in use for such a project.
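Just to show how little there is to it, here is a rough, untested sketch of a Bug.create call over Bugzilla’s JSON-RPC endpoint, using nothing but the Python standard library; the endpoint, method and field names are from my reading of the Bugzilla WebService documentation, so treat them as assumptions to verify:

import json
import urllib.request


def file_bug(bugzilla_url: str, credentials: dict, summary: str, description: str) -> int:
    """Sketch of filing a bug over JSON-RPC; untested, parameters to be double-checked."""
    params = {
        "product": "Gentoo Linux",        # assumed values for bugs.gentoo.org
        "component": "Current packages",
        "version": "unspecified",
        "summary": summary,
        "description": description,
    }
    params.update(credentials)            # e.g. Bugzilla_login / Bugzilla_password or a token
    request = urllib.request.Request(
        bugzilla_url + "/jsonrpc.cgi",
        data=json.dumps({"method": "Bug.create", "params": [params], "id": 1}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    if result.get("error"):
        raise RuntimeError(result["error"])
    return result["result"]["id"]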

Python will remain in use for the tinderbox runner. Actually if anything I would like to remove the bash wrapper I’ve written and do the generation and selection of which packages to build in Python. It would also be nice if it could handle the USE mangling by itself, but that’s difficult due to the sad conflicting requirements of the tree.

But this is enough details for the moment; I’ll go back to thinking the implementation through and add more details about that as I get to them.

The end of an era, the end of the tinderbox

I’m partly sad, but for the most part this is a weight that goes away from my shoulders, so I can’t say I’m not at least in part joyful about it, even though the context in which this is happening is not exactly what I expected.

I turned off the Gentoo tinderbox, never to come back. The S3 storage of logs is still running, but I’ve asked Ian to see if he can attach everything at his pace, so I can turn off the account and be done with it.

Why did this happen? Well, it’s a long story. I had already stopped running it for a few months because I got tired of Mike behaving like a child, as I already reported in 2012, closing my bugs because the logs are linked (from S3) rather than attached. I have already made my position clear that it’s a silly distinction, as the logs will not disappear into nowhere (indeed I’ll keep the S3 bucket for them running until they are all attached to Bugzilla), but as he keeps insisting that it’s “trivial” to change the behaviour of the whole pipeline, I decided to give up.

Yes, it’s only one developer, and yes, lots of other developers took my side (thanks guys!), but it’s still aggravating to have somebody who can do whatever he likes without reporting to anybody, ignoring Council resolutions, QA (when I was the lead) and essentially using Gentoo as his personal playground. And the fact that only two people (Michał and Julian) have been pushing for a proper resolution is a bit disappointing.

I know it might feel like I’m taking my toys and going home — well, that’s what I’m doing. The tinderbox has been a drain on my time (a little) and my money (quite a bit more), but those I was willing to part with — having my motivation drained by assholes in the project was not in the plans.

In the past six years that I’ve been working on this particular project, things evolved:

  • Originally, it was a simple chroot with a looping emerge, inspected with grep and Emacs, running on my desktop and intended to catch --as-needed failures. It went through lots of disks, and got me off XFS for good due to kernel panics.
  • It was moved to LXC, which is why the package entered the Gentoo tree, together with the OpenRC support and the first few crude hacks.
  • When I started spending time in Los Angeles for a customer, Yamato under my desk got replaced with Excelsior, which was crowdfunded and hosted, for two years straight, by my customer at the time.
  • This is where the rewrite happened, from attaching logs (which I could earlier do with more or less ease, thanks to NFS) to storing them away and linking them instead. This had to do mostly with the ability to remote-manage the tinderbox.
  • This year, since I no longer work for the company in Los Angeles, and instead work in Dublin for a completely different company, I decided Excelsior was better off in a personal space, and rented a full 42-unit cabinet with Hurricane Electric in Fremont, where the server is still running as I type this.

You can see that it’s not that I’m trying to avoid spending time to engineer solutions. It’s just that I feel that what Mike is asking is unreasonable, and the way he’s asking it makes it unbearable. Especially when he feigns concern about my expenses — as I noted in the previously linked post, S3 is dirt cheap, and indeed it now comes down to $1/month given to Amazon for the log storage and access, compared to $600/month to rent the cabinet at Hurricane.

Yes, it’s true that the server is not doing only tinderboxing – it also is running some fate instances, and I have been using it as a development server for my own projects, mostly open-source ones – but that’s the original use for it, and if it wasn’t for it I wouldn’t be paying so much to rent a cabinet, I’d be renting a single dedicated server off, say, Hetzner.

So here we go, the end of the era of my tinderbox. Patrick and Michael are still continuing their efforts, so it’s not like Gentoo is left without integration testing, but I’m afraid it’ll be harder for at least some of the maintainers who leveraged the tinderbox heavily in the past. My contract with Hurricane expires in April; at that point I’ll get the hardware out of the cabinet, and will decide what to do with it — it’s possible I’ll donate the server (minus hard drives) to the Gentoo Foundation or someone else who can use it.

My involvement in Gentoo might also suffer from this; I hopefully will be dropping one of the servers I maintain off the net pretty soon, which will be one less system to build packages for, but I still have a few to take care of. For the moment I’m taking a break: I’ll soon send an email that it’s open season on my packages; I locked my bugzilla account already to avoid providing harsher responses in the bug linked at the top of this post.

GSoC Proposal: a better log collector and analyzer

In my previous post I didn’t add much about Gentoo ideas for GSoC and I didn’t really volunteer on anything. Well, this changes now as I have a suggestion and I’m even ready to propose myself as a mentor — with the caveat that as every other year (literally) my time schedule is not clear, so I will need a backup co-mentor.

You might or might not remember, but for my tinderbox I’m using a funky log collection and analysis toolset. Said toolset, although completely lacking chewing gum, is not exactly an example of good engineering: it’s really just a bunch of hacks that happen to work. On the other hand, since Gentoo tinderboxing has never been part of my job, I haven’t had the motivation to rewrite it into something decent.

While the (often proposed) “continuous integration solution” for Gentoo is a task that is in my opinion unsuitable for a GSoC student – which should be attested by the fact that we don’t have one yet! – I think that at least the collection-and-analysis part should be feasible. I’ll try to list and explain here the current design and how I’d like it to work, so if somebody feels like working on this, they can already look into what there is to do.

Yes, I know that Zorry and others involved in Gentoo Hardened have been working on something along these lines for a while — I still haven’t seen results, so I’m going to ignore it altogether.

Right now what happens is that we have four actors (in the sense of computers/systems) involved: the tinderbox itself, a collector, a storage system, and finally a frontend.

Between the tinderbox and the collector, the only protocol is tar-over-tcp, thanks to, well, tar and netcat on one side, and Ruby on the other. Indeed, my tinderbox script sends (encapsulated in tar) every completed log to the collector, which then extracts the log, and parses it.

The collector does most of the heavy lifting right now: it gets the package name and the timestamp of the log from the name of the file in the tar archive, then the maintainers (to assign the bug to) from the log itself. It scans for particular lines (Portage and custom warnings, among others), and creates a record with all of that together, to be sent to Amazon’s SimpleDB for querying. The log file itself is converted to HTML, split line by line so that it can be viewed in a browser without downloading, and saved, once again to Amazon, on S3.
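There is no magic in any of this; rewritten as a Python sketch (the filename layout, the maintainer marker and the warning patterns are simplified stand-ins, not the exact ones the Ruby code uses), the record creation looks more or less like this:

import re
from datetime import datetime

# Simplified stand-ins for the real patterns: Portage's own warnings plus
# the custom ones the tinderbox's bashrc leaves in the log.
WARNING_PATTERNS = [
    re.compile(r"\* QA Notice:"),
    re.compile(r"^configure: WARNING:"),
]


def parse_log(filename: str, text: str) -> dict:
    """Build the record that is stored for querying (SimpleDB in the current setup)."""
    # Hypothetical filename layout: "dev-libs/foo-1.2.3:20121103-122939.log"
    atom, _, stamp = filename[:-len(".log")].partition(":")
    maintainers = []
    warnings = 0
    build_failed = test_failed = False
    for line in text.splitlines():
        if line.startswith("MAINTAINER:"):    # hypothetical marker left in the log
            maintainers.append(line.split(":", 1)[1].strip())
        if any(pattern.search(line) for pattern in WARNING_PATTERNS):
            warnings += 1
        if "ERROR: " in line and "failed (compile phase)" in line:
            build_failed = True
        if "ERROR: " in line and "failed (test phase)" in line:
            test_failed = True
    return {
        "package": atom,
        "timestamp": datetime.strptime(stamp, "%Y%m%d-%H%M%S"),
        "maintainers": maintainers,
        "warnings": warnings,
        "build_failed": build_failed,
        "test_failed": test_failed,
    }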

The frontend fetches from SimpleDB the records where at least one warning is to be found, and displays them in a table, with a link to see the log and one to open a bug (thanks to pre-filled templates). It’s implemented in Sinatra, and it’s definitely simplistic.

Now there are quite a number of weak spots in this whole setup. The most obvious is the reliance on Amazon. It’s not just an ethical or theoretical question: the fact that S3 makes you pay per access is the reason why the list of logs is not visible to the public right now (the costs would add up quickly).

What I would like from a GSoC project would be a replacement for the whole thing, in which a single entity covers the collector, storage and frontend, without relying on Amazon at all. Without going all out on features that are difficult to manage, you need to find a way to store big binary data (the logs themselves can easily reach many gigabytes in size), and then have a standard database with the list of entries like I have now. Technically, it would be possible to keep using S3 for the logs, but I’m not sure how much of a good idea that would be at this point.
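To be clear about what I mean by a “standard database with the list of entries”, here is a minimal sketch, using SQLite purely as an example; the fields just mirror the record described above, and the logs themselves stay out of the database, on plain storage:

import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS build_logs (
    id           INTEGER PRIMARY KEY,
    package      TEXT NOT NULL,   -- category/package-version
    timestamp    TEXT NOT NULL,   -- ISO 8601
    maintainers  TEXT NOT NULL,   -- comma-separated is plenty for a sketch
    warnings     INTEGER NOT NULL,
    build_failed INTEGER NOT NULL,
    test_failed  INTEGER NOT NULL,
    log_path     TEXT NOT NULL    -- the log itself lives on the filesystem, not in SQL
);
"""


def store(db: sqlite3.Connection, record: dict, log_path: str) -> None:
    """Insert one record; `record` has the same shape as the one sketched earlier."""
    db.execute(SCHEMA)
    db.execute(
        "INSERT INTO build_logs (package, timestamp, maintainers, warnings,"
        " build_failed, test_failed, log_path) VALUES (?, ?, ?, ?, ?, ?, ?)",
        (
            record["package"],
            record["timestamp"].isoformat(),
            ",".join(record["maintainers"]),
            record["warnings"],
            int(record["build_failed"]),
            int(record["test_failed"]),
            log_path,
        ),
    )
    db.commit()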

*Of course, one of the reasons why the collector and the frontend are split in my current setup, is that I thought that network connectivity between the tinderbox and the Internet couldn’t be entirely guaranteed; on the other hand, while the connection between the tinderbox and the collector is guaranteed (they are on the same physical host), the collector might not have an Internet connection to push to S3/SimpleDB, so…*

On the frontend side, a new frontend should have better integration with Bugzilla: for instance it would be nice if I could just open the log and have a top frame (or div, I don’t care how it’s implemented) showing me a quick list of bugs open for the package, so I no longer have to search Bugzilla to see whether the problem has been reported already or not. It would also be nice to be able to attach the log to the newly opened bugs, but that’s something I care about only relatively, if nothing else because some logs are so big that to attach them you’d have to compress them with xz, and even then they might not work.

But possibly the most important part of this would be to change the way the tinderbox connects to the collector. Instead of keeping tar and netcat, I would like Portage to implement a “client” by itself. This would make it easier to deploy the collector for uses different from tinderboxing (such as an organization-wide deployment), and at the same time would allow (time permitting) expanding what is actually sent. For instance, right now I have to gather and append the autoconf, automake and similar error logs to the main build log, to make sure that we can read them when we check the build log on the bug… if Portage were able to submit the logs by itself, it would submit the failure logs as well as config.log and possibly other output files.

Final note: no you can’t store the logs in an SQL database. But if you’re going to take part in GSoC and want to help with the tinderbox, this would be what I (as the tinderbox runner) need the most, so … think about it! I can provide a few more insights and the why of them (for instance why I think this really got to be written in Python, and why it has to support IPv6), if there is the interest.

New tinderbox tasks!

So from time to time I like to write down what my tinderbox is doing and this sounds like the perfect time to do so.

First of all, GCC 4.7 is getting unmasked soonish — the number of new bugs in the tracker is basically zero (I opened one today for a newly bumped package), and since at least a few newly fixed bugs land every other week, it should soon be feasible to just use GCC 4.7.

But the important part is the new tasks; both the ~amd64 and the stable/hardened amd64 tinderboxes are currently running, each aimed at the reverse dependencies of a different package.

The former is running against the new, multilib-capable media-libs/freetype package, which Michał introduced recently — this version is able to build the 32-bit libraries together with the 64-bit ones, making the emul-linux binary package unnecessary for 32-bit binary software that requires it. Finally being able to configure which 32-bit packages to install is going to be a win, especially for systems with reduced storage space. Unfortunately, FreeType has a design that I would call naïve, so as not to fall into foul language, and its headers are actually architecture-dependent; this means that the 64-bit and the 32-bit builds have different header files. Can you tell how bad that is?

While the proper solution would be to fix the design so that it does not have different headers at all, what we can do in the meantime is install both sets in different paths, so that they do not collide with each other and software does not build against the wrong version. Unfortunately, since this means moving away even the native ABI headers, you need to tell every single package where to find them. This should be solved by using pkg-config, but unfortunately a number of packages out there don’t do that, especially non-autotools based ones — CMake packages seem to use pkg-config … but then they expect to guess the position of the headers instead of actually accepting what pkg-config says.

So I’m now building the reverse dependencies (at least those that the tinderbox can build: some of the games are interactive because they require a CD to be mounted, and sometimes the dependency chain breaks too soon), and I’m going to report the failures in the tracker bug so that they can be tackled.

The second task is a bit more important from a certain point of view (even though it’s easier not to trigger): sys-libs/ncurses recently gained a tinfo USE flag that makes it build a separate library for some of the symbols that were previously collapsed into libncurses alone. Unfortunately this means that now those symbols are defined in libtinfo and nobody (or very nearly nobody) is looking for them there. The solution, for this as well, is to rely on what pkg-config tells you, but most packages do not use it for ncurses, mostly for compatibility with very old ncurses versions as far as I can tell.

It wouldn’t be half bad if the same guy who ignores my bug reports, and who unmasks glibc versions that are known to break half the tree, hadn’t decided to introduce said USE flag into stable already, knowing full well that most of the tree hasn’t been tested in that particular configuration. And here’s where my tinderbox enters the scene, as it’s currently checking the stable tree to make sure I can report all packages breaking when that USE flag is enabled.

So the builds are running and the bugs are flowing — please just remember that the first primary resource that the tinderbox consumes is my time.

Restarting a tinderbox

So after my post about glibc 2.17 we got the ebuild in the tree, and I’m now re-calibrating the ~amd64 tinderbox to use it. This sounds like an easy task, but it really isn’t. The main problem is that with the new C library you want to make sure to start afresh: no pre-compiled dependencies should be left in, or problems won’t be found: you want the coverage to be as high as possible, and that takes some work.

So how do you re-calibrate the tinderbox? First off you stop the build, and then you have to clean it up. The cleanup is sometimes as easy as emerge --depclean — but in some cases, like this time, the Ruby packages’ dependencies cause a bit of a stir, so I had to remove them altogether with qlist -I dev-ruby virtual/ruby dev-lang/ruby | xargs emerge -C, after which the depclean command actually starts working.

Of course it’s not a two-minute command like on any other system, especially when going through the “Checking for lib consumers” step — the tinderbox has 181G of data in its partition (a good deal of which is old logs that I should actually delete at this point — and no, that won’t delete the logs in the reported bugs, as those are stored on S3!), without counting the distfiles (which are shared with its host).

In this situation, if there were automagic dependencies on system/world packages, it would actually bail out and I’d have to go manually clean them up. Luckily for me, there’s no problem today, but I have had this kind of problem before. This is actually one of the reasons why I want to keep the world set in the tinderbox as small as possible — right now it consists basically of: portage-utils, gentoolkit (for revdep-rebuild), java-dep-check, Python 2.7 (it’s an old thing, it might be droppable now, not sure), and netcat6 for sending the logs back to the analysis script. I would have liked to remove netcat6 from the list but last time the busybox nc implementation didn’t work as expected with IPv6.

The unmerge step should be straightforward, but unfortunately it seems to cause more grief than expected, in many cases. What happens is that Portage has special handling for symlinked directories — and after we migrated to /run instead of /var/run, all the packages that have not been migrated away from using keepdir on it, ebuild-side, spend much more time at the unmerge stage while Portage makes sure nothing gets broken. This is why we have a tracker bug, and why I’ve been reporting ebuilds creating the directory, rather than just packages that do not re-create it in the init script. Also, this is when I’m thankful I decided to get rid of XFS, as file deletion there was just way too slow.

Even though Portage takes care of verifying the link-time dependencies, I’ve noticed that sometimes things are broken nonetheless, so depending on what one’s target is, it might be a good idea to just run revdep-rebuild to make sure that the system is consistent. In this case I’m not going to waste the time, as I’ll be rebuilding the whole system in the next step, after glibc gets updated. This way we’re sure that we’re running with a stable base. If packages are broken at this level, we’re in quite the pinch, but it’s not a huge deal.

Even though I’m keeping my world file to the minimum, the world and system sets are quite huge when you add up all the dependencies. The main reason is that the tinderbox enables lots and lots of flags – as I want to test most code – so things like GTK are brought in (by GCC, no less), and the cascade effect can be quite nasty. The system rebuild can easily take a day or two. Thankfully, the design of the tinderbox scripts is such that the logs are sent through the bashrc file, and not through the tinderbox harness itself, which means that even if I get failures at this stage, I’ll get a log for them in the usual place.

After this is completed, it’s finally possible to resume the tinderbox building, and hopefully then some things will work more as intended — for instance I might be able to get PHP to work again… and I’ll probably change the tinderbox harness to retry building things without USE=doc if they fail, as too many packages right now fail with it enabled or, as Michael Mol pointed out, because there are circular dependencies.

So expect me working on the tinderbox for the next couple of days, and then start reporting bugs against glibc-2.17, the tracker for which I opened already, even though it’s empty at the time of writing.

GLIBC 2.17: what’s going to be a trouble?

So LWN reports just today on the release of GLIBC 2.17, which solves a security issue and looks like it was released mostly to support the new AArch64 architecture – i.e. arm64 – but the last entry in the reported news is possibly going to be a major headache, and I’d better post about it already so that we have a reference for it.

I’m referring to this:

The `clock_*' suite of functions (declared in <time.h>) is now available directly in the main C library. Previously it was necessary to link with -lrt to use these functions. This change has the effect that a single-threaded program that uses a function such as `clock_gettime' (and is not linked with -lrt) will no longer implicitly load the pthreads library at runtime and so will not suffer the overheads associated with multi-thread support in other code such as the C++ runtime library.

This is in my opinion the most important change, not only because, as it’s pointed out, C++ software would get quite an improvement from not linking to the pthreads library, but also because it’s the only change listed there that I can already foresee trouble with. And why is that? Well, that’s easy. Most of the software out there will do something along these lines to see what library to link to when using clock_gettime (the -lrt option was not always a good idea because it does not exist on most other operating systems out there, including FreeBSD and Mac OS X).

AC_SEARCH_LIBS([clock_gettime], [rt])

This is good, because it’ll try first without any library at all (“none required”) and then with librt, which means that it’ll work on old GLIBC systems, new GLIBC systems, FreeBSD, and OS X — there is something else on Solaris if I’m not mistaken, which could be added up there, but I honestly forgot its name. Unfortunately, this can easily end up causing more trouble when software is underlinked.

With the old GLIBC, it was possible to link software with just librt and have it use the threading functions. Once librt is dropped automatically by configure, the threading library will no longer be brought in by it, and that might break quite a few packages. Of course, most of these would already have been failing with gold, but as you might remember, I wasn’t able to get through the whole tree with it, and I haven’t set up a tinderbox for it again yet (I should, but it’s trouble enough with two!).

What about --as-needed in this picture? A strict implementation would fail on the underlinking, where pthreads should have been linked explicitly, but would also make sure not to link librt when it’s not needed, which would make it possible to improve the performance of the code (by skipping over pthreads) even when the configure scripts are not written properly (for instance if they are using AC_CHECK_LIB instead of AC_SEARCH_LIBS). But since it’s not the linkage of librt that causes the performance issue, but rather the one for pthreads, it actually works out quite well, even if some packages might keep an extra, unused link to librt.

There is a final note that I need to write about, and it honestly worries me quite a bit more than all those above. The librt library has not been dropped — only the clock functions have been moved over to the main C library; librt keeps the asynchronous and list-based I/O operation interfaces (AIO and LIO), the POSIX message queue interfaces, the shared memory interfaces, and the timer interfaces. This means that if you’re relying on a clock_gettime test to bring in librt, you’ll end up with a failing package. Luckily for me, I’ve already avoided that situation on feng (which uses the message queue interface), but as I said, I foresee trouble at least for some packages.

Well, I guess I’ll just have to wait for the ebuild for 2.17 to be in the tree, and run a new tinderbox from scratch… we’ll see what gets us there!

Tinderbox and expenses

I’ve promised some insight into how much running the tinderbox actually cost me. And since today marks two months since Google AdSense’s crazy blacklisting of my website, I guess it’s as good a time as any.

So let’s start with the obvious first expense: the hardware itself. My original tinderbox was running on the box I called Yamato, which cost me some €1700 and change, without the hard drives, back in 2008 — and about half the cost was paid with donations from users. Over time, Yamato had to have its disks replaced a couple of times (and sometimes the cost came out of donations). That computer has been used for other purposes, including as my primary desktop for a long time, so I can’t really complain about the parts that I had to pay for myself. Other devices, connectivity, and all those things ended up being shared between my tinderbox efforts and my freelancing job, so I also don’t complain about those in the least.

The new Tinderbox host is Excelsior, which was bought with the Pledgie campaign, which left me paying only some $1200 out of my own pocket, the rest coming in from the contributors. The space, power and bandwidth have been offered by my employer, which solved quite a few problems. Since I now don’t have to pay for the power, and the last time I went back to Italy (in June) I turned off, and got rid of, most of my hardware (the router was already having some trouble; Yamato’s motherboard was having trouble anyway, so I saved the hard drives to decide what to do with them, and sold the NAS to a friend of mine), I can now answer the question of how much I was spending on the power bill.

My usual power bill was somewhere around €270 — which obviously includes all the usual house power consumption as well as my hardware and, due to the way power is billed in Italy, an advance on the next bill. The bill for the months between July and September, the first one where I was fully out of my house, was for -€67, and no, it’s not a typo, it was a negative bill! Calculator at hand, the actual difference between the previous bills and the new one is around €50/month — assuming that only a third of that was due to the tinderbox hardware, that makes it around €17 per month spent on the power bill. It’s not much, but it adds up. Connectivity — that’s hard to assess, so I’d rather not even go there.

With the current setup, there is of course one expense that wasn’t there before: AWS. The logs that the tinderbox generates are stored on S3, since they need to be accessible, and there are lots of them. And one of the reasons why Mike is behaving like a child about me just linking the build logs instead of attaching them is that he expects me to delete them because they are too expensive to keep indefinitely. So, how much does the S3 storage cost me? Right now, it costs me a whopping $0.90 a month. Yes, you got it right: it costs me less than one dollar a month for all that storage. I guess the reason is that they are not stored for high reliability or high-speed access, and they are highly compressible (even though they are not compressed by default).

You can probably guess at this point that I’m not going to clear out the logs from AWS for a very long time. Although I would like for some logs not to be so big for nothing — like the sdlmame one, which used to pass the -v switch to GCC, which causes every compiler call to print a long dump of internal data that is rarely useful in a default build log.

Luckily for me (and for the users relying on the tinderbox output!) those expenses are well covered by the Flattr revenue from my blog’s posts — and thanks to Socialvest I no longer have to have doubts on whether I should keep the money or use it to flattr others: I currently have over €100 ready for the next six/seven months’ worth of flattrs. Before this, between my freelancing jobs, Flattr, and the ads on the blog, I was also able to cover at least the cost of the server (and barely the cost of the domains — but that’s partly my fault for having… a number of them).

Unfortunately, as I said at the top of the post, there are no longer any ads served by Google on my blog. Why? Well, a month and a half ago I received a complaint from Google, saying that one post of mine, in which I namechecked a famous adult website in the context of an (at the time) recent perceived security issue, was adult material, and that it goes against AdSense policies to have ads served on a website with adult content. I would still argue that just namechecking a website shouldn’t be considered adult content, but while I did submit an appeal to Google, a month and a half later I have no response at hand. They didn’t blacklist the whole domain though, only my blog, so the ads are still shown on Autotools Mythbuster (which I plan to resume working on almost full time pretty soon), but the result is bleak: I went down from €12-€16 a month to a low €2 a month, and that is no longer able to cover the server expense by itself.

This does not mean that anything will change in the future, immediate or not. This blog for me has more value than the money that I can get back from it, as it’s a way for me to showcase my ability and, to a point, get employment — but you can understand that it still upsets me a liiiittle bit the way they handled that particular issue.

Tinderbox and manual intervention

So after my descriptive post you might be wondering what’s so complex or time-consuming about running a tinderbox. That’s because I haven’t spoken about the actual manual labor that goes into handling the tinderbox.

The major work is of course scouring the logs to make sure that I file only valid bugs (and often enough that’s not enough, as things hide beneath the surface), but there are quite a number of tasks that are not related to the bug filing, at least not directly.

First of all, there is the matter of making sure that the packages are available for installation. This used to be more complex, but luckily, thanks to REQUIRED_USE and USE deps, this task is slightly easier than before. The tinderbox.py script (which generates the list of visible packages that need to be tested) also generates a list of USE conflicts, dependencies, and so on. I have to look at this list manually, and then update the package.use file so that they are satisfied. If a package’s dependencies or REQUIRED_USE are not satisfied, the package is not visible, which means it won’t be tested.

This sounds extremely easy, but there are quite a few situations, which I discussed previously, where there is no real way to satisfy the requirements of all the packages in the tree. In particular there are situations where you can’t enable the same USE flag all over the tree — for instance if you enable icu for libxml2, you can’t enable it for qt-webkit (well, you can, but then you have to disable gstreamer, which is required by other packages). Handling all the conflicting requirements takes a bit of trial and error.
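For what it’s worth, the “fix” for each of those conflicts is nothing more than a line in package.use; the actual workflow is me and a text editor, but a hypothetical helper showing the shape of what gets written would be along these lines (the atoms and flags are just the example above, from memory):

# Hypothetical: the decisions I end up taking for conflicts like the icu example above.
DECISIONS = {
    "dev-libs/libxml2": ["icu"],
    "x11-libs/qt-webkit": ["-icu"],   # or keep icu and drop gstreamer, which breaks other packages
}


def package_use_lines(decisions: dict[str, list[str]]) -> list[str]:
    """Turn per-package USE decisions into package.use lines."""
    return [f"{atom} {' '.join(flags)}" for atom, flags in sorted(decisions.items())]


for line in package_use_lines(DECISIONS):
    print(line)   # these lines end up in /etc/portage/package.use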

Then there is a much worse problem and that is with tests that can get stuck, so that things like this happen:

localhost ~ # qlop -c
 * dev-python/mpi4py-1.3
     started: Sat Nov  3 12:29:39 2012
     elapsed: 9 hours, 11 minutes, 12 seconds

And I’ve got to keep growing the list of packages whose tests are unreliable — I wonder if the maintainers ever try running their tests, sometimes.

This task used to be easier because the tinderbox supports sending out tweets or dents through bti, so that it would tell me what it was doing — unfortunately identi.ca kept marking the tinderbox’s account as spam, and while they did unlock it three times, it meant I had to ask support to do so every other week. I grew tired of that and stopped caring about it. Unfortunately that means I have to connect to the instance(s) from time to time to make sure they are still crunching.
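Since I have to log in and check by hand anyway, even a dumb watchdog that parses the qlop -c output shown above and nags about anything that has been going for too long would already help; a rough sketch, with an arbitrary threshold and tied to the exact output format above:

import re
import subprocess

THRESHOLD_HOURS = 4   # arbitrary: most packages build and test well under this


def stuck_builds() -> list[str]:
    """Parse `qlop -c` output like the one above and return suspiciously long-running merges."""
    output = subprocess.run(["qlop", "-c"], capture_output=True, text=True).stdout
    stuck, current = [], None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("* "):
            current = line[2:]            # the package atom being merged
            continue
        match = re.match(r"elapsed: (\d+) hours?,", line)
        if current and match and int(match.group(1)) >= THRESHOLD_HOURS:
            stuck.append(current)
    return stuck


if __name__ == "__main__":
    for atom in stuck_builds():
        print(f"possibly stuck: {atom}")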

How to run a tinderbox with my scripts

Hello there everybody, today’s episode is dedicated to setting up a tinderbox instance like mine, which builds and installs every visible package in the tree, runs their tests, and so on.

So the first step is to have a system to run the tinderbox on. A virtual system is much preferred, since the tinderbox can easily install very insecure code, although nothing prevents you from running it straight on the metal. My choice for this, after Tiziano pointed me in that direction, was to get LXC to handle it, as a chroot on steroids (the original implementation used a plain chroot and was much less reliable).

Now, there are a number of degrees you could be running the tinderbox at; most of the basics are designed to work with almost every package in the system broken — there are only a few packages needed for the system to work. Here’s my world file on the two tinderboxes:

app-misc/screen
app-portage/gentoolkit
app-portage/portage-utils
dev-java/java-dep-check
dev-lang/python:2.7
net-analyzer/netcat6
net-misc/curl

But let’s do things in order. What do I do when I run the tinderbox? I connect over SSH on IPv6 – the tinderbox has very limited Internet connectivity, as everything is proxied by a Squid instance, like I described in this two-year-old post – directly as root, unfortunately (but only with key auth). Then I either start or reconnect to a screen instance, which is where the tinderbox is running (or will be running).

The tinderbox’s scripts are on git and are written partially by me and partially by Zac (following my harassment for the most part, and he’s done a terrific job). The key script is tinderbox-continuous.sh, which simply keeps executing the tinderbox on 200 packages at a time, either ad infinitum or going through a file given as a parameter (this way there is an emerge --sync from time to time, so the tree doesn’t get stale). There is also a fetch-reverse-deps.sh which is used to, as the name says, fetch the reverse dependencies of a given package; it pairs with the continuous script above when I do a targeted run.

On the configuration side, /etc/portage/make.conf has to refer to /root/flameeyes-tinderbox/tinderbox.make.conf, which comes from the repository and sets up features, verbosity levels, and the fetch/resume commands to use curl. These are also set up so that if there is a TINDERBOX_PROXY environment variable set, they’ll go through it. Setting TINDERBOX_PROXY and a couple more variables is done in /etc/portage/make.tinderbox.private.conf; you can use it for setting GENTOO_MIRRORS to something that is easily and quickly reachable, as there’s a lot to download!

But what does this get us? Just a bunch of files in /var/log/portage/build. How do I analyze them? Originally I did this by using grep within Emacs and looking at them file by file. Since I was opening the bugs with Firefox running on the same system, I could very easily attach the logs. This is no longer possible, which is why I wrote a log collector, also available, that is designed in two components: a script that receives the log (over IPv6 only, and within the virtual network of the host) as it is sent with netcat and tar, removes colour escape sequences, and writes it out as an HTML file (in a way that Chrome does not explode on) on Amazon’s S3, also counting how many of the observed warnings are found and whether the build, or the tests, failed — this data is saved to SimpleDB.
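The core of that script is far less impressive than the description makes it sound; stripped of the S3 and SimpleDB plumbing, it boils down to something like this (a simplified Python rendition of what the Ruby code does, with made-up marker patterns):

import html
import re

ANSI_ESCAPES = re.compile(r"\x1b\[[0-9;]*m")   # the colour codes Portage writes into the logs
WARNING_MARKERS = re.compile(r"QA Notice|configure: WARNING")   # simplified set of markers


def process_log(raw: bytes) -> tuple[str, int, bool]:
    """Return the HTML rendition of a log, the warning count, and a failure flag."""
    text = ANSI_ESCAPES.sub("", raw.decode("utf-8", errors="replace"))
    warnings, failed, rows = 0, False, []
    for number, line in enumerate(text.splitlines(), start=1):
        if WARNING_MARKERS.search(line):
            warnings += 1
        if "ERROR: " in line and " failed (" in line:
            failed = True
        # One element per line, so a browser can render it without choking on one huge block.
        rows.append(f'<div id="l{number}">{html.escape(line)}</div>')
    page = "<!DOCTYPE html>\n<html><body>\n" + "\n".join(rows) + "\n</body></html>"
    return page, warnings, failed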

Then there is a simple Sinatra-based interface that can be run on any computer (I run it locally on my laptop); it fetches the data from SimpleDB and displays it in a table with links to the build logs. This also has a link to a pre-filled bug template (it uses a local file where the emerge --info output is saved, to be used as comment #0).

Okay, so this is the general gist of it. If I have some more time this weekend I’ll draw some cute diagrams for it, and you can all tell me that it’s overcomplicated and that if I had done it in $whatever it would have been much easier; but at the same time you won’t be providing any replacement, or if you do start working on it, you’ll spend months designing the schema of the database, with a target of next year, which will not be met. I’ve been there.