I don’t feel too well; I guess it’s the anger caused by the whole situation, coupled with lots of work to do (including accounting, as it’s that time of the year, for the first time in my case), and a personal emotional situation that went definitely haywire. I’m trying to write this while working on some other things, and eating, and so on and so forth, so it might not be too coherent in itself.
In yesterday’s post I pointed to a post by Ryan regarding testsuites and the lack of consistent handling of them when making changes. While it is true that there are a lot of ways for test failures to go undetected, I think there are some more subtle problems with a few of the testsuites I encountered in the tinderbox project.
One of these problems I already noted yesterday: the lack of a testsuite from upstream. This involves all kinds of projects: end-user utilities, libraries (C, Ruby, Python, Perl), and daemons. For some of those, the problem is not so much that there is no testsuite, but rather that the testsuite doesn’t get released together with the code, for various reasons (most of which end up being that the testsuite outweighs the code itself many times), and that it’s not easy to track down where the suite is. For Ruby packages, more than a few times we end up having to download the code from GitHub rather than using the gem, for instance (luckily, this is fairly easy for us to do, but I’ll try not to digress further).
Some tests also depend on specific hardware or software components, and those are probably the ones that give the worst headaches to developers. For what concerns hardware, well, it’s tough luck: you either have the hardware or you don’t (there is one more facet, in that you might have the hardware but not be able to access it at a given time, but let’s not dig into that). The fun starts when you have dependencies on some particular software component. This does not mean depending on libraries or tools, which are a given and cannot be handled in any way other than actually adding the dependencies, but rather depending on services and daemons being up and running.
Let’s take for instance the testsuite for dev-ruby/pg, the PostgreSQL bindings extension for Ruby. Since you have to test that the bindings work, you need to be able to access PostgreSQL; obviously you shouldn’t be running this against a production PostgreSQL server, as that might be quite nasty (think of the tests actually accessing or deleting your data). For this reason, the 0.8 series of the package does not have any testsuite (bad!). This was solved in the new 0.9 series, as upstream added support for launching a private, local copy of PostgreSQL to test against. This actually adds another problem, but I’ll go back to that later on.
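To give an idea of what such a private test instance amounts to, here is a minimal sketch of spinning up a throwaway PostgreSQL cluster in a temporary directory. The `pg_ctl`/`initdb` invocations are illustrative assumptions on my part, not dev-ruby/pg’s actual harness, and the helper skips gracefully when the tools are absent:

```ruby
require "tmpdir"

# Sketch (not pg's real test code): start a disposable PostgreSQL cluster
# inside a temp dir, hand the caller the local socket directory, and tear
# everything down afterwards, so no production server is ever touched.
def with_test_postgres
  unless system("which pg_ctl > /dev/null 2>&1")
    puts "pg_ctl not found, skipping database tests"
    return
  end

  Dir.mktmpdir("pgtest") do |dir|
    system("initdb -D #{dir}/data > /dev/null 2>&1")
    # Listen only on a UNIX socket inside the temp dir, no TCP at all.
    system(%(pg_ctl -D #{dir}/data -o "-k #{dir} -c listen_addresses=''" -w start))
    begin
      yield dir # the socket directory the tests should connect through
    ensure
      system("pg_ctl -D #{dir}/data -m fast -w stop")
    end
  end
end

with_test_postgres { |sock| puts "tests would run against socket dir #{sock}" }
```

The point is that the whole database lives and dies with the testsuite run, which is exactly what makes it safe, and, as we’ll see, exactly what makes it privilege-sensitive.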
But if database-server-related problems are quite obvious (and that’s why things like ActiveRecord only run their tests against SQLite3, which does not need any service running), there are worse situations. For instance, what about all the software communicating through D-Bus? The software assumes it can talk with the system instance of D-Bus to work, but what if you’re going to test disruptive methods and there is a local, working, installed copy of the same software? In general you don’t want the software under test to interact with the software running on the system. On the other hand, there are a number of packages that fail their tests if D-Bus is not running, or, in the case of sbcl, if there is no syslog listening to /dev/log. These will also create quite a stir, as you might guess.
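What these suites could do instead is probe for the service and skip, rather than fail, when it’s missing. A sketch of such checks (the socket paths are the conventional ones; the method names are mine, not any package’s actual code):

```ruby
# Sketch: detect the services some testsuites blindly assume are present,
# so tests can be skipped instead of failing outright. Illustrative only.
def dbus_system_bus_available?
  # The D-Bus system instance conventionally listens on this UNIX socket.
  File.socket?("/var/run/dbus/system_bus_socket") ||
    File.socket?("/run/dbus/system_bus_socket")
end

def syslog_available?
  # sbcl's tests, for instance, fail outright when nothing listens here.
  File.socket?("/dev/log")
end

puts dbus_system_bus_available? ? "D-Bus checks enabled" : "D-Bus checks skipped"
puts syslog_available? ? "syslog checks enabled" : "syslog checks skipped"
```

A couple of lines like these would turn a hard failure on a headless build box into a clean skip.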
Now, earlier I said that the new support for launching a local instance of PostgreSQL in the pg 0.9 series creates one further problem; that problem is that it adds a limitation on the environment: you have to be able to start PostgreSQL from the testsuite. What’s the problem with that? Well, to be able to run the PostgreSQL commands you need to drop privileges to a non-root user, so if you run the testsuite as root it will fail… and while Portage does allow running tests as non-root, I’m afraid it still defaults to root (FEATURES=userpriv is the setting that controls this behaviour). And even if the default were changed, there are other tests that only work as root, or even some, like libarchive’s, that run slightly different tests depending on which user you’re running them as. If you run them as root, they’ll verify, for instance, that the preservation of ownership and permissions works; if you run them as non-root, that you cannot write as a different user or restore the permissions.
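Branching on the running user looks roughly like this. This is a sketch in the spirit of what libarchive’s suite does, with made-up test names, not its actual code (which is C, not Ruby):

```ruby
# Sketch: select different tests depending on the effective user,
# as libarchive's suite does. Test names here are purely illustrative.
def tests_for_current_user
  if Process.euid.zero?
    # Root can chown, so verify ownership and permissions are preserved
    # when an archive is restored.
    ["restore_owner", "restore_permissions"]
  else
    # Non-root instead verifies that those operations are properly refused.
    ["chown_refused", "permission_restore_refused"]
  end
end

puts "running: #{tests_for_current_user.join(', ')}"
```

Notice that neither branch is a superset of the other: running the suite under only one user, as Portage inevitably does, always leaves some behaviour untested.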
You can probably start to see what the problem is with tests: they are not easy; getting them right is difficult, and more often than not, the upstream tests only work in particular environmental conditions that we cannot reproduce properly. And a failure in the testsuite is probably one of the most common showstoppers for a stable request (this matters when the older version worked properly while the new one fails, as regressions in stable have a huge marginal cost!).
Take a break for a week or two, otherwise you’ll end up stabilizing broken packages and being audited for your tax stuff.
Speaking of which, are tests still failing in your tinderbox for ebuilds such as dev-python/formencode? I have yet to figure out what’s wrong in your environment. (ref: https://bugs.gentoo.org/sho…) I’d really like to help, since there are other similar bugs, but I’m not sure how to move on.
You filed a tinderbox bug regarding buildbot’s test suite. I responded with a request for a version bump, but you should also know that I have moved away all of the tests in the new version, and I’m working on writing a completely new test suite.

As far as tests requiring a specific environment: app-backup/amanda has this problem. Because of the way Amanda is designed, almost all of the tests *must* run on an installed copy (which of course precludes any sort of testing by Gentoo), and in fact will stomp on the configuration of such an installed copy. There’s no good way around that, sadly.

However, I think we’ve done a good job of conditionalizing any external dependencies for the tests. They use a few CPAN modules, but gracefully degrade when those modules are not available. Where external resources are required (tape drive, Amazon S3 credentials, PostgreSQL server), they can be specified with environment variables, and if they are missing, the tests are simply skipped.

It’s important to remember that tests serve a *lot* of purposes. Sure, you can’t run Amanda’s test suite from the ebuild, but we run it on every commit (including on two Gentoo systems) and even while coding. The tests have been a great boon to the project, even without being run downstream.