On stabling hardware-dependent packages

I thought I wrote about this before, but it looks like I didn’t. And since this is something I was asked about a few months ago as well, now is as good a time as any to fix that by writing down what I think about it.

We could say that almost every package in the tree relies on someone having at least one piece of hardware: a computer. At the same time, some packages require more particular hardware than others: drivers, firmware, and similar packages. The problem with these packages is that, unless you own the hardware, you have no way to know whether they work at all. Sure, you can be certain that they do not work if they fail to build, but the opposite can’t be confirmed without the hardware.

This is troublesome to say the least: sometimes a given hardware driver’s ebuild is written, by either a developer or a user (who goes on to proxy-maintain it), while they have the hardware… but it goes untouched once said hardware is gone for good. This has happened to me before: for instance, I no longer have a Motorola phone to test those handlers, nor do I have an HP printer any longer, so there goes my help with the hplip package…

At the same time, it is true that I still have quite a few hardware-dependent packages in the tree: ekeyd, iwl6000-ucode, and so on and so forth. One of the packages I still try to help out with, even though I had to discard the related hardware (for now at least), is freeipmi, which exemplifies all too well the problems with hardware-dependent packages.

FreeIPMI, as the name implies, is an implementation of utilities and daemons to manage IPMI hardware, which is used on middle-range-to-high-end server motherboards for remote management. I had one on Yamato until recently, when either the mainboard or the PSU started to act up and I had to take it out. I had tried to make the best out of the ebuild while I had access to an IPMI board, which consisted mostly of bumping it, fixing the build and, recently, updating its init scripts so that they actually work (in the case of the watchdog service I ended up patching it upstream, which meant having something that actually works as intended!).

Last year, about two years after the release of the version that was marked stable, I decided to ask for a new stabilisation, with a version of the package that surely worked better… which is not to say that it worked perfectly. Indeed, I now know that said version simply did not work at all in some configurations, because the bmc-watchdog init script, as I said, did not work and required upstream changes.

What happens when I ask for a new stabilisation this year, to cover for that issue? Well, this time around the arch teams have finally decided to test the stuff they mark stable, but that also meant nobody was available to test IPMI-related packages on x86. Even our own infra team was unable to test it anywhere besides amd64, with the end result that, after months of back-and-forth, I was able to get the old package de-keyworded, which means that there is no longer a “stable” freeipmi package; you’ve got to use ~arch.

Don’t get me wrong: I’m not happy about the conclusion, but it’s better than pretending that an ancient version works.

So when Vincent – who proxy-maintains the acr38u IFD driver for pcsc-lite – asks me about marking his ebuild stable, I plan on writing this very post… and then leave it there for over five months… I apologize, Vincent, and I don’t think I have enough words to express my discomfort with a delay of such proportions.

And when I get Paweł’s automated stable request for another pcsc-lite driver, my answer is an obvious one: does anybody have the hardware to test the package? I’m afraid the answer was obviously no… unless Alon still has said reader. The end result is the same though: time to de-keyword the package so that we can drop the ancient ebuild, which definitely does not adhere to modern standards (well, okay, -r2 isn’t much better, but I digress).

Of course the question here would be “how do you mark any pcsc-lite related software stable at all this way?” … and I don’t really have a clear answer to that. I guess we’re lucky that the ccid driver covers so many hardware devices that it is much more likely that somebody has a compatible reader and some card to test it with… the card part is easy, as I suppose most people living in the so-called Western world have an ATM or credit card… and those have a chip that can be at least detected, if not accessed, by PC/SC.

There is actually a script written in Python that allows you to access at least some of the details on EMV-based cards… the dependencies should all be in Portage, but I didn’t have time to play with the code long enough to make sure it works and is safe to use.

There is another obvious question of course: “why don’t you stable your own stuff?” — while this could be sensible, there is a catch: my “desktop” systems – Yamato, Titan (this box) and Saladin (the laptop) – have all been running ~arch for as long as I can remember… or at least mostly; I gave up on glibc-2.14, so it is masked on all of my systems, on account of it breaking Ruby, and I’m still pissed by the fact that it was unmasked even though that breakage was known.

Any comments that could help pick a direction on this kind of trouble?

Testing stable; stable testing

It might not be obvious, but my tinderbox is testing the “unstable” ~x86 keyword of Gentoo. This choice was originally due to the kind of testing involved (the latest and greatest versions of packages, whose adverse effects we don’t yet know), and I think it has helped tremendously, up to now, to catch things that could otherwise have bitten people months, if not years, later, especially for the least-used packages. Unfortunately that decision also meant ignoring, for the most part, the other, now more common architecture (amd64 — although I wonder if the new, experimental amd32 architecture will take its place), as well as ignoring the stable branch of Gentoo, which is luckily tracked by other people from time to time.

Lacking continuous testing, though, what is actually considered stable sometimes isn’t very much so. This problem is further increased by the fact that sometimes the stable requests aren’t proper in themselves: it can be a user asking for a package to be stabled, with the arch teams being called in before they should have been; it might be an overlooked issue that the maintainer didn’t think of; or it might simply be that multiple maintainers had different feelings about stabilisation, which happens pretty often in team-maintained packages.

Whatever the case, once a stable request has been filed, it is quite rare that issues are brought up severe enough that the stabilisation is denied: it might be that the current stable is in much worse shape, or maybe it’s a security stable request, or a required dependency of something following one of those destinies. I find this a bit unfortunate; I’m always happy when issues are brought up that delay a stabilisation, even if that sometimes means having to skip a whole generation of a package, just as long as the result is a nicer stable package.

Testing a package is obviously a pretty taxing task: we have huge numbers of combinations of USE flags, compiler and linker flags, features, and so on and so forth. For some packages, such as PHP, testing each and every combination of USE flags is not only infeasible, but just impossible within this universe’s time. To make the work of the arch team developers more bearable, years ago, starting with the AMD64 team, the concept of Arch Testers was invented: so-called “power users” whose task is to test the package and give a green light for stable-marking.
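To put a number on that combinatorial explosion: each independent USE flag doubles the number of possible builds, so the growth is exponential in the flag count. A quick shell sketch (the flag count of 20 is just an illustrative assumption, not PHP’s actual number):

```shell
# n independent USE flags mean 2^n distinct configurations to test.
use_combinations() {
    echo $((2 ** $1))
}

use_combinations 20   # a package with 20 flags: 1048576 configurations
```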

At first, all of the ATs were as technically capable as a full-fledged developer, to the point that most of the original ones (me included) “graduated” to full devship. With time this requirement seems to have relaxed, probably because AMD64 became usable for the average user, and no longer just for those with enough knowledge to work around the landmines originally present when trying to use a 64-bit system as a desktop — I still remember seeing Firefox fail to build because of integer and pointer variable swaps, once that became an error-worthy mistake in GCC 3.4. Unfortunately this also meant that a number of non-apparent issues became even less apparent.

Most of these issues end up being caused by one simple fault: lack of direction in testing. For most packages (and that includes, unfortunately, a lot of my packages) the ebuild’s self-tests are nowhere near comprehensive enough to tell whether the package works or not, and unless the testers actually use the package, there is little chance they really know how to test it. Sure, actual usage covers most of the common desktop packages and a huge number of server packages, but that’s far from a perfect solution, since it’s the less-common packages that require more eyes on the issues.

The problem, with most other software, is that it might require specific hardware, software or configuration to actually be tested. And most of the time, it requires a few procedures to be applied to ensure that it is actually tested properly. At the same time, I only know of the Java and Emacs teams publishing proper testing procedures for at least a subset of their packages. Most packages could use such documentation, but it’s not something that maintainers, me included, are used to working on. I think that one of the most important tasks here is to remind developers, when they ask for stabilisation, to come up with a testing plan. Maybe after asking a few times, we’ll get to work and write up that documentation, once and for all.

Stable users’ libpng update

Seems like my previous post didn’t make enough of a fuss to get other developers to work on feasible solutions to keep the problem from hitting stable users… and now we’re back to square one for them.

Since I also stumbled across two problems today while updating my stable chroots and containers (which mirror the local installs of remote vservers, plus a couple of testing environments for my work), I guess it’s worth writing about a couple of tricks you might want to know before proceeding.

Supposedly, you should be able to properly complete the update without running the libpng-1.4.x-update.sh hack! (And this is important, because that hack will create a number of problems in the longer run, so please try to avoid it!) If you have been using --as-needed for a decent amount of time, the update should be almost painless. Almost.

I maintained that revdep-rebuild should be enough to take care of the update for you, but there are a few tricks here that make it slightly more complex. First of all, the libpng-1.4 package will try to “preserve” the old library by copying it inside itself, avoiding dynamic link breakage. This supposedly makes for a better user experience, as you won’t hit packages that fail to start up because of missing libraries, but it has two effects: one is that you may be running a program with both libpng objects loaded, which is not safe; the second is that, this way, revdep-rebuild will not pick up the broken binaries at all.

Additionally, there is a slot of the package that brings in only the library itself, so that binary-only packages linked to the old libpng can still be used; if you have packages such as Opera installed, you might have this package brought in on your system; this further complicates matters because it will then collide with libpng-1.4… bad thing.

These are my suggested instructions:

  • get a console login, and make sure that GNOME, KDE, or any other graphical interface is not running; this is particularly important because you might otherwise experience applications crashing mid-runtime;
  • emerge -C =libpng-1.2* makes sure that you don’t have the old library around; this works for both the old complete package and for the new library-only binary compatibility package;
  • rm -f /usr/lib/libpng12.so* (replace lib/ with lib64/ on x86-64 hosts); this way you won’t have the old libraries around at all; actually this should be a no-op since you just removed the package, but it ensures they are gone if you had already updated;
  • emerge -1 =libpng-1.4* installs libpng-1.4 without preserving the libraries above; if you had already updated, please do this anyway, so that it registers the lack of the preserved libraries;
  • revdep-rebuild -- --keep-going shouldn’t stop anywhere now, but since it might, it’s still a good idea to let it build as much as it can.
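For reference, the steps above can be condensed into a dry-run sketch: it only prints the commands rather than executing them, so nothing here touches your system, and you should still read the caveats above before running any of it (the lib64 handling is an assumption about your profile’s multilib layout):

```shell
# Print, don't run: review the update sequence before executing it.
libdir=lib
case "$(uname -m)" in
    x86_64) libdir=lib64 ;;   # 64-bit multilib hosts keep libraries in lib64/
esac

plan_libpng_update() {
    echo "emerge -C '=libpng-1.2*'"
    echo "rm -f /usr/${libdir}/libpng12.so*"
    echo "emerge -1 '=libpng-1.4*'"
    echo "revdep-rebuild -- --keep-going"
}

plan_libpng_update
```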

Also make sure you follow my suggestion of running lafilefixer incrementally after every merge; that way you won’t risk too much breakage when the .la files get dropped (which I hope we’ll start doing systematically soon). Do so by adding this snippet to your /etc/portage/bashrc:

post_src_install() {
    lafilefixer "${D}"
}

Important, if you’re using binary packages! Make sure to re-build =libpng-1.4* after you delete the files, if you had updated it before; otherwise the package will have preserved the files and will pack them up in the tbz2 file, reinstalling them every time you merge the binary.

This post brought to you by the guy who has been working for the past four years to make sure that this problem is reduced to a manageable size, and who has been attacked, defamed, insulted and so on and so forth for expecting other developers to spend more time testing their packages. If you find this useful, you might want to consider thanking him somehow…

Testing environments

I don’t feel too well; I guess it’s the anger caused by the whole situation, coupled with lots of work to do (including accounting, as it’s that time of the year, for the first time in my case), and a personal emotional situation that went definitely haywire. I’m trying to write this while working on some other things, and eating, and so on and so forth, so it might not be too coherent in itself.

In yesterday’s post I pointed to a post by Ryan regarding testsuites, and the lack of consistent handling of testsuites when making changes. While it is true that there are a lot of ways for test failures to go undetected, I think there are some more subtle problems with a few of the testsuites I encountered in the tinderbox project.

One of these problems I already noted yesterday: the lack of a testsuite from upstream. This involves all kinds of projects: end-user utilities, libraries (C, Ruby, Python, Perl), and daemons. For some of those, the problem is not so much that there is no testsuite, but rather that the testsuite doesn’t get released together with the code, for various reasons (most of which boil down to the testsuite outweighing the code itself many times over), and that it’s not always easy to track down where the suite is. For Ruby packages, more than a few times we end up having to download the code from GitHub rather than using the gem, for instance (luckily, this is almost easy for us to do, but I’ll try not to digress further).

Some tests also depend on specific hardware or software components, and those are probably the ones that give the worst headaches to developers. For what concerns hardware, well, it’s tough luck: you either have the hardware or you don’t (there is one more facet, in that you might have the hardware but not be able to access it, but let’s not dig into that). The fun starts when you have dependencies on some particular software component. This does not mean depending on libraries or tools; those are a given and cannot be solved in any way besides actually adding the dependencies. It rather means depending on services and daemons being up and running.

Let’s take for instance the testsuite for dev-ruby/pg, the PostgreSQL bindings extension for Ruby. Since you have to test that the bindings work, you need to be able to access PostgreSQL; obviously you shouldn’t be running this against a production PostgreSQL server, as that might be quite nasty (think of the tests actually accessing or deleting your data). For this reason, the 0.8 series of the package does not have any testsuite (bad!). This was solved in the new 0.9 series, as upstream added support for launching a private, local copy of PostgreSQL to test with. This actually adds another problem, but I’ll go back to that later on.

But if database-server related problems are quite obvious (and that’s why things like ActiveRecord only run tests against SQLite3, which does not need any service running), there are worse situations. For instance, what about all the software communicating through D-Bus? The software assumes it can talk with the system instance of D-Bus to work, but what if you’re going to test disruptive methods while there is a local, working, installed copy of the same software? In general you don’t want the software under test to interact with the software running on the system. On the other hand, there are a number of packages that fail their tests if D-Bus is not running, or, in the case of sbcl, if there is no syslog listening on /dev/log. These will also create quite a stir, as you might guess.
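One way out, for packages in this situation, is to skip the tests gracefully when the daemon they need isn’t there, rather than failing. A minimal sketch of the idea (the function name and the socket path are my assumptions for illustration, not any package’s actual code):

```shell
# Skip, rather than fail, when the system daemon the tests need is absent.
maybe_run_tests() {
    local dbus_socket=${1:-/var/run/dbus/system_bus_socket}
    if [ ! -S "${dbus_socket}" ]; then
        echo "D-Bus socket not found, skipping tests"
        return 0
    fi
    echo "running tests against ${dbus_socket}"
}

maybe_run_tests /nonexistent/socket   # prints the skip message
```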

Now, earlier I said that the new support for launching a local instance of PostgreSQL in the pg 0.9 series creates one further problem; that problem is that it now adds one limitation on the environment: you have to be able to start PostgreSQL from the testsuite. What’s the problem with that? Well, to be able to run the PostgreSQL commands you need to drop privileges to non-root, so if you run the testsuite as root you’ll fail… and while Portage does allow running tests as non-root, I’m afraid it still defaults to root (FEATURES=userpriv is the setting that controls this behaviour). And even if the default were changed, there are other tests that only work as root, or even some, like libarchive’s, that run slightly different tests depending on which user you run them as. If you run them as root, they’ll ensure, for instance, that the preservation of users and permissions works; if you run them as non-root, that you cannot write as a different user or cannot restore the permissions.
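For those who want the non-root behaviour, this is roughly the make.conf fragment involved; userpriv drops the src_* phases (tests included) to the unprivileged portage user, and usersandbox keeps the sandbox active on top of that. Treat this as a sketch and check the make.conf man page for your Portage version before relying on it:

```shell
# /etc/portage/make.conf (fragment)
# Run build and test phases as the unprivileged portage user.
FEATURES="${FEATURES} userpriv usersandbox"
```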

You can probably start to see what the problem is with tests: they are not easy; getting them right is difficult, and more often than not, upstream tests only work in particular environmental conditions that we cannot reproduce properly. And a failure in the testsuite is probably one of the most common showstoppers for a stable request (this is important when the older version worked properly while the new one fails, as regressions in stable have a huge marginal cost!).

You should refuse stable

While Donnie thinks about improving Gentoo management, I already said that I’m going to keep to the technical side of the fence, working on improving Gentoo’s technical quality, which I find somewhat lacking, and not because of a lack of management. Maybe it’s rather the other way around: there are too many people trying to get the management part working, and they fail to see that there is a dire need for work requiring technical skills.

Today I started the day (not too willingly, to be honest) with a mail from Alexander E. Patrakov, who CCed me on a Debian bug about GCC 4.4 miscompilation; while Ryan is the one who has been working on GCC 4.4, I figured I could do what Alexander suggested, since I have the tinderbox set up.

To do this I simply set up one more check in my bashrc’s src_unpack hook, and used the tinderboxing script that Zac provided me with to run the ebuild unpack phase for the latest versions of all packages. Now, besides the fact that this has the nice side effect of downloading the sources of the stuff that was missing up to now, I found some things that I really wouldn’t have expected.

Like calling econf (and thus ./configure) during the unpack phase. Which is bad in so many ways, because it prevents fiddling with the configure script in a hook, and in my case wastes time when doing just a search of the sources. What worries me, though, is not the mistake itself, but rather the fact that one of the two ebuilds I’ve found with it so far went stable a few days ago!
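The check itself can be quite simple. Here is a hypothetical version of what such a bashrc hook might look like; the heuristic (looking for a config.log left behind under ${S}) is my own assumption of how to detect it, not the actual check or Zac’s script:

```shell
# /etc/portage/bashrc (sketch): warn if ./configure already ran during unpack.
post_src_unpack() {
    if find "${S:-.}" -maxdepth 3 -name config.log 2>/dev/null | grep -q .; then
        echo "QA notice: configure appears to have been run during src_unpack"
    fi
}

# Demo against a fake build directory:
S=$(mktemp -d)
touch "${S}/config.log"
post_src_unpack    # prints the QA notice
rm -r "${S}"
```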

Now, I can understand that arch teams are probably swamped with requests, but it would be nice if such obvious mistakes in ebuilds were spotted before the stuff goes stable. For instance, I like the way Ferris always nitpicks the ebuilds, including the test suites, since it allows spotting things that might have escaped the maintainer, who’s likely used to the thing, or has a setup where the thing works already. I don’t care if it stops my stable requests for a few days or even months: if there is a problem I missed, I like to know about it beforehand.

So please, refuse to mark a package stable if something is not right with the ebuild; even if it’s not a regression, for trivial stuff like that you should refuse without hesitation.

My thoughts on stable markings

Sounds like today I have to write another post about matters currently under discussion, although I usually try to avoid this kind of post, focusing more on technicalities, or discussions about Freedom, or in general talking about things being done rather than being discussed.

So, there’s a lot of fuss about stable markings lately, especially since, after the x86 arch team was created, x86 stable markings also take their time, while before they were done whenever the developers felt like it.

This of course slowed down the x86 stable tree, but at the same time we now have a stable tree that is almost really “stable”. Unfortunately, although it started out more quickly, the stable-marking rate is now slowing down, not only for x86 but also for amd64.

The reasons? Too many people not caring for the stable tree and deciding to use ~arch directly, developers who don’t have stable chroots… and so it becomes a domino effect: the less the stable tree is used, the fewer stable markings can be done.

Myself, I actually had a stable chroot at one point, but then I had to drop it, for the simple reason that I didn’t have enough time to keep it as up to date as my whole system.

So, although I don’t pretend to know the answer to this, I have a suggestion: ATs, when you become devs (because most of you will), don’t drop your stable systems, don’t drop your stable chroots, but rather help mark stuff stable. For AMD64 in particular I see a really bad trend :( the stable markings have almost ground to a halt lately… while last year it was the first arch to mark things stable.

Myself, I’ve decided that if I’m able to upgrade the CPU of this box (to an Athlon 64 X2 4600+, which has the not-so-high price of €245, plus €89 for a 1GB RAM stick), or rather, if I’m able to get another job to pay for it (I’m going to upgrade the CPU for sure if I get enough money), I’ll maintain an amd64 stable chroot once again, so that at least the software I usually maintain I’ll be able to test and mark stable when everyone else can’t.

Anyway, I hope this is the last post I write following a discussion on gentoo-dev, as it’s boring for me, as well as for those who don’t read gentoo-dev :P