GLIBC 2.17: what’s going to be trouble?

LWN reported just today on the release of GLIBC 2.17, which fixes a security issue and looks like it was released mostly to support the new AArch64 architecture (i.e. arm64). The last entry in the reported news, though, is possibly going to be a major headache, so I’d better post about it right away so that we have a reference for it.

I’m referring to this:

The `clock_*' suite of functions (declared in <time.h>) is now available directly in the main C library. Previously it was necessary to link with -lrt to use these functions. This change has the effect that a single-threaded program that uses a function such as `clock_gettime' (and is not linked with -lrt) will no longer implicitly load the pthreads library at runtime and so will not suffer the overheads associated with multi-thread support in other code such as the C++ runtime library.

This is, in my opinion, the most important change, not only because, as is pointed out, C++ software gains quite a bit from not linking to the pthreads library, but also because it’s the only change listed there that I can already foresee trouble with. And why is that? Well, that’s easy. Most of the software out there does something along these lines to decide which library to link to when using clock_gettime (hardcoding the -lrt option was never a good idea, because the library does not exist on most other operating systems out there, including FreeBSD and Mac OS X):

AC_SEARCH_LIBS([clock_gettime], [rt])

This is good, because it’ll first try without any library at all (“none required”) and then with librt, which means that it’ll work on old GLIBC systems, new GLIBC systems, FreeBSD, and OS X alike. There is something else on Solaris, if I’m not mistaken, which can be added to the list up there, but I honestly forgot its name. Unfortunately, this can easily end up causing more trouble when software is underlinked.
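For context, the kind of code such a configure check exists for looks roughly like the sketch below. The helper name `monotonic_ns` is made up for illustration; only `clock_gettime` and `CLOCK_MONOTONIC` come from <time.h>.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: monotonic time in nanoseconds, or -1 on failure.
 * Before GLIBC 2.17, linking this needed -lrt; from 2.17 onwards the
 * symbol is resolved by the main C library. */
int64_t monotonic_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1;

    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}
```

A program made of nothing but this helper and a `main` that calls it is exactly the case the release note describes: on a new enough GLIBC it should link without any extra library, and so never load pthreads.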

With the old GLIBC, it was possible to link software with just librt and still have it use the threading functions, because librt brought the pthreads library in with it. Once the configure script stops adding librt automatically, the threading library will no longer be pulled in through it, and that might break quite a few packages. Of course, most of these would already have been failing with gold, but as you may remember, I wasn’t able to get through the whole tree with it, and I haven’t set up a tinderbox for it again yet (I should, but it’s trouble enough with two!).

What about --as-needed in this picture? A full, strict implementation would fail on the underlinking, where pthreads should have been linked explicitly, but it would also make sure not to link librt when it’s not needed. That makes it possible to improve the performance of the code (by skipping pthreads) even when the configure scripts are not written properly (for instance, if they use AC_CHECK_LIB instead of AC_SEARCH_LIBS). But since it’s not the linkage of librt that causes the performance issue, but rather the one of pthreads, it actually works out quite well, even if some packages might keep an extra, unused linkage to librt.

There is a final note that I need to write about, and it honestly worries me quite a bit more than everything above. The librt library has not been dropped: only the clock functions have been moved over to the main C library, while librt keeps the asynchronous and list-based I/O interfaces (AIO and LIO), the POSIX message queue interfaces, the shared memory interfaces, and the timer interfaces. This means that if you’re relying on a clock_gettime test to bring in librt for any of those, you’ll end up with a failing package. Luckily for me, I’ve already avoided that situation in feng (which uses the message queue interface), but as I said, I foresee trouble for at least some packages.

Well, I guess I’ll just have to wait for the ebuild for 2.17 to be in the tree, and run a new tinderbox from scratch… we’ll see what gets us there!

It’s not all gold that shines — Why underlinking is a bad thing

A few days ago I wrote a rant about the gold linker, which really messed up the 64-bit tinderbox to the point that I have to recreate it entirely from scratch. Some people asked me why I was interested in using gold, given that I was among the first to be sceptical about it (though not the only one, as it happens).

Well, there are two reasons for me to be interested in this. The most obvious one is that I’m being paid to care (I won’t and can’t go into further detail; let’s just say that I have a contract that requires me to work in close contact with it). The other is that gold implements what I’ll be calling “underlinking protection”, similar to what the Apple linker does, and similar to what would be achieved with --no-copy-dt-needed-entries on good old ld (although in the case of ld, the softer --as-needed is actually going to make it quite moot).

Let me explain what the issue is with a simple, real-world example found in git (this is one of the patches I worked on to get gold running). Since I’m getting quite used to drawing UML diagrams lately, I’m going to start with a diagram of the current situation. Please do note that I’m simplifying a lot: there are of course a number of components in GIT, and both libcrypto and libssl implement a number of interfaces. I have reduced the whole situation to a total of three interfaces, two libraries and one component, but it should do.

Example package diagram for GIT and OpenSSL

Yes I like pastel colours on diagrams.

So in this diagram we have two real packages (git and OpenSSL) that, as far as we’re concerned, install respectively one application (imap-send) and two libraries (libssl and libcrypto). Of the two libraries, the former exports only the SSL functions, while the latter exports the HMAC methods and the EVP interface (a higher-level interface to multiple crypto and hash algorithms). Before somebody comments on this: yes, I know they both export a lot more than that; I sincerely don’t care right now, as I’m simplifying enormously to make this bearable.

As a specific note, I’m going to define the <<access>> and <<import>> terms in this way:

  • <<access>> means the first package is requiring symbols from the second, so it uses the second; this dependency is finalised during compile phase;
  • <<import>> means the first package includes a NEEDED entry for the second, so it links to the second, and thus it is finalised during link phase.

Now that we have an idea of the situation, let’s analyse the dependencies between the parties:

  • first of all we can rule out problems between the two libraries: the SSL functions access the EVP interface, but then again, libssl imports libcrypto, solving the dependency correctly;
  • then there is the dependency between imap-send and libssl: since it accesses the SSL interface, the component imports libssl, and that’s also correct for the dependencies;
  • on the other hand, while imap-send also accesses HMAC and EVP interfaces, it doesn’t import the libcrypto library; here is our problem.

In the case of the Apple linker and gold, this is an error condition: you cannot rely on the transitive import to produce a stable binary, so the link is rejected; you have to balance the two by importing (linking in) both libssl and libcrypto. To be honest, in a situation like this there is little to gain besides correctness by restoring the balance; after all, the two libraries always come together and are linked one with the other.

There are, though, a number of situations where this does really matter a lot. Dependencies finalised during the compile phase cannot be swapped out at runtime if the binary interface changes, which is why we have the link phase and why we have sonames to express the binary interface compatibility. After seeing a real-world example, let’s go back to an abstract example.

Package diagram of the first abstract example

In this situation we have some given component (it can be an executable, another library, or a plugin; it doesn’t really matter for what I’m going to talk about) that accesses interfaces from two different libraries, libfoo and a dependency of it, libbar, but only links to the former. This is the minimal example of the GIT/OpenSSL interaction above. Just like in the previous real-world case, this situation, as is, is stable: since libfoo.so.1 links to libbar.so.1, the interfaces that the component accesses are to be found.

Now let’s assume that libbar gets updated to a new version that is no longer ABI-compatible (libbar.so.2). For the sake of argument, libfoo.so.1 still works, either because it was designed to support both source interfaces, or because only the ABI changed but the API didn’t (for instance, when 32-bit parameters are replaced with 64-bit ones, you break ABI but not API), or because there is some source-level compatibility interface. Or, to put it very bluntly, you only need to rebuild the packages for libbar.so.2 to be used… reminds you of something?
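To make the “ABI breaks, API doesn’t” case concrete, here is a minimal sketch in C. Everything in it is hypothetical: `BAR_SONAME_VERSION`, `bar_off_t`, `bar_skip` and `component_position` are names invented for illustration, standing in for a libbar header and for a component using it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical libbar header: the soname version decides the parameter width. */
#ifndef BAR_SONAME_VERSION
#define BAR_SONAME_VERSION 2
#endif

#if BAR_SONAME_VERSION == 1
typedef int32_t bar_off_t;   /* libbar.so.1: 32-bit parameters */
#else
typedef int64_t bar_off_t;   /* libbar.so.2: 64-bit parameters */
#endif

/* The prototype reads the same in both versions; only its binary shape differs. */
bar_off_t bar_skip(bar_off_t start, bar_off_t count)
{
    return start + count;
}

/* Component source code: it compiles unchanged against either version. */
long long component_position(void)
{
    return (long long)bar_skip(100, 28);
}

/* Width in bytes of the parameters the compiled caller actually passes. */
size_t bar_off_width(void)
{
    return sizeof(bar_off_t);
}
```

Building the same file with -DBAR_SONAME_VERSION=1 requires no change to `component_position`, yet the arguments shrink from 8 bytes to 4. A binary compiled earlier keeps the old width, which is why a rebuild is all it takes, and also why it cannot be skipped.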

Package diagram of the second abstract example

If you were to run revdep-rebuild (or any other similar approach), it would only rebuild libfoo.so.1 to link to the new dependency; at that point, you’d have libfoo.so.1 updated to both access the new interface and import the new library, but the original component would still be accessing the interfaces from libbar.so.1!

At this point, what happens depends on how libbar is designed. If the library was designed with symbol versioning in mind from the start, or if it avoids re-using the same names for functions with different ABIs, then the original component would fail to load (or execute) because of missing symbols, with due caveats of course. But if its developers didn’t ensure this, and two interfaces with different ABIs actually share the same name, you’d get the same effect as a symbol collision: crash, corruption, or generally unpredictable results.
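The unlucky case, where the name stays the same and the ABI changes underneath it, can be sketched in C too. The two structures and the function below are hypothetical, and the memcpy only models an old component reading, with the layout it was compiled against, data that the new library produced.

```c
#include <stdint.h>
#include <string.h>

/* Layout the old component was compiled against (hypothetical libbar.so.1). */
struct bar_request_v1 {
    int32_t offset;
    int32_t flags;
};

/* Layout the rebuilt library now uses (hypothetical libbar.so.2). */
struct bar_request_v2 {
    int64_t offset;
    int32_t flags;
};

/* Returns the flags value an old component would read from new-layout data. */
int32_t flags_seen_by_old_component(int64_t offset, int32_t flags)
{
    struct bar_request_v2 produced;
    struct bar_request_v1 seen;

    memset(&produced, 0, sizeof produced);
    produced.offset = offset;
    produced.flags = flags;

    /* The old code only knows about the first sizeof(seen) bytes. */
    memcpy(&seen, &produced, sizeof seen);
    return seen.flags;
}
```

With an offset of 10 and flags of 7, the old layout reads half of the 64-bit offset where it expects the flags, so it gets something other than 7 and nothing warns about it. That is the silent corruption that a missing-symbol failure would at least have turned into a loud error.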

Keeping all I explained in this post in mind, I’m pretty sure you can see the advantage in gold not allowing underlinking to begin with, and why I’m so scared of the softer --as-needed approach. Even if we’re not going to allow gold to be used as default linker anytime soon, we definitely have a number of good reasons why we should at least start using it to run the tinderbox — after all, even if it’s just triggered by gold, fixing underlinking will make it nicer for good old ld users.

Tomorrow, I’ll probably post something about the technical issues regarding testing gold, since there are at least a few.