It’s not all gold that shines — Why underlinking is a bad thing

I have written a few days ago a rant about the gold linker, that really messed up the 64-bit tinderbox to the point I have to recreate it from scratch entirely. Some people asked me why I was interested in using gold, given I’m the first one who was sceptic about it — but not the only one as it happens.

Well, there are two reasons for me to be interested in this; the most obvious one is that I’m being paid to care (but I won’t/can’t go in further details, let’s just say that I have a contract that requires me to work at close contact with it), the other is that gold implements what I’ll be calling “underlinking protection” similar to what is done by the Apple linker, and similar to what would be achieved with --no-copy-dt-needed-entries on the good old ld (but in the case of ld, the softer --as-needed is actually going to make it quite moot.

Let me explain what the issue is here with a simple, real-world example to be found in git (this is one of the patches I worked on to get gold running). Since I’m really getting quite used to draw UML diagrams lately, I’m going to start with a diagram of the current situation. Please do note that I’m actually simplifying a lot since there are a number of components in GIT, of course, and also both libcrypto and libssl implement a number of interfaces. Instead, I have reduced the whole situation at a total of three interfaces, two libraries and one component. But it should do.

Example package diagram for GIT and OpenSSL

Yes I like pastel colours on diagrams.

So in this diagram we got two real packages (git and OpenSSL), that for what we’re concerned install respectively one application (imap-send) and two libraries (libssl and libcrypto). Of the two libraries, the former exports only SSL functions, while the latter exports HMAC methods and the EVP interface (which is a higher-level interface for multiple crypto and hash algorithms) — before somebody comments on this, yes I know they both export a lot more than that, I sincerely don’t care right now, I’m simplifying enormously to make this bearable.

As a specific note, I’m going to define the <<access>> and <<import>> terms in this way:

  • <<access>> means the first package is requiring symbols from the second, so it uses the second; this dependency is finalised during compile phase;
  • <<import>> means the first package includes a NEEDED entry for the second, so it links to the second, and thus it is finalised during link phase.

Now that we have an idea of the situation, let’s analyse the dependencies between the parties:

  • first of all we can rule out problems between the two libraries: the SSL functions access the EVP interface, but then again, libssl imports libcrypto, solving the dependency correctly;
  • then there is the dependency between imap-send and libssl: since it access the SSL interface, the component imports libssl, and that’s also correct for the dependencies;
  • on the other hand, while imap-send also accesses HMAC and EVP interfaces, it doesn’t import the libcrypto library; here is our problem.

In the case of the Apple linker and gold, this is an error condition: you cannot make use of the transitive import to produce a stable binary, so the link is rejected; you have to balance the two, by importing (linking in) both libssl and libcrypto. To be honest, in a situation like this, though, there is little to gain beside properness by restoring the balance — after all, the two libraries always come together and are linked one with the other.

There are, though, a number of situations where this does really matter a lot. Dependencies finalised during the compile phase cannot be swapped out at runtime if the binary interface changes, which is why we have the link phase and why we have sonames to express the binary interface compatibility. After seeing a real-world example, let’s go back to an abstract example.

Package diagram of the first abstract example

In this situation we have some given component (can be either an executable, another library, a plugin, it doesn’t really matter to what I’m going to talk about), that accesses interfaces from two different libraries: libfoo, and a dependency of that, libbar, but only links to the former — this is the minimal example of the GIT/OpenSSL interaction above. Just like in the previous real-world case, this situation, as is, is stable: since libfoo.so.1 links to libbar.so.1, the interfaces that the component access are to be found.

Now let’s assume that libbar gets updated to a new version, which is no longer ABI compatible (libbar.so.2); for the sake of argument, libfoo.so.1 still works, either because it was designed to support both source interfaces, or because only ABI changed, but API didn’t (for instance, when 32-bit parameters are replaced with 64-bit ones, you break ABI, but not API), or because there is some source-level compatibility interface. Or, to put it very very bluntly, you only need to rebuild the packages for libbar.so.2 to be usedreminds you of something?

Package diagram of the second abstract example

If you were to run revdep-rebuild (or any other similar approach), it would only rebuild libfoo.so.1 to link to the new dependency; at that point, you’d have libfoo.so.1 updated to both access the new interface and import the new library, but the original component would still be accessing the interfaces from libbar.so.1!

At this point, it depends on how libbar is designed what would happen; if the library was designed with symbol-versioning in mind from the start, or if it avoid re-using the same names between functions with the same ABI, then the original component would fail to load (or execute) because of missing symbols – of course with due caveats – but if they didn’t ensure this, and actually two interfaces with different ABIs shared the same name, you’d get the same effect of a symbol collision: crash, corruption, or generally unpredictable results.

Keeping all I explained in this post in mind, I’m pretty sure you can see the advantage in gold not allowing underlinking to begin with, and why I’m so scared of the softer --as-needed approach. Even if we’re not going to allow gold to be used as default linker anytime soon, we definitely have a number of good reasons why we should at least start using it to run the tinderbox — after all, even if it’s just triggered by gold, fixing underlinking will make it nicer for good old ld users.

Tomorrow, I’ll probably post something about the technical issues regarding testing gold, since there are at least a few.