Gold readiness obstacle #2: base versioning

This is the second post in the series analysing the obstacles we face if we want to use gold as the system link editor at some point in the future. As I said for the previous one, please make your interest in the topic explicit: it is a draining exercise for me, given the huge lack of interest from many other developers.

I already noted in part #1 that I submitted a patch for gold and it wasn’t merged, which ticked me off a bit. In this post I’ll explain what that patch was about. This is particularly interesting to me because, while it affects a very commonly used package, I wouldn’t consider this problem as much of an “obstacle” as I do if I weren’t doing paid work to look into it.

I have already written yesterday, and a number of times before, about how you use ELF symbol versioning, so I won’t go back to the topic right now. What I’ll repeat here is that there are two main reasons to use symbol versioning: preventing symbol collisions, as is done by the Berkeley DB slots I wrote about yesterday, or preserving binary compatibility when making incompatible changes to functions’ ABIs while wanting to keep the same library ABI (and thus, soname).

For the former task, you can use the same blanket version information for all the symbols, as I noted, while for the latter you need a more surgical approach. What you usually do, when you stabilize the interface the first time around, is mark all the functions with the same version string. When one of those functions needs to be replaced, you then use source-level symbol versioning to provide a new “default” version of the symbol, together with an explicitly-versioned copy of the symbol that abides by the previously-used ABI. For more details about this, you can see the Binutils documentation, which shows the example I’m going to pick up here.

     __asm__(".symver original_foo,foo@");
     __asm__(".symver old_foo,foo@VERS_1.1");
     __asm__(".symver old_foo1,foo@VERS_1.2");
     __asm__(".symver new_foo,foo@@VERS_2.0");

The code above is taken verbatim from the latest version of the GNU ld (bfd) documentation. What it translates to is this:

  • the original, replaced/deprecated interface of the foo() function is implemented with the (hidden) symbol original_foo;
  • two further, replaced/deprecated versions of foo() are implemented as old_foo and old_foo1;
  • finally, new_foo implements the most current version of foo().

How does this work in practice? Well, first of all the headers should only declare the newest interface of foo() (that is, new_foo) so that new programs can only use that. When linking a new binary, the link editor will know to satisfy foo() references lacking a version with that version, not because the version string is “higher” (the version string has no meaning for link editors and runtime loaders, it’s just a string), but because it is marked as the “default” version (see the double at symbol in the directive). The other interfaces don’t have to be in the headers, and they will be ignored by the link editor, as if they weren’t there. Software built against a previous version of the library, where the default version for foo() was VERS_1.2 or VERS_1.1, would still reference those versions; the runtime loader (ld.so) would then look those up, rather than VERS_2.0.

Lovely, isn’t it? You can improve your interface and solve age-old issues without having to break the ABI, with the sole “little” downside of increasing the size of the library itself… and relying on a feature only available, as far as I know, on GLIBC and maybe FreeBSD (you can achieve the same effect on Windows, but their approach is massively different; let’s ignore that for now). Before somebody says that you actually double the size of the code, I’d like to point out that most of the time the old function can be expressed as a call to the new function, with properly adapted parameters, unless you’re really changing the function to something entirely different.
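To make that last point concrete, here is a minimal sketch of an old entry point written as a thin wrapper around the new one. The names old_foo and new_foo come from the Binutils example above; the flags parameter, the function bodies and the EXAMPLE_USE_SYMVER macro are hypothetical, made up only for illustration. The directives are guarded so that the file still builds as plain C; enabling them assumes a GNU toolchain, a shared-library build, and a version script that defines the VERS_1.1 and VERS_2.0 nodes.

```c
/* Current interface: the only one the public header would declare.
 * The flags parameter is a hypothetical ABI change for this sketch. */
int new_foo(int value, unsigned flags)
{
    /* Bit 0 asks for the negated value; everything else is ignored. */
    return (flags & 1u) ? -value : value;
}

/* Previous interface, without the flags argument. It is kept only for
 * binaries linked against the older library, and simply calls new_foo
 * with the parameters adapted, so very little code is duplicated. */
int old_foo(int value)
{
    return new_foo(value, 0u);
}

#ifdef EXAMPLE_USE_SYMVER /* hypothetical macro, not part of any real build */
/* Bind both implementations to the exported name foo. This only makes
 * sense when linking a shared library with a version script that
 * defines the VERS_1.1 and VERS_2.0 nodes. */
__asm__(".symver old_foo,foo@VERS_1.1");
__asm__(".symver new_foo,foo@@VERS_2.0");
#endif
```

With the macro left undefined, old_foo(5) and new_foo(5, 0) return the same value, which is the whole point: the compatibility symbol costs one small wrapper rather than a second copy of the implementation.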

For those wondering, using this approach with C++ is very complex and I’d probably say impossible: the ABI for C++ libraries includes the vtable for classes; when adding a new function to a class you change the vtable, increasing its size, and causing the ABI to change. It is for this reason that Trolltech used D-pointers in Qt for a long time, and why KDE had many problems introducing new features and fixing old bugs within a major release cycle.

Now let’s go back to our story of gold, fuse, and symbols.

The fuse library is designed to stay as binary compatible as possible with its predecessors, at least when built for GLIBC (it has special rules to not version interfaces when building for uClibc, for instance). This is because it is designed to allow for proprietary filesystem providers; for instance, the Mac version is used by Parallels to provide their shared folders support. Unfortunately it seems like this wasn’t a requirement in their original implementation, which was built without version information for symbols. This happens quite often, actually.

The Binutils example code above fortunately shows exactly how to deal with that: you declare a symbol with no version information. This is called the “base version”, and can only be referenced as the sole version in a linker script, or by omitting the version specifier in a .symver directive. This works with the GNU assembler (as) and with the BFD link editor, but when creating a library with base-versioned symbols with gold, you get an error:

libtool: link: i686-pc-linux-gnu-gcc -shared  .libs/fuse.o .libs/fuse_kern_chan.o .libs/fuse_loop.o .libs/fuse_loop_mt.o .libs/fuse_lowlevel.o .libs/fuse_mt.o .libs/fuse_opt.o .libs/fuse_session.o .libs/fuse_signals.o .libs/cuse_lowlevel.o .libs/helper.o .libs/subdir.o .libs/iconv.o .libs/mount.o .libs/mount_util.o   -lrt -ldl -Wl,--as-needed  -pthread -Wl,--version-script -Wl,./fuse_versionscript -Wl,-O1 -Wl,--hash-style=gnu   -pthread -Wl,-soname -Wl,libfuse.so.2 -o .libs/libfuse.so.2.8.5
/usr/lib/gcc/i686-pc-linux-gnu/4.6.0/../../../../i686-pc-linux-gnu/bin/ld: error: symbol __fuse_exited has undefined version 

It might not be obvious at first glance, but this message simply means that gold does not support linking objects containing base-versioned symbols. Is it just a missing feature? Not really. I mean, the feature itself is missing, and it is indeed simple to implement, to the point that I have implemented it; you can find the patch in Sourceware bug #12261, which is still pending.

The problem here is that even though GNU bfd/ld implements that feature, it is a pointless feature to implement, right now. The problem lies not in the link editor but in the runtime loader (ld.so). As you can see from the testcases provided by Ian in the bug linked above, GLIBC does not do what it is expected to.

What you expect is that, when the loader finds an undefined (requested) symbol without attached version information, it would look for a symbol with the corresponding name in the base version (no version attached to the definition), and failing that it would look for the one in the default version. What actually happens is that the loader simply picks the first symbol it finds with the same name, without caring about the version if it wasn’t specified in the requesting object. It is just sheer luck if it finds the one that was intended to be found.
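The difference between the two behaviours can be sketched with a toy model. This is not GLIBC code and does not reflect how ld.so really stores or hashes symbols; the struct, the function names and the order of the table are all assumptions made for illustration, reusing the symbol names from the Binutils example.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of one definition exported by a library. */
struct sym_def {
    const char *name;     /* exported symbol name */
    const char *version;  /* NULL models the base version */
    int is_default;       /* nonzero models the "@@" default version */
    const char *impl;     /* implementing function, for display only */
};

/* What you would expect for a reference with no version attached:
 * prefer the base version, and fall back to the default version. */
const struct sym_def *lookup_expected(const struct sym_def *tab, size_t n,
                                      const char *name)
{
    const struct sym_def *fallback = NULL;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tab[i].name, name) != 0)
            continue;
        if (tab[i].version == NULL)
            return &tab[i];
        if (tab[i].is_default && fallback == NULL)
            fallback = &tab[i];
    }
    return fallback;
}

/* The behaviour described above: the first definition with a matching
 * name wins, whatever its version is. */
const struct sym_def *lookup_observed(const struct sym_def *tab, size_t n,
                                      const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(tab[i].name, name) == 0)
            return &tab[i];
    return NULL;
}

/* Hypothetical table order; in a real library the order is not
 * something the developer controls directly. */
const struct sym_def example_table[] = {
    { "foo", "VERS_1.1", 0, "old_foo" },
    { "foo", NULL,       0, "original_foo" },
    { "foo", "VERS_2.0", 1, "new_foo" },
};
```

With this particular table, lookup_expected() returns the original_foo entry while lookup_observed() returns old_foo, simply because it comes first. Reorder the table and the second result changes, which is the “sheer luck” part.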

What’s the moral? Well, we have one advertised feature that never worked but that a few projects, such as fuse, wanted to rely upon… I don’t disagree with Ian that this should be fixed in GLIBC first, and that for now gold is just exposing code that doesn’t work. Unfortunately Ian’s requests about the feature went unanswered, and since Drepper just dumps the list of bug numbers without descriptions in the NEWS file, I can’t tell whether it was addressed in the new 2.14 version. This means we still have no clue whether this is a functionality that will ever be useful or not. I’ll have to ask again whether the fuse project would agree to just drop those symbols for now, since they cannot be useful with current glibc versions.

Again, expressing your interest in the topic helps me judge how much weight to put on it outside of my dayjob. Thanks in advance.

10 thoughts on “Gold readiness obstacle #2: base versioning”

  1. A heads-up: GCC LTO has a bug in which it can compile the function and the symver asm statement in separate threads, resulting in wrong symbols being exported (apparently it does not see into asm statements to see the dependency). http://gcc.gnu.org/bugzilla… Mozilla folks trying to build Firefox with LTO (with clang and with GCC) have encountered this: http://blog.mozilla.com/res

  2. Thanks for the heads-up on the LTO bug; I haven’t even considered looking into LTO yet, but it’s good to know beforehand!

  3. Well, I’m interested in almost everything you describe in detail. Very informative to learn about problems and not only the how-it-should-be from the manuals! Concerning gold for Gentoo: I don’t know if it will bring significant improvements for the system. If yes, great; if not, then I would say that your precious time is already occupied with other important topics in Gentoo nobody else wants to take care of.

  4. I’ve tried to use gold a few times in the past, but it has shown itself too buggy each time :-(. I’d really love a linker that cuts down on the insane C++ link times; unfortunately, last time I tried it wasn’t even really faster… (I think boost and KDE was what I tested; with a 4-core CPU there were some parts where linking took more time than everything else together). Though admittedly my current solution is to just avoid compiling C++ programs.

  5. I’m always interested in all the weird stuff you work on, linking issues first! I’ve had a couple of head-banging problems working on grisbi (it has a brain-dead plugin system), but the best solution I could come up with was just to remove the plugin system. That solved linking issues on all 3 platforms. But that’s something I really didn’t like doing. Really, anything that helps spread the word on how plain C works is worth five stars in my book. Keep it up, I definitely learn something new every time I read your posts. Cheers :)

  6. For the record, this is still unfixed in glibc-2.14, which means there is little chance that gold will gain the feature soon.
