Gold readiness obstacle #1: Berkeley DB

I have already said that I’m working on getting gold tested, a couple of years after its introduction. The state of this linker is a bit difficult to assess: Google is set on making heavy use of it, and it is supposedly faster at linking Chrome, among others (even though it uses an inordinate amount of RAM to achieve that; it’s the usual approach of Google software, I guess: you can always throw in more RAM, you can’t always throw in more time!). On the other hand, I can tell for sure that no distribution has tried to build its whole package set with it yet, simply by looking at the kind of packages that fail to build for one reason or another.

I’ll leave the failures that matter to other, non-Gentoo-based distributions for the next few posts; today, the target is a failure that is limited to Gentoo systems, because it involves a workaround we implemented a long time ago, which is now going to bite our ass until we either solve it once and for all, or find an alternative workaround. But let’s start with the original problem.

The Berkeley DB library (berkdb) – which is now maintained by Oracle, for the record – is a very common library used for storing data in plain files. There are a number of different “generations” of its API, one of which is provided by the FreeBSD C library as well (db1); and the very generic API structure (dbm) is also implemented by the GNU project’s gdbm library. BerkDB was much more prominent in the day-to-day life of any Linux user a couple of years ago; nowadays, the storage format and library of preference is SQLite (to the point that even BerkDB itself provides an SQLite-based interface to its own storage format). But even so, it is very difficult to do without BerkDB: LibreOffice, Postfix, Evolution, Squid, Perl, … they all require BerkDB for this or that feature.

Unfortunately, the most recent generation of APIs for Berkeley DB still varies widely, and the on-disk format is not always compatible between minor versions (so from 4.4 to 4.5, and so on). For these reasons, Gentoo has been allowing side-by-side installation of multiple Berkeley DB versions at the same time, so-called slotting. By allowing non-rebuilt software to still use the old version (and the old files), as well as allowing access to the utilities of the previous format, you usually make sysadmins’ work easier. Unfortunately, since the functions present in more than one minor version have the exact same name, Gentoo users and developers ended up hitting ELF symbol collisions when programs and libraries in the same process linked against different Berkeley DB versions.

Turns out that glibc is actually designed with this in mind, and includes symbol versioning to solve the issue: a particular string is assigned to each symbol, so that you can have multiple libraries providing ABI-incompatible symbols with the same name – usually there is a need for the API to be at least partially compatible, but I don’t want to go into too many details now – without clashes and collisions. To provide versioning you have three main options: inline in the C sources, through the use of a version script, or, with GNU ld/bfd, through the --default-symver option, which sets the version string of each symbol to the soname of the library it is exported from. This was a godsend for Gentoo at the time because it allowed avoiding collisions without having to edit anything in the build system: you just had to add the flag to the linker’s flags in the ebuild and voilà.
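To make the difference concrete, here is a toy model in C of what a version string buys you. It is only a sketch and not how the glibc loader is actually implemented: the table `loaded`, the functions `lookup` and `lookup_versioned`, the two stand-in implementations and the sonames `libdb-4.4.so` and `libdb-4.5.so` are all assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*db_fn)(void);

/* Toy model of one exported symbol: a name plus the version string
 * attached to it (with --default-symver, the library's soname). */
struct export {
    const char *name;
    const char *version;
    db_fn fn;
};

/* Two hypothetical, ABI-incompatible implementations of the same name. */
static int db_create_44(void) { return 44; }
static int db_create_45(void) { return 45; }

/* Both libraries end up loaded in the same process. */
static const struct export loaded[] = {
    { "db_create", "libdb-4.4.so", db_create_44 },
    { "db_create", "libdb-4.5.so", db_create_45 },
};

#define N_LOADED (sizeof loaded / sizeof loaded[0])

/* Unversioned lookup: the first definition of the name wins, so every
 * caller lands in the 4.4 code no matter what it was built against. */
db_fn lookup(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < N_LOADED; i++)
        if (strcmp(loaded[i].name, name) == 0)
            return loaded[i].fn;
    return NULL;
}

/* Versioned lookup: both the name and the version string must match. */
db_fn lookup_versioned(const char *name, const char *version)
{
    assert(name != NULL && version != NULL);
    for (size_t i = 0; i < N_LOADED; i++)
        if (strcmp(loaded[i].name, name) == 0 &&
            strcmp(loaded[i].version, version) == 0)
            return loaded[i].fn;
    return NULL;
}
```

In this model, `lookup("db_create")` always hands back the 4.4 implementation, which is the collision described above, while `lookup_versioned("db_create", "libdb-4.5.so")` reaches the 4.5 one.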

If you’re now wondering whether GNU gold supports this option, you’re on the right track. The answer is “no, not right now”: at the moment it chokes on the option, which results in Berkeley DB’s build reporting that the compiler is unable to create executables. Whether it will support said option in the future remains to be seen. The last time I tried to implement a bfd/ld feature in gold (namely support for emitting explicitly unversioned symbols, which is needed to build FUSE), the results were disappointing, although I understand there is a problem with implementing a build feature that cannot work at runtime right now.

So unless gold gains the same option, we need to find another solution or ignore the existence of gold for a while longer. An alternative that I have been told about already would be to replace the current --default-symver option with a --version-script option pointing to an explicit version script that sets the version. Unfortunately, this is easier said than done, at least for the versions we have in tree right now. A similar blanket-version approach would cause no issue if it were introduced with a new slot of the package, as the version would have to be different either way, but it wouldn’t keep binary compatibility with the older versions.

The problem is that BerkDB doesn’t install a single library, but a number of them; and since --default-symver uses each library’s soname when creating the versions for its symbols, you’d need a different version for each library. Implementing this same method through standard version scripts would be a world of pain, and probably not worth the price. For now, I decided to simply mask BerkDB on the container that is testing gold, forcing as many packages as possible to use gdbm instead, which does not have the same problem.
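To give an idea of the bookkeeping involved, here is a minimal sketch in C that writes out the kind of blanket version script each library would need, with the soname as the version node. The function `format_version_script` is hypothetical, the soname used in the usage note is an assumed example, and whether a given linker accepts a soname verbatim as a node name is an assumption to check rather than a given.

```c
#include <assert.h>
#include <stdio.h>

/* Format a blanket version script for one library into buf, tagging
 * every exported symbol with the given soname as its version.
 * Returns the number of characters written, or -1 if buf is too small. */
int format_version_script(char *buf, size_t size, const char *soname)
{
    assert(buf != NULL && soname != NULL);
    int n = snprintf(buf, size, "%s {\n\tglobal:\n\t\t*;\n};\n", soname);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Calling it with an assumed soname such as `libdb-4.5.so` produces a three-line node with that name and a `global: *;` entry. The script itself is trivial; the pain is that every library in every slot needs its own, with exactly the string the old --default-symver builds used, or binary compatibility is gone.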

I’m glad we decided not to go the same route with expat: even though the immediate fallout at the time was out of scale (back then it was a dream even to think about using --as-needed; .la files are a joke in comparison!), it saved us the headache of reaching the point where we have to decide whether to forgo modern tools or break binary compatibility again.

At any rate, this is just the tip of the iceberg when it comes to gold and real-world software. I’ll write more about this in the next few days as I find time. For now, I wouldn’t mind if you noted your interest in testing gold… comments, flattrs (on the blog, the post or, even better, the tinderbox, since that’s what is doing the work!) and other tokens are definitely appreciated. At least it would tell me I’m not wrong in insisting on spending time reporting and solving the gold bugs.

4 thoughts on “Gold readiness obstacle #1: Berkeley DB”

  1. Using different compilers often reveals errors in the code and makes for more portable projects. I believe the same applies to linkers, so I would approve of having another linker available for testing! Keep up the good work.

  2. I can’t say I’m terribly missing gold right now, but overall I’d prefer using something faster, as large C++ projects are extremely annoying to build, time-wise.

  3. I’m wondering if elfutils linker could be extended to be useful (at least amd64) so we could have a comparison

  4. Gentoo has been allowing side-by-side installation of multiple Berkeley DB versions at the same time, so-called slotting.
