LibreSSL, OpenSSL, collisions and problems

Some time ago, on the gentoo-dev mailing list, there has been an interesting thread on the state of LibreSSL in Gentoo. In particular I repeated some of my previous concerns about ABI and API compatibility, especially when trying to keep both libraries on the same system.

While I hope that the problems I pointed out are well clear to the LibreSSL developers, I thought reiterating them again clearly in a blog post would give them a wider reach and thus hope that they can be addressed. Please feel free to reshare in response to people hand waving the idea that LibreSSL can be either a drop-in, or stand-aside replacement for OpenSSL.

Last year, when I first blogged about LibreSSL, I had to write a further clarification as my post was used to imply that you could just replace the OpenSSL binaries with LibreSSL and be done with it. This is not the case and I won’t even go back there. What I’m concerned about this time is whether you can install the two in the same system, and somehow decide which one you want to use on a per-package basis.

Let’s start with the first question: why would you want to do that? Everybody at this point knows that LibreSSL was forked from the OpenSSL code and started removing code that has been needed unnecessary or even dangerous – a very positive thing, given the amount of compatibility kludges around OpenSSL! – and as such it was a subset of the same interface as its parent, thus there would be no reason to wanting the two libraries on the same system.

But then again, LibreSSL never meant to be considered a drop-in replacement, so they haven’t cared as much for the evolution of OpenSSL, and just proceeded in their own direction; said direction included building a new library, libtls, that implements higher-level abstractions of TLS protocol. This vaguely matches the way NSS (the Netscape-now-Mozilla TLS library) is designed, and I think it makes sense: it reduces the amount of repetition that needs to be coded in multiple parts of the software stack to implement HTTPS for instance, reducing the chance of one of them making a stupid mistake.

Unfortunately, this library was originally tied firmly to LibreSSL and there was no way for it to be usable with OpenSSL — I think this has changed recently as a “portable” build of libtls should be available. Ironically, this wouldn’t have been a problem at all if it wasn’t that LibreSSL is not a superset of OpenSSL, as this is where the core of the issue lies.

By far, this is not the first time a problem like this happens in Open Source software communities: different people will want to implement the same concept in different ways. I like to describe this as software biodiversity and I find it generally a good thing. Having more people looking at the same concept from different angles can improve things substantially, especially in regard to finding safe implementations of network protocols.

But there is a problem when you apply parallel evolution to software: if you fork a project and then evolve it on your own agenda, but keep the same library names and a mostly compatible (thus conflicting) API/ABI, you’re going to make people suffer, whether they are developers, consumers, packagers or users.

LibreSSL, libav, Ghostscript, … there are plenty of examples. Since the features of the projects, their API and most definitely their ABIs are not the same, when you’re building a project on top of any of these (or their originators), you’ll end up at some point making a conscious decision on which one you want to rely on. Sometimes you can do that based only on your technical needs, but in most cases you end up with a compromise based on technical needs, licensing concerns and availability in the ecosystem.

These projects didn’t change the name of their libraries, that way they can be used as drop-rebuild replacement for consumers that keep to the greatest common divisor of the interface, but that also means you can’t easily install two of them in the same system. And since most distributions, with the exception of Gentoo, would not really provide the users with choice of multiple implementations, you end up with either a fractured ecosystem, or one that is very much non-diverse.

So if all distributions decide to standardize on one implementation, that’s what the developers will write for. And this is why OpenSSL will likely to stay the standard for a long while still. Of course in this case it’s not as bad as the situation with libav/ffmpeg, as the base featureset is going to be more or less the same, and the APIs that have been dropped up to now, such as the entropy-gathering daemon interface, have been considered A Bad Idea™ for a while, so there are not going to be OpenSSL-only source projects in the future.

What becomes an issue here is that software is built against OpenSSL right now, and you can’t really change this easily. I’ve been told before that this is not true, because OpenBSD switched, but there is a huge difference between all of the BSDs and your usual Linux distributions: the former have much more control on what they have to support.

In particular, the whole base system is released in a single scoop, and it generally includes all the binary packages you can possibly install. Very few third party software providers release binary packages for OpenBSD, and not many more do for NetBSD or FreeBSD. So as long as you either use the binaries provided by those projects or those built by you on the same system, switching the provider is fairly easy.

When you have to support third-party binaries, then you have a big problem, because a given binary may be built against one provider, but depend on a library that depends on the other. So unless you have full control of your system, with no binary packages at all, you’re going to have to provide the most likely provider — which right now is OpenSSL, for good or bad.

Gentoo Linux is, once again, in a more favourable position than many others. As long as you have a full source stack, you can easily choose your provider without considering its popularity. I have built similar stacks before, and my servers deploy stacks similarly, although I have not tried using LibreSSL for any of them yet. But on the desktop it might be trickier, especially if you want to do things like playing Steam games.

But here’s the harsh reality, even if you were to install the libraries in different directories, and you would provide a USE flag to choose between the two, it is not going to be easy to apply the right constraints between final executables and libraries all the way into the tree.

I’m not sure if I have an answer to balance the ability to just make the old software use the new library and the side-installation. I’m scared that a “solution” that can be found to solve this problem is bundling and you can probably figure out that doing so for software like OpenSSL or LibreSSL is a terrible idea, given how fast you should update in response to a security vulnerability.

LibreSSL and the bundled libs hurdle

It was over five years ago that I ranted about the bundling of libraries and what that means for vulnerabilities found in those libraries. The world has, since, not really listened. RubyGems still keep insisting that “vendoring” gems is good, Go explicitly didn’t implement a concept of shared libraries, and let’s not even talk about Docker or OSv and their absolutism in static linking and bundling of the whole operating system, essentially.

It should have been obvious how this can be a problem when Heartbleed came out, bundled copies of OpenSSL would have needed separate updates from the system libraries. I guess lots of enterprise users of such software were saved only by the fact that most of the bundlers ended up using older versions of OpenSSL where heartbeat was not implemented at all.

Now that we’re talking about replacing the OpenSSL libraries with those coming from a different project, we’re going to be hit by both edges of the proprietary software sword: bundling and ABI compatibility, which will make things really interesting for everybody.

If you’ve seen my (short, incomplete) list of RAND_egd() users which I posted yesterday. While the tinderbox from which I took this is out of date and needs cleaning, it is a good starting point to figure out the trends, and as somebody already picked up, the bundling is actually strong.

Software that bundled Curl, or even Python, but then relied on the system copy of OpenSSL, will now be looking for RAND_egd() and thus fail. You could be unbundling these libraries, and then use a proper, patched copy of Curl from the system, where the usage of RAND_egd() has been removed, but then again, this is what I’ve been advocating forever or so. With caveats, in the case of Curl.

But now if the use of RAND_egd() is actually coming from the proprietary bits themselves, you’re stuck and you can’t use the new library: you either need to keep around an old copy of OpenSSL (which may be buggy and expose even more vulnerability) or you need a shim library that only provides ABI compatibility against the new LibreSSL-provided library — I’m still not sure why this particular trick is not employed more often, when the changes to a library are only at the interface level but still implements the same functionality.

Now the good news is that from the list that I produced, at least the egd functions never seemed to be popular among proprietary developers. This is expected as egd was vastly a way to implement the /dev/random semantics for non-Linux systems, while the proprietary software that we deal with, at least in the Linux world, can just accept the existence of the devices themselves. So the only problems have to do with unbundling (or replacing) Curl and possibly the Python SSL module. Doing so is not obvious though, as I see from the list that there are at least two copies of which is the older ABI for Curl — although admittedly one is from the scratchbox SDKs which could just as easily be replaced with something less hacky.

Anyway, my current task is to clean up the tinderbox so that it’s in a working state, after which I plan to do a full build of all the reverse dependencies on OpenSSL, it’s very possible that there are more entries that should be in the list, since it was built with USE=gnutls globally to test for GnuTLS 3.0 when it came out.

Autotools Mythbuster: what’s new in Automake 1.14

So the new version of Automake is out, and is most likely going to be the last release of the first major version. The next version of Automake is going to be 2.0, not to be confused with Automake NG which is a parallel project, still maintained by Stefano, but with a slightly different target.

After the various issues in 1.13 series Stefano decided to take a much more conservative approach for both 1.14 and the upcoming 2.0. While a bunch of features are getting deprecated with these two versions, they will not be dropped at least until version 3.0 I suppose. This mean that there should be all the time for developers to update their Autotools before they starting failing. Users of -Werror for Automake will of course still see issues, but I’ve already written about it so I’m not going back on the topic.

There are no big deals with the new release, by the way, as its topic seems to be mostly “get things straight”. For instance, the C compilation handling has been streamlined, with anticipation of further streamlining on Automake 2.0. In particular, the next major release will get rid of the subdir-objects option… by force-enabling it, which also means that the connected, optional AM_PROG_CC_C_O is now bolted on the basic AC_PROG_CC. What does this mean? Mostly that there is one fewer line to add to your when you use subdir-objects, and if you don’t use subdir-objects today, you should. It also means that the compile script is now needed by all automake projects.

The only one new feature that I think is worth the release, is better support for including files within — this allows the creation of almost independent “module” files so that your build rules still live with the source files, but the final result is non-recursive. The changes make Karel’s way much more practical, to the point I’ve actually started writing documentation for it in Autotools Mythbuster.

# src/

bin_PROGRAMS += myprog
man_MANS += %D%/myprog.8
myprog_SOURCES = %D%/myprog.c 

The idea is that instead of knowing exactly what your subdirectory is that contains the sources, you can simply use %D% (or reldir) and then you can move said directory around. It makes it possible to properly handle a bundled-but-optout-capable library so that you don’t have to fight too much with the build system. I think that’ll actuall be the next post in the Autotools Mythbuster series, how to create a library project with a clear bundling path and, at the same time, the ability to use the system copy of the library itself.

Anyway, let’s all thank Stefano for a likely uneventful automake release. Autotools Mythbuster is being updated, for now you can find up to date forward porting notes but before coming back from vacation I’ll most likely update a few more sections.

The status of Blender

So after my recent complaints on the way Blender is packaged upstream, it’s a probably a good idea to see what the current status on the topic is.

First of all, upstream has been at least discussing how to deal with these kind of complains: while some commenters complained about leaving Gentoo because of our decision of not bumping to 2.65a (yet), with the idea that it’d be much easier to have Blender on Debian, Arch Linux or whatever else, it turns out that Gentoo was not alone having trouble with Blender, and indeed Matteo asked our help with patching at least for libav-9 support.

For what concerns Gentoo, while I keep getting bugs requesting an update to version 2.65a, I’ve been basically closing them every time, as none seem to care about getting it right — and I really don’t want to get a crappy ebuild in, as I’d be the one taking the pieces anyway. Mostly, what we need is a version of Blender ebuild that does use CMake, but also does not use the bundled libraries for all the code we have in the system already. The main issue here is Bullet, which requires a version bump, possibly with a pre-release snapshot of 2.82, due to the patches that are applied on top of the copy that comes with Blender.

Today I actually had to shoot down a request for a live ebuild; due to the quantity of patches that we end up having to apply, we’re not going to get a live ebuild, full stop.

Unfortunately, this also left us for a long time to deal with the old, buggy, and bitrotting version 2.49b which was marked stable. That stopped today as, with the agreement of at least some of the arch team members, I masked Blender altogether and got rid of version 2.49b-r2 and its patches from the tree. If you do want to use Blender now, you’ve got to unmask it. While this could be considered like dropping the ball on it, it’s just making it explicit that we haven’t been supporting version 2.49b for a long time already.

No, don’t ask for it to be re-added slotted. Upstream is not maintaining a 2.4x branch, so we won’t be doing that either.

So right now if you want to help, start by preparing (upstreamable, or even better, upstreamed) patches that allows to select with CMake the use of system libraries for most of the bundled ones. Another thing that would be very useful at this point would be a separate ebuild for libmv, even with the bundled libraries to start with, so that would at least stop the multi-level madness and we would end up with good old single-level bundled libraries.

Bloody upstream

Please note, this post is likely to be interpreted as a rant. From one point of view it is. It’s mostly a general rant geared toward those upstreams that is generally impossible to talk into helping us distribution out.

The first one is the IEEE — you might remember that back in April I was troubled by their refusal to apply a permissive license to their OUI database, and actually denied that they allow redistribution of said database. A few weeks ago I had to bite the bullet and added both the OUI and the IAB databases to the hwids package that we’re using in Gentoo, so that we can use them on different software packages, including bluez and udev.

While I’m trying not to bump the package as often as before, simply because the two new files increase the size of the package four times. But I am updating the repository more often so that I can see if something changes and could be useful to bump it sooner. And what I noticed is that the two files are managed very badly by IEEE.

At some point, while adding one entry to the OUI list, the charset of the file was screwed up, replacing the UTF-8 with mojibake then somebody fixed it, then somebody decided that using UTF-8 was too good for them and decided to go back to pure ASCII, doing some near-equivalent replacement – although whoever changed ß to b probably got to learn some German – then somebody decided to fix it up again … then again somebody broke it while adding an entry, another guy tried to go back to ASCII, and someone else fixed it up again.

How much noise is this in the history of the file? Lots. I really wish they actually wrote a decent app to manage those databases so they don’t break them every other time they have to add something to the list.

The other upstream is Blender. You probably remember I was complaining about their multi-level bundling ad the fact that there are missing license information for at least one of the bundled libraries. Well, we’re now having another problem. I was working on the bump to 2.65, but now either I return to bundle Bullet, or I have to patch it because they added new APIs to the library.

So right now we have in tree a package that:

  • we need to patch to be able to build against a modern version of libav;
  • we need to patch to make sure it doesn’t crash;
  • we need to patch to make it use over half a dozen system libraries that it otherwise bundles;
  • we need to patch to avoid it becoming a security nightmare for users by auto-executing scripts in downloaded files;
  • bundles libraries with unclear licensing terms;
  • has two build systems, with different features available, neither of which is really suitable for a distribution.

Honestly, I reached a point where I’m considering p.masking the package for removal and deal with those consequences rather than dealing with Blender. I know it has quite a few users especially in Gentoo, but if upstream is unwilling to work with us to make it fit properly, I’d like users to speak to them to see that they get their act together at this point. Debian is also suffering from issues related to the libav updates and stuff like that. Without even going into the license issues.

So if you have contacts with Blender developers, please ask them to actually start reducing the amount of bundled libraries, decide on which of the two build systems we should be using, and possibly start to clear up the licensing terms of the package as a whole (including the libraries!). Unfortunately, I’d expect them not to listen — until maybe distributions, as a whole, decide to drop Blender because of the same reasons, to make them question the sanity of their development model.

The myth of the perfectionist QA

There is a bad trend going on that seems to purport Gentoo’s QA (which for the past couple of years has meant mostly me alone), as a perfectionist hindering getting stuff done — and I call all of this utter bull feces. It’s probably telling that the people who seem to then expect that every single issue has an explicit written rule, with no leeway for different situations.

So let me give you some insights so that you can actually get a realistic clue of what’s going on. Rich is complaining that I made it a task of mine to make sure that the software in Portage don’t use bundled libraries; for some reasons, he seems to assume that I have no clue on how cumbersome it is to deal with said bundled libraries, and he couldn’t be more wrong. You know what my first bundled libraries project? xine. And it took me quite a long time to get rid of (almost all) the bundled libraries; even more so because many of said libraries were more or less heavily modified internally. Was it worth it? Totally. It actually made the xine package much more reliable to security issues (and there had been quite a few), as well as solving a number of other issues, including the most infamous problem with the symbols’ collisions between the system’s libfaad (brought in by ffmpeg) and the internal copy of it.

So, I know it’s a lot of work, and I know it’s not always a task for the faint of heart, and most of the times there is no immediate improvement by fixing those. Why am I insisting on it as a point of policy? Because we have quite a few examples with software being bundled for which vulnerabilities were found and could be leveraged by an attacker, especially for what concern zlib and software that either downloads or receives compressed content over the network.

So from what Rich wrote, we’re trying to hinder getting stuff in tree by refusing it if it bundles libraries. That’s so much of a lie that it’s not even funny. We have had packages entering the tree with bundled libraries, and I’m pretty sure there still is a ton of it. Heck, do you remember what I wrote about Blender just last month? Blender has a number of bundled libraries, and yet it’s in tree, I maintain it, and it’s going stable from time to time.

What is important in that scene? That I’m trying to get rid of said libraries. The same applies to Chromium and most other packages that have a ton of bundled libraries; most packages are responsible enough, and generally know enough about the package that they can work on getting rid of said libraries, if that’s feasible at all — in the case of Chromium it’s an extremely difficult task I’m sure, mostly because upstream does not care the least, and that was clear at last VDD when we were discussing the patches applied over ffmpeg/libav.

So let’s get into the specific details for the complains, as Rich’s “just an example” is not an example of what happens at all. There is a server software written by Logitech for their SqueezeBox devices, that is now called logitechmediaserver-bin in Gentoo. Said package has been known to be a bundled for years — Logitech is bundling a large set of CPAN modules with it, as it seems like the bulk of the code is written in Perl or something like that. I’ve known it at quite a few versions to bundle a CPAN module that, in turn, bundled an old copy of zlib, one that is vulnerable to a few different issues. Right now, it’s not only bundling a bunch of modules, but I found out that it installs a number of files for architectures that are completely incompatible with your system (i.e. a PowerPC binary on amd64). This bothered me quite a lot more. Why? Because it means that the (proxy) maintainer is not discriminating at all, and just installing whatever the upstream package comes with. Guess what? That’s not a good way to package the software.

When the proxy maintainer is not doing stuff that gets even nearby the quality level of most of the rest of the tree, and the Gentoo developer who should be proxying him ignoring the problems, things are messy enough. But it gets worse when you get a person known for bad patches sold as fixes from over three years ago, who still expect to be able to call the shots.

If you see bug #251494 he’s actually suggesting marking the package stable because it’s not going to run with the current unstable Perl which is a completely backward reason (if it can’t work with the latest Perl, it means that it hasn’t been tested in testing for a long time), and is going to create more trouble later on (the moment when the new Perl goes stable, we have a known broken, unfixable package in stable). But he’s Daniel Robbins, and some people expect that that’s enough for him to be right — if that was the case, I suppose he’d still be the Gentoo Linux lead, and not just of a wannabe project.

Anyway, here’s the deal: QA policies are more like a guideline, in general. On the other hand, there is no reason why we should be forced to mark things stable if they do not follow the common QA policies — especially for proprietary software, and software requiring particular hardware, marking things stable is not that great an idea, as they tend to be orphaned quite easily if a single developer retires. We already have quite enough packages stable in x86, ppc and sparc that are not really able to run, because they were always broken, but has been stabled in ancient times. Sometimes, we even have packages keyworded that cannot be used on a given arch, but they do build, and the failure would only happen at runtime, so they have been, again in ancient times, keyworded.

Maybe this is what people who want to follow Daniel expect: coming back to the “it builds, ship it!” mentality that made Gentoo a joke in the eyes of almost everybody at the time. For sure, it’s not what I want, and I don’t think it’s what the users as a whole group want or need.

Multi-level bundling, with a twist

I spent half my Saturday afternoon working on Blender, to get the new version (2.64a) in Portage. This is never an easy task but in this case it was a bit more tedious because thanks to the new release of libav (version 9) I had to make a few more modifications … making sure it would still work with the old libav (0.8).

FFmpeg support is not guaranteed — If you care you can submit a patch. I’m one of the libav developers thus that’s what I work, and test, with.

Funnily enough, while I was doing that work, a new bug for blender was reported in Gentoo, so I looked into it and found out that it was actually caused by one of the bundled dependencies — luckily, one that was already available as its own ebuild, so I just decided to get rid of it. The interesting part was that it wasn’t listed in the “still bundled libraries” list that the ebuild’s own diagnostic prints… since it was actually a bundled library of the bundled libmv!

So you reach the point where you get one package (Blender) bundling a library (libmv) bundling a bunch of libraries, multi-level.

Looking into it I found out that not only the dependency that was causing the bug was bundled (ldl) but there were at least two more that, I knew for sure, were available in Gentoo (glog and gflags). Which meant I could shave some more code out of the package, by adding a few more dependencies… which is always a good thing in my book (and I know that my book is not the same as many others’).

While looking for other libraries to unbundle, I found another one, mostly because its name (eltopo) was funny — it has a website and from there you can find the sources — neither are linked in the Blender package. When I looked at the sources, I was dismayed to see that there was no real build system but just an half-broken Makefile building two completely different PIC-enabled static archives, for debug and release. Not really something that distributions could get much interest in packaging.

So I set up at build my usual autotools-based build system (which no matter what people say it’s extremely fast, if you know how to do it), fix the package to build with gcc 4.7 correctly (how did it work for Blender? I assume they patched it somehow but they don’t write down what they do!), and .. uh where’s the license file?

Turns out that while the homepage says that the library is “public domain”, there is no license statement anywhere in the source code, making it in all effects the exact opposite: a proprietary software. I’ve opened an issue for it and hopefully upstream will fix that one up so I can send him my fixes and package it in Gentoo.

Interestingly enough, the libmv software that Blender packages, is much better in its way of bundling libraries. While they don’t seem to give you an easy way to disable the bundled copies (which might or might not be Blender’s build system fault), they make it clear where each library come from, and they have scripts to “re-bundle” said libraries. When they make changes, they also keep a log of them so that you can identify what changed and either ignore, patch or send it upstream. If all projects bundling stuff did it that way, it would be a much easier job to unbundle…

In the mean time, if you have some free time and feel like doing something to improve the bundled libraries situation in Gentoo Linux, or you care about Blender and you’d like to have a better Gentoo experience with it, we could use some ebuilds for ceres-solver and SSBA as well as fast-C (this last one has no buildsystem at all, unfortunately) all used by libmv, or maybe carve libredcode (for which I don’t even have an URL at hand), recastnavigation (which has no releases) which are instead used directly by Blender.

P.S.: don’t expect to see me around this Sunday, I’m actually going to see the Shuttle, and so I won’t be back till late, most likely, or at least I hope so. You’ll probably see a photo set on Monday on my Flickr page if you want to have a treat.

Reliable and unreliable detection of bundled libraries

Tim asked me for some help to identify bundled libraries, so I come back to a topic that I haven’t written about in quite a while, mostly because it seemed like a lost fight with too many upstream projects.

First of all let’s focus on what I’m talking about. With the term “bundled libraries” I’m referring to libraries whose sources are copied verbatim, or only slightly modified, within another project. This is usually done by projects with either a number of dependencies, or uncommon dependencies, under the impression it makes it easier for the users to build the project. I maintain that such notion is mostly obsolete as it’s not the place of most random users to actually build their own projects, leaving that job to distributions, which have the inverse problem with bundled libraries: they make packaging harder, and software much more vulnerable to security issues.

So how do you track down bundled libraries issues? Well one option is to check out each packages’ sources, but that’s not really applicable that easily, especially since the sources might be masked — I found more than one project taking a library originally in a number of source files, concatenating all of them together, and then building it in a single object file. The other options relate to analyse the result of the build, executables and libraries. This is what I’ve done with my scripts when I started looking at a different, but slightly related, problem (symbol collisions).

You can analyse the resulting binaries in many ways; the one I’ve been doing with the scripts I’m referring to above is a very shallow detection: it wasn’t designed for the task of looking for bundled libraries, it was designed to identify libraries and executables exporting the same symbols, that would cause trouble if they were ever loaded in the same address space. For those curious, this is because the dynamic loader needs to choose only one symbol for any name, so either the libraries or the executable would end up calling the wrong function or accessing the wrong data. While this can sound too theoretical, one such bug I reported back in 2008 was the root cause of a runtime crash of Evince in 2011.

At any rate this method is not optimal for the task of finding bundled libraries for two reasons: the first is that it only checks for identical symbol names, so it doesn’t take into account libraries that are bundled and have their method renamed – and yes I have seen that done – the second is that it only works with exported symbols, that are those that the dynamic loader can collide on. What is not included here are the so-called local symbols: static symbols, symbols declared with private ELF visibility, and those that are hidden by linker scripts at link time.

To inspect the binaries deeper, you need non-stripped copies; it doesn’t really require you to have them built with DWARF data in them (the debug data that is added when building with, for instance, -ggdb); it can work even just the complete symbol table kept in the .symtab section of final binaries (and usually stripped away). To get the list of all symbols present within a binary, be them data (constants and variables) or code (functions), you can simply use the nm --defined-only command (if you add --dynamic you end up with the same data I’m analysing with my script above, as it changes the table that it uses to look up symbols).

Unfortunately this still requires to find a way to match the symbols even with prefixed versions, and with little false positives; this is why I haven’t worked on a script to deal with this kind of data yet. Although while writing this I can think of a way to make it at least scan for a particular library against a list of executables, that is not really one of the best-performing options available. I guess the proper answer here would be to find a way to generate a regular expression for each library based on the list of symbols they include, and then grep for that over the symbols exported by the other binaries.

Note here: you can’t expect that all the symbols of a bundled library are present in the final binary; when linking against object files or static archives the linker applies logic akin to the --as-needed one and will not bring in object files if no symbol from that file is referenced by the code; if you use other techniques then you can even skip single functions and data symbols that are unused by your code. Bottom line is that even if the full sources of another library are bundled and built, the final executable might not have all of them on it.

If you don’t have access to a non-stripped binary, well then your only option is to run strings over the package and try to find matching strings for the library you’re looking for; if you’re lucky, the library provides version information you can still track down in raw strings. This unfortunately is the roughest option you have available and I’d not suggest doing it for anything you have sources for; it’s a last-resort for proprietary, pre-built software.

Finally, let me say a couple of words regarding identify the version of an exported version of a library, which I have found to be done way too many times. If this happens on an ET_DYN file, such as a shared object, you can easily dlopen() it and then call functions like zlibVersion() to get the actual version of the library embedded. This ends up being pretty important when trying to tell whether a package is bundling a vulnerable version of zlib or libpng, even though it is also not extremely reliable as it is.

A visible case against bundled libraries

I wrote a lot about bundled libraries and why they are bad, but I usually stick with speaking about Free Software (or Open Source Software — yeah they are two different sets!). This time, let me explain you how they are bad for proprietary, binary-provided software as well.

For a long-winded story, I finally decided to give a try to Dropbox after hearing so much good about it. Thanks to Fabiano who wrote it, I got a rough ebuild for the nautilus-dropbox extension, so I cleaned it up a bit and installed.

The first step of the set-up process is… downloading the “proprietary daemon”, which turns out to include a number of otherwise Free Software, including, but not limited to, a number of Python extensions (and of course, a _whole copy of the Python interpreter_… they don’t even as far as trying to hide it, as all the symbols are visible in the ELF file!), Zlib 1.2.3 and a couple of D-Bus related libraries.

Okay nasty, but let’s leave at that for now, proprietary software can easily be crap, and I say that as somebody who also works on proprietary software for jobs from time to time, and I don’t expect them to have too much intention to clean up their code for what Free Software developers would call quality. But on the other hand, I expect they’d be trying to make it possible to run their code on the widest range possible of systems.

So I ignored for now the fact that it installs a proprietary package without being allowed to package it properly in the distribution, and I went on to configure it. Only three other steps are involved in the setup process: logging in with your email address and password, choosing your subscription type (I went for “Free”… while I am/was considering getting a higher version, it’s try first and decide later!), and deciding where to put your Dropbox folder.

Call me old-styled but I like my GUI-important folders on the desktop, so I wanted to put it there alongside “Downloads” and “Documents”.

Whoops! Window disappear as soon as I click the button to choose the placement. Smells funny, let me open the console and see:

/home/flame/.dropbox-dist/dropbox: symbol lookup error: /usr/lib/ undefined symbol: dbus_g_proxy_set_default_timeout

A quick check tells me two things: the symbol is part of, which links to properly, and Dropbox has a local copy of that library, which overrides my system copy. Unfortunately, their copy is older, so it lacks that symbol. Hilarity ensures.

Speaking about which, please do remember that most libraries don’t change SONAME when they make backward-compatible ABI changes, but the same is not true for forward-compatibility! So you can use software linked against an older library with a new one, but not vice-versa. It’s a nasty thing to forget.

I decide to simply hack this around and remove their copy of the library and try again… this time it works, but I lack all possible icons. The reason? I’m using an SVG-based icon set; the SVG renderer uses libxml2; libxml2 uses Zlib… they ship with Zlib 1.2.3 but my libxml2 is looking for the versioning from 1.2.4. Thanks to the fact that it’s just a plugin, this time it doesn’t kill the process (yes this is one of the few good reasons to use plugins), and I just lose ability to see icons.

I’ve opened a ticket for this with Dropbox, but I don’t really want to get my hopes up. What I’d love to see them do (still keeping a pragmatic mind — I won’t expect them to opensource all their implementation since they have been keeping that explicitly proprietary up to now):

  • make it possible to package the daemon from the distribution level, so that it’s not downloaded and installed per-user;
  • allow for a quick way to remove the bundled libraries, bringing in on the system the needed dependencies;
  • possibly make available a version of the daemon that does not link in the whole Python interpreter so that distributions can use their own system interpreter.

So if somebody from Dropbox is listening: please work with us distributors. You only have to gain from that. Better experience with your service on our users’ desktops is likely going to bring you more money, as more people are likely to buy a subscription. For what it’s worth, I would buy one myself, if it worked as a charm and not in the current hackish way on my systems.

More details about symbol collisions

You might remember that I was quite upset by the amount of packages that block each other when they install the same files (or files with the same name). It might not be obvious – and actually it’s totally non-obvious – why I would insist on these to be solved whenever possible, and why I’m still wishing we had a proper way to switch around alternative implementations without using blockers, so I guess I’ll explain it here, while I also discuss again the problem of symbol collisions, which is also one very nebulous problem.

I have to say, first, that the same problem with blockers is present also with conflicting USE requirements; the bottom-line is the same: there are packages that you cannot have merged at the same time, and that’s a problem for me.

The problem can probably be solved just as well by changing the script I use for gathering ELF symbol information to check for collisions, but since that requires quite a bit for work just to work around this trouble, I’m fine with accepting a subset of packages, and ranting about the fact that we have no way (yet) to solve the problem of colliding packages, and incompatible USE requirements.

The script, basically, roams the filesystem to gather all the symbols that are exported by various ELF files, both executables and libraries, saves them into a PostgreSQL database, and then a separate script combines that data to generate a list of possibly colliding symbols. The whole point of this script is to find possible hard-to-diagnose bugs due to unwanted symbol interposing: an internal function or datum replaced with another from another ELF object (shared library or executable) that use the same name.

This kind of work is, as you can guess by now, hindered by the inability of keeping all the packages merged at once because then I cannot compare the symbols between them, and at the same time, it’s also hindered by those bad packages that install ELF files in uncommon and wrong paths (like /usr/share) — running the script over the whole filesystem would solve that problem, but the required amount of time to be wasted to run it is definitely going to be a problem.

On a different note, I have to point out that not all the cases where two objects export the same symbol are mistakes. Sometimes you’re willingly interposing symbols, like we do with the sandbox and google-perftools do with malloc and mmap; some other times, you’re implementing a specific interface, which might be that of a plugin but might also be required by a library, like the libwrap interface for allow/deny symbols. In some cases, plugins will not interfere one with the other because they get loaded with the RTLD_LOCAL option that tells the runtime loader that the symbols are not to be exported.

This makes the whole work pretty cumbersome: some packages will have to be ignored altogether, others will require specific rules, others will have to be fixed upstream, and some are actually issues with bundled libraries. The whole work is long, tedious and will not bring huge improvements all around, but it’s a nice polishing action that any upstream should really consider to show that they do care about the quality of their code.

But if I were to expect all of them to start actually making use of hidden symbols and the like, I’d probably be waiting forever. For now, I guess I’ll have to start with making use of these ideas on the projects I contribute to, like lscube. It might end up getting used more often.