Library SONAME bumps and .la files: some visual clues

Before going on with the post, I’ll give readers who are confused by its title some pointers on how to decipher it: I have discussed .la files extensively before, and you can find a description of SONAMEs in another post of mine.

Long- and medium-time Gentoo users most likely remember what happened when libpng was bumped last year, and will probably worry now that I’m telling them that libpng 1.5 is almost ready to be unmasked (I’m building the reverse dependencies in the tinderbox as we speak to see what breaks). Having already seen it through with the tinderbox, I can tell you that it’s going to hurt: a revdep-rebuild call will ask you to rebuild so many packages due to .la files that I, myself, will probably take the chance to move to the hardened compiler and run an emerge -e world just for kicks.

But why is it this bad? Well, mostly it is the “viral propagation” of dependencies in .la files, which by itself is the reason why .la files are so bad. Since libgtk links to libcairo, and libcairo to libpng, any other library linking with libgtk will be provided with a -lpng entry to link to libpng, no matter whether it uses it or not. Unfortunately, --as-needed does not apply to libtool archives, so they end up overlinking, and only the link editor can drop the unused libraries.
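To make the propagation a bit more concrete, here is a minimal Python sketch that pulls the -l entries out of the dependency_libs line of a libtool archive. The sample archive text, the libfoo name and the function name are made up for illustration; only the dependency_libs variable itself is what libtool really writes into .la files.

```python
def dependency_libs(la_text):
    """Return the library names listed as -l entries in dependency_libs."""
    for line in la_text.splitlines():
        line = line.strip()
        if line.startswith("dependency_libs="):
            value = line.split("=", 1)[1]
            # The value is a single-quoted, whitespace-separated list.
            tokens = value.strip().strip("'\"").split()
            # Keep only -lfoo entries; -L paths and nested .la paths are skipped.
            return [t[2:] for t in tokens if t.startswith("-l")]
    return []


# Hypothetical excerpt of a .la file for a library that only uses GTK+ directly.
SAMPLE_LA = """\
# libfoo.la - a libtool library file
dlname='libfoo.so.0'
library_names='libfoo.so.0.0.0 libfoo.so.0 libfoo.so'
dependency_libs=' -lgtk-x11-2.0 /usr/lib/libcairo.la -lpng12 -lz -lm'
"""

if __name__ == "__main__":
    # libpng and zlib show up even though libfoo never calls into them.
    print(dependency_libs(SAMPLE_LA))
```

Anything linking against this hypothetical libfoo through libtool inherits the whole list, which is exactly how -lpng ends up everywhere.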

For the sake of example, Evolution does not use libpng directly (the graphic files are managed through GTK’s pixbuf interface), but all of its plugins’ .la files will refer to libpng, which in turn means that revdep-rebuild will pick it up to rebuild it. D’oh!

So what about the visual clue? Well, I’ve decided to use the data from the gold-based tinderbox to provide a graph of how many ELF objects actually link to the most common libraries, and how many libtool archives reference them. The data wasn’t easy to extract, mostly because at first glance the .la files seemed to be dwarfed by the actually linked objects... until I remembered that ELF executables can’t have a corresponding .la file.

[Image: library linking histogram]

I’m sorry if some browsers fail to display the image properly; please upgrade to a decent, modern browser, as it’s a simple SVG file. The gnuplot script and the raw data file are also available if you wish to look at them.

The graph corroborates what I’ve been saying before: the bump of libraries such as libexpat and libpng is only a problem because of overlinking and .la files. Indeed, you can see that there are about 500 .la files listing either of the two libraries, when there are fewer than a hundred shared objects referencing them. And for zlib it’s even worse: while there are definitely more shared objects using it (348), there are four times as many .la files listing it as one of the dependencies, for no good reason at all.
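For those who would like to gather similar numbers on their own system, the ELF side could be sketched roughly like this in Python. This is not the script I used for the graph; it assumes you have already captured the text output of readelf -d for each object, and the sample output and function names are made up for illustration.

```python
import re
from collections import Counter

# Matches the NEEDED lines printed by `readelf -d`.
NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library: \[(.+?)\]")


def needed_libraries(readelf_output):
    """Return the SONAMEs an ELF object actually links to."""
    return NEEDED_RE.findall(readelf_output)


def tally(needed_lists):
    """Count how many objects reference each library (once per object)."""
    counts = Counter()
    for needed in needed_lists:
        counts.update(set(needed))
    return counts


# Hypothetical `readelf -d` excerpt for a shared object.
SAMPLE = """\
Dynamic section at offset 0x2d90 contains 26 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libpng12.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libz.so.1]
 0x000000000000000e (SONAME)             Library soname: [libfoo.so.0]
"""

if __name__ == "__main__":
    per_object = [needed_libraries(SAMPLE), ["libz.so.1"]]
    print(tally(per_object))
```

Running the same kind of tally over the dependency_libs entries of the installed .la files gives the second column to compare against.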

A different story applies to GLib and GTK+ themselves: the number of shared objects using them is higher than the number of .la files that list them among their dependencies. I guess the reason here is that a number of their users are built with non-libtool-based build systems, and a good number of .la files are removed by the less lazy Gentoo packagers (XFCE should be entirely .la-free nowadays, and yes, it links to GTK+).

Now it is true that the number of .la files and ELF files is not proportional to the number of packages installing them (for instance Evolution installs 24 .la files and 69 ELF objects), so you can’t really say much about the number of packages you’d have to rebuild when one of the three “virulent” libraries (libpng, libexpat, libz) is bumped. It should still be clear, though, that marking five hundred files as broken simply because they list a library that is gone, without their respective binaries actually having anything to do with said library, is not the best approach we can have.

Dropping the .la file for libcairo (which is where libgtk picks up the libpng reference) should probably make systems much more resilient to libpng bumps, which have proven to be the nastiest ones. I hope somebody will step up to do so, sooner or later.

Many minor issues make a big issue

All of you who have followed my blog for a medium to long time know that I often take a look at minor issues. These are issues that have no direct impact on users. Summing them up, though, can easily give you big issues that do have a direct impact on users.

Take my crusade about copy-on-write pages. They don’t really make any difference taken one by one, but sum them up across a system, and they might well make a huge difference.

It’s similar to the issue with wakeups that Intel brought to the table. It does not make much difference on its own for a single piece of software, but add all of them together and you have a huge difference.

I wonder if we’ll ever get to a point where all these issues are well known by all free software developers, and only newly started projects need to be profiled to identify them. Unfortunately, I’m afraid this is not going to happen anytime soon. But it would be really interesting if common projects like Samba and Xorg started paying more attention to those issues.

I wonder what’s the best option to improve the situation. I actually thought about starting a Wiki listing all the problems of the apps, and then attaching patches for possible resolutions. I would hate to load a Wiki up on my blog, though. As an option, I was thinking about using Git and Emacs’s Muse, rebuilding the pages statically after every commit. Then it would also be possible to add graphs and similar to show the reduction in memory usage, or startup time, or whatever, which would make it easier to show what the changes improved.

I think solar was right in suggesting that I have more visual output of the changes. I installed R yesterday, and I hope to be able to start learning how to use it soon, so that I can propose some graphs.

Anyway, any comment on this is very welcome, as I’m getting quite tired of fighting these issues alone, with almost no feedback from upstream.

Dear lazyweb, I need a gnuplot expert

And I end up asking again for help from whoever is around; this time I’m looking for a gnuplot expert :)

As solar suggested, I wanted to prepare a few graphs to show the changes with visibility, with fixing COW pages, and so on. While performance analysis through benchmarks would probably be a good idea too, I wanted to start with something easier, maybe less interesting, but that would help me present the issues with better visual impact. This way the more important stuff will not just look like crap and be dismissed ;)

For now I’ve modified my parser for LD_DEBUG=bindings output to generate data that could be represented visually by gnuplot. Or at least I hope I’ll be able to make it be represented visually by gnuplot.
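To give an idea of what such a parser has to deal with, here is a minimal Python sketch (not my actual parser) that extracts the binding lines from an LD_DEBUG=bindings log. The sample log, the paths in it and the function name are invented for illustration; the "binding file ... to ...: normal symbol" line shape is what glibc’s loader prints.

```python
import re

# One binding line: which object asked for the symbol, and which one provided it.
BINDING_RE = re.compile(
    r"binding file (\S+) \[\d+\] to (\S+) \[\d+\]: \w+ symbol `([^']+)'"
)


def parse_bindings(log_text):
    """Return (source, target, symbol) tuples for every binding line."""
    bindings = []
    for line in log_text.splitlines():
        match = BINDING_RE.search(line)
        if match:
            bindings.append(match.groups())
    return bindings


# Hypothetical log excerpt; the last line stands in for unrelated loader output.
SAMPLE_LOG = """\
      4242:	binding file /usr/bin/foo [0] to /lib/libc.so.6 [0]: normal symbol `malloc' [GLIBC_2.2.5]
      4242:	binding file /lib/libc.so.6 [0] to /lib/libc.so.6 [0]: normal symbol `memcpy' [GLIBC_2.2.5]
      4242:	some other loader message
"""

if __name__ == "__main__":
    for source, target, symbol in parse_bindings(SAMPLE_LOG):
        print(source, "->", target, symbol)
```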

I’d like to have some clustered and row-stacked histograms: they would divide all the objects in a particular program (shared objects) into clusters of histograms, each containing N histograms for N runs to compare (no visibility, hidden/default visibility, hidden/protected visibility); then each histogram would have its height split in three (outgoing bindings, incoming bindings, self-fulfilled bindings), to show the changes in those.
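The three-way split for a single run could be computed along these lines. This is a rough Python sketch under my own reading of the categories: a binding counts as outgoing for the object that requests the symbol, incoming for the object that provides it, and self-fulfilled when the two are the same object. The object names and the function name are placeholders.

```python
from collections import defaultdict


def classify(bindings):
    """Count outgoing, incoming and self-fulfilled bindings per object.

    `bindings` is an iterable of (source, target, symbol) tuples.
    """
    counts = defaultdict(lambda: {"outgoing": 0, "incoming": 0, "self": 0})
    for source, target, _symbol in bindings:
        if source == target:
            # The object resolved the symbol against itself.
            counts[source]["self"] += 1
        else:
            counts[source]["outgoing"] += 1
            counts[target]["incoming"] += 1
    return dict(counts)


# Placeholder data for one run of one program.
SAMPLE_BINDINGS = [
    ("app", "libc", "malloc"),
    ("libc", "libc", "memcpy"),
    ("libfoo", "libc", "free"),
]

if __name__ == "__main__":
    for obj, split in sorted(classify(SAMPLE_BINDINGS).items()):
        print(obj, split["outgoing"], split["incoming"], split["self"])
```

Each printed row is one stacked bar; repeating this for every run gives the N bars of a cluster.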

Unfortunately, what I have up to now shows the graphics just fine, but the labels for the various objects are unreadable. If you want, you can get the package of the data and script here.

If somebody can help me get this data graphed in a decent way… that would be very helpful :)