Last week I have written in passing about my old linking-collision script. Since then I restarted working on it and I have a few comments to give again.
First of all, you might have seen the 2008 GhostScript bug — this is a funny story; back in 2008 when I started working on finding and killing symbol collisions between libraries and programs, I filed a bug with GhostScript (the AFPL version), since it exported a symbol that was present, with the same name, in libXfont and libt1. I found that particularly critical since they aren’t libraries used in totally different applications, as they are all related to rendering data.
At the time, the upstream GS developer (who happens to be one of the Xiph developers, don’t I just love those guys?) asked me to provide him with a real-world crash. Since any synthetic testcase I could come up with would look contrived, I didn’t really want to spend time trying to come up with one. Instead I argued the semantic of the problem, explaining why, albeit theoretical at that point, the problem should have been solved. No avail, the bug was closed at the time with a threat that anyone reopening it would have its account removed.
Turns out in 2011 that there is a program that does link together both libgs and libt1: Evince. And it crashes when you try to render a DVI document (through libt1), containing an Encapsuled PostScript (EPS) image (rendered through GhostScript library). What a surprise! Even though the problem was known and one upstream developer (Henry Stiles) knows that the proper fix is using unique names for internal functions and data entries, the “solution” was limited to the one colliding symbol, leaving all the others to be found in the future to have problems. Oh well.
Interestingly, most packages don’t seem to care about their internal symbols, be them libraries or final binaries. On final binaries this is usually not much of a problem, as two binaries cannot collide with one another, but it doesn’t mean that the symbol couldn’t collide with another library — for this reason, the script now ignores symbols that collide only between executables, but keeps listing those colliding with at least one library.
Before moving on to how to hide those symbols, I’d like to point out that the Ruby-Elf project page has a Flattr button, while the sources are on Gitorious GitHub for those who are curious.
Update (2017-04-22): as you may know, Gitorious was acquired by GitLab in 2015 and turned down the service. So the project is now on GitHub. I also stopped using Flattr a long time ago.
You can now wonder how to hide the symbols; one way that I often suggest is to use the GCC-provided -fvisibility=hidden
support — this is obviously not always an option as you might want to support older versions, or simply don’t want to start adding visibility markers to your library. Thankfully there are two other options you can make use of; one is to directly use the version script support from GNU ld
(compatible with Sun’s, Apple’s and gold for what it’s worth); basically you can then declare something like:
{
global:
func1;
func2;
func3;
local: *;
}
This way only the three named functions would be exported, and everything else will be hidden. While this option works quite nicely, it often sounds too cumbersome, mostly because version scripts are designed to allow setting multiple versions to the symbols as well. But that’s not the only option, at least if you’re using libtool.
In that case there are, once again, two separate options: one is to provide it with a list of exported symbols, similar to the one above, but with one-symbol-per-line (-export-symbols SYMBOL-FILE
), the other is to provide a regular expression of symbols to export (-export-symbols-regex REGEX
), so that you just have to name the symbols correctly to have them exported or not. This loses the advantage of multiple versions for symbols – but even that is a bit hairy so I won’t get there – but gains the advantage of working with generating Windows libraries as well, where you have to list the symbols to export.
I’d have to add here that hiding symbols for executables should also reduce their startup time, as the runtime loader (ld.so
) doesn’t need to look up a long list of symbols when preparing the binary to be executed; the same goes for libraries. So in a utopia world where each library and program only exports its tiny, required list of symbols, the system should also be snappier. Think about it.
As a followup, I’ve now “documented in Autotools Mythbuster”:https://autotools.info/libtoo… the libtool options I named here, so that there is a stable reference for them.
A very important advantage of -fvisibility=hidden vs linker-based solutions is enabling compiler to perform more agressive optimization by default (notably inlining and cloning). Clang compiler does this by default (thus violating interposition semantics) but GCC shared lib compilations are significantly limited by default.