I’m not sure why that happens, but almost every time I’m working on something, I find something different that diverges my attention, providing me with more insight on what is going wrong with software all over. This is what causes me to have a TODO list that continuously grows rather than stopping for a while.
Today, while I went on working on my dependency checker script which for now does not even exist, but for which I added a few more functions to Ruby-Elf (you can check them out if you wish), I’ve started noticing some interesting problems with the library search path of my tinderbox chroot.
The problem is that I got 47 entries in the
/etc/ld.so.conf file, which means 47 paths that needs to be scanned, and yet I have two entries crated by the env files as
LD_LIBRARY_PATH and yet I find bundled libraries with the Portage bash trigger that my collision detection script misses. This let me understand there is more to that than I have been seeing up to now, and got me to dig deeper.
So I checked out some of the entries in the
ld.so.conf file, some of them are due to multiple slot per library for different library versions, like QCA does. This is one decent way to handle the problem, by making two libraries differing just by soname in different directories, and passing the proper
-L flag to get one or the other. A much saner alternative would be to ensure that two libraries with different API would get different version names, just like glib 2.0 gets a
libglib-2.0.so name, but that’s a different point altogether.
Some other entries are more worrisome. For instance, the cuda packages install all their libraries in
/opt/cuda/lib, but the profiler installs there also Qt4 libraries, causing them to appear on the system library list, even though with smaller priority than the copy as installed by the Qt ebuilds. Similar things happen with the binary packages of Firefox and Thunderbird, since I have the two of them together with xulrunner, adding their library path to the system library path.
Instead of doing this, there are a few packages that install the libraries outside the library path, and then use a launcher script, whose main use is setting the
LD_LIBRARY_PATH variable to make sure the loader can find their libraries. This is what, for instance,
/usr/bin/kurso does. Indeed if you start the kurso (closed-source) executable directly you’re presented with a failure:
# /opt/kurso/bin/kurso3 /opt/kurso/bin/kurso3: symbol lookup error: /opt/kurso/bin/kurso3: undefined symbol: initPAnsiStrings
The symbol in question is defined in the libborqt library as provided by Kurso itself, which is loaded through
dlopen() (and thus does not appear in neither
scanelf -n output, just so I remember you of this problem). I could now write a bit around the amount of software written using Kylix and the amount of copies of libborqt in the system, each with its own copy of libpng and zlib, but that’s beside the point for now. The problem is that it indeed does not work by just starting it directly, and thus a wrapper file is created.
These wrappers are a necessary evil for proprietary closed-source software, but are quite obnoxious when I see them for Free Software that just gets build improperly; that’s the case both for the binary packages of Firefox and similar and the source packages, since my current
/usr/bin/firefox contains this:
#!/bin/sh export LD_LIBRARY_PATH="/usr/lib64/mozilla-firefox" exec "/usr/lib64/mozilla-firefox"/firefox "$`"
This has a minor evil in it (it overwrites my current
LD_LIBRARY_PATH settings) and a nastier one, it really isn’t much of a good idea because then also the programs it will call will get the same value, unless it’s unset later on. In general this is not a tremendously bright idea since we do have a method, the use of
DT_RUNPATH in ELF files, that allows us to tell an executable to find the libraries in a directory that is not in the path. This is tremendously useful, and should really be used much more, since setting
LD_LIBRARY_PATH also means skipping over the loader cache to find the libraries for all the processes involved rather than just for the (direct) dependency of a single binary. You can see that this has been a problem for Sun too.
For what it’s worth, the OpenSolaris linker, provides special padding space since two years ago to allow changing the runpath of executable files you don’t have access to (like the kurso binary above). It’s too bad that we have to deal with software lacking such padding because it would be quite useful too.
Is this all? Not really no, since you also get env.d entries for packages like quagga that seems to install some libraries in a sub-directory of the standard library directory and then adding that to the set of scanned directories. I haven’t investigated much about it yet, this blog post was intended as just a warning call, and I’ll probably try to work further on this to make it work better, but in general if your package installs internal “core” libraries, that you rightly don’t want into the standard library path (to avoid polluting the search path), you should not add it to
ld.so.conf through environment files, but should rather use the
-rpath option during link to tell it to find the library in a different path.
For developers, if your package does some funny thing with its libraries and you’re not sure if you’re handling it properly, may I just ask you please to tell me about it? I’d be glad to look into it, and eventually post a full explanation of how it works in the blog so that it can be used as reference for the future. I also have a few very nasty issues to blog about on Firefox and xulrunner, but I’ll save those for another time, for now.