I’ve written a lot about my linking collisions script, which also shows the presence of internal copies of libraries in binaries. It might not be clear that this is just a side effect: the primary purpose of the script is not to find bundled libraries, but rather to find possible collisions between two pieces of software that define similar symbols and have no link between them. This is what I found in Ghostscript bug #689698 and poppler bug #14451. Collisions like those are really bad, and they were my first reason for writing the script.
One reason why this script cannot be used with the discovery of internal copies as its main function is that it will not find them if they have hidden visibility, which is a prerequisite for properly importing an internal copy of any library (leaving aside the fact that such an import is not a good idea in the first place).
To find internal copies of libraries, the correct thing to do is to build all packages with almost full debug information (so -ggdb), and use the DWARF data in them to find the definitions of functions. These definitions don’t disappear with hidden visibility, so they can be relied upon.
Unfortunately parsing DWARF data is a very complex matter, and I doubt I’ll ever add DWARF parsing support to ruby-elf unless I can find someone else to work with me on it. But there is already a toolset that you can use for this: dwarves (dev-util/dwarves). I haven’t written a harvesting and analysis tool yet, so at the moment I’m just wasting a lot of CPU cycles scanning all the ELF files for single functions, but I’ll soon write something for that.
The pfunct tool in dwarves allows you to find a particular function in a binary file. I ran pfunct over all the ELF files in my system, looking for two functions so far: adler32 and png_free. The first is a common function from zlib; the latter is, obviously, a common function from libpng. Interestingly enough, I found two more packages that use an internal copy of zlib (one of which is included in an internal copy of libpng): rsync and doxygen.
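A scan like this can be sketched in a few lines of Python. This is only an illustration, not the command actually used here: the helper names (is_elf, find_elf_files, run_pfunct, files_defining) are made up for the example, and it assumes that `pfunct --function=NAME FILE` prints something on standard output only when the file's debug information contains that function. Check that against the pfunct shipped with your version of dwarves before trusting the results.

```python
import os
import subprocess

ELF_MAGIC = b"\x7fELF"


def is_elf(path):
    """Return True if the file starts with the four ELF magic bytes."""
    try:
        with open(path, "rb") as handle:
            return handle.read(4) == ELF_MAGIC
    except OSError:
        # Unreadable files are simply skipped.
        return False


def find_elf_files(root):
    """Yield every regular, non-symlink ELF file below root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path) and is_elf(path):
                yield path


def run_pfunct(function, path):
    """Run pfunct on one file and return whatever it printed.

    Assumption: `pfunct --function=NAME FILE` prints the prototype when
    the function is present in the DWARF data and nothing otherwise.
    """
    result = subprocess.run(
        ["pfunct", "--function=" + function, path],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
        check=False,
    )
    return result.stdout


def files_defining(root, function, runner=run_pfunct):
    """Return the sorted ELF files under root for which runner reports function."""
    return sorted(
        path for path in find_elf_files(root) if runner(function, path).strip()
    )
```

Calling `files_defining("/usr", "adler32")` would then list candidates. The real zlib shared library will be among them, so every other hit still has to be checked by hand before calling it a bundled copy. The runner parameter exists only so that the walk can be exercised without pfunct installed.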
It’s interesting to see that a base system package like rsync suffers from this problem. It means that it’s not just uncommon libraries being bundled by rarely used programs: widely known and accepted software also bundles omnipresent libraries like zlib.
I’m now looking for internal copies of popt, which I’ve also seen imported more than a couple of times by software (cough distcc cough), and which is already a dependency of system packages. The problem is that DWARF parsing is slow, and it takes pfunct a long time to scan the whole system. That’s why I should write separate harvest and analysis scripts instead.
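The point of splitting harvest from analysis is that the slow DWARF pass runs once per file, and every later question becomes a cheap lookup. A minimal sketch of that split, assuming SQLite as the store and a made-up one-table schema (the names open_store, harvest and files_with are hypothetical, and how the function names get extracted from each file is left out on purpose):

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the store of harvested function definitions."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS definitions ("
        " function TEXT NOT NULL,"
        " file TEXT NOT NULL,"
        " UNIQUE (function, file))"
    )
    return db


def harvest(db, file, functions):
    """Record that file defines each of the given functions.

    Re-harvesting the same file is harmless: duplicates are ignored.
    """
    db.executemany(
        "INSERT OR IGNORE INTO definitions (function, file) VALUES (?, ?)",
        [(function, file) for function in functions],
    )
    db.commit()


def files_with(db, function):
    """Analysis step: return the sorted files that define function."""
    rows = db.execute(
        "SELECT file FROM definitions WHERE function = ? ORDER BY file",
        (function,),
    )
    return [row[0] for row in rows]
```

Once every ELF file has been harvested, looking for popt instead of zlib means calling `files_with` with a different name rather than rescanning the whole system.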
Oh well, more analysis for the future 🙂 And eliasp, when I’ve got this script done, then I’ll likely accept your offer 😉