So, after my post yesterday, I realised that the script I wrote was too limited: parsing readelf output gave me no way to tell apart two libraries with different ABIs (like x86 and amd64), so they were treated as the same, colliding library.
Unfortunately, no tool gave me the output I was looking for. The nearest thing was scanelf, but it has two problems: when looking for all symbols it doesn’t honour the -F option, and the symbols don’t always include the version string.
So I decided that the easiest way to fix all this was to write my own ELF parser; in Ruby, of course.
It was a nice task after all, quite instructive, and useful for understanding how ELF works behind the curtains. I haven’t written a complete parser yet, but it parses enough data to complete my script, and it also allowed me to re-implement the nm -pD command in pure Ruby. Okay, it’s slower to run, but it was a nice proof of concept.
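To give an idea of what telling ABIs apart involves, here is a minimal sketch of reading the relevant ELF header fields in Ruby. It is not the code from the script: the `elf_abi` helper and the two lookup tables are hypothetical names made up for illustration, and only the x86 and amd64 machine codes are mapped. The byte offsets (EI_CLASS at 4, EI_DATA at 5, e_machine right after the 16-byte ident and the 2-byte e_type) come from the ELF specification.

```ruby
require 'stringio'

# Hypothetical lookup tables, limited to the values discussed in the post.
ELF_MAGIC    = "\x7FELF".b
ELF_CLASSES  = { 1 => :elf32, 2 => :elf64 }.freeze   # EI_CLASS values
ELF_MACHINES = { 3 => :i386, 62 => :x86_64 }.freeze  # EM_386, EM_X86_64

# Returns [class, machine] for an ELF file, or nil if the stream
# does not start with a complete ELF header.
def elf_abi(io)
  ident = io.read(16)                                # e_ident
  return nil if ident.nil? || ident.bytesize < 16
  return nil unless ident.byteslice(0, 4) == ELF_MAGIC

  elf_class = ident.getbyte(4)                       # EI_CLASS
  data      = ident.getbyte(5)                       # EI_DATA: 1 = LSB, 2 = MSB

  rest = io.read(4)                                  # e_type, e_machine
  return nil if rest.nil? || rest.bytesize < 4

  # 'n' unpacks big-endian 16-bit values, 'v' little-endian ones.
  _type, machine = rest.unpack(data == 2 ? 'n2' : 'v2')

  [ELF_CLASSES.fetch(elf_class, :unknown),
   ELF_MACHINES.fetch(machine, machine)]
end
```

On a real file this would be called as `File.open(path, 'rb') { |f| elf_abi(f) }`. Unknown machine codes are passed through as raw integers rather than guessed at.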
The script, now that it can read the files on its own rather than having to spawn a new process for every file, is quite a bit faster. Running it on tmpfs also makes it a lot faster than on disk, as sqlite writes to the file pretty often (the file itself never grows over 20MB on my system, so keeping it in RAM is a good choice; I have 2GB here anyway). I can also get the symbols’ versions and the ABI of the library, so I no longer have to skip 32-bit libraries: they don’t collide anymore.
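The reason 32-bit libraries stop colliding can be sketched in a few lines: once the ABI and the symbol version are part of the key, the same symbol name in an x86 and an amd64 library lands in different groups. This is only an illustration using an in-memory array instead of the sqlite database; the `find_collisions` helper and the record layout are assumptions, not the script’s actual schema.

```ruby
# Each record is assumed to be [library path, ABI, symbol name, symbol version].
# Returns a hash mapping [abi, name, version] to the libraries that
# all define that symbol, keeping only keys with more than one library.
def find_collisions(records)
  records
    .group_by { |_lib, abi, name, version| [abi, name, version] }
    .transform_values { |rows| rows.map(&:first).uniq }
    .select { |_key, libs| libs.size > 1 }
end

# Made-up sample data, for illustration only.
records = [
  ['/usr/lib32/libfoo.so', :i386,   'foo_init', nil],
  ['/usr/lib64/libfoo.so', :x86_64, 'foo_init', nil],
  ['/usr/lib64/libbar.so', :x86_64, 'foo_init', nil]
]

collisions = find_collisions(records)
collisions.each do |(abi, name, version), libs|
  puts "#{name} (#{abi}, version #{version.inspect}): #{libs.join(', ')}"
end
```

With this sample, only the two amd64 libraries are reported for `foo_init`; the 32-bit copy sits under a different key and is left alone.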
The suppression file still has to be improved, and there are a few corner cases that aren’t handled yet and make the script bail out, but I’ll work on them soon.
Now, as I promised, the source code of all this is public and available on GitHub.