More details about symbol collisions

You might remember that I was quite upset by the amount of packages that block each other when they install the same files (or files with the same name). It might not be obvious – and actually it’s totally non-obvious – why I would insist on these to be solved whenever possible, and why I’m still wishing we had a proper way to switch around alternative implementations without using blockers, so I guess I’ll explain it here, while I also discuss again the problem of symbol collisions, which is also one very nebulous problem.

I have to say, first, that the same problem with blockers is present also with conflicting USE requirements; the bottom-line is the same: there are packages that you cannot have merged at the same time, and that’s a problem for me.

The problem can probably be solved just as well by changing the script I use for gathering ELF symbol information to check for collisions, but since that requires quite a bit for work just to work around this trouble, I’m fine with accepting a subset of packages, and ranting about the fact that we have no way (yet) to solve the problem of colliding packages, and incompatible USE requirements.

The script, basically, roams the filesystem to gather all the symbols that are exported by various ELF files, both executables and libraries, saves them into a PostgreSQL database, and then a separate script combines that data to generate a list of possibly colliding symbols. The whole point of this script is to find possible hard-to-diagnose bugs due to unwanted symbol interposing: an internal function or datum replaced with another from another ELF object (shared library or executable) that use the same name.

This kind of work is, as you can guess by now, hindered by the inability of keeping all the packages merged at once because then I cannot compare the symbols between them, and at the same time, it’s also hindered by those bad packages that install ELF files in uncommon and wrong paths (like /usr/share) — running the script over the whole filesystem would solve that problem, but the required amount of time to be wasted to run it is definitely going to be a problem.

On a different note, I have to point out that not all the cases where two objects export the same symbol are mistakes. Sometimes you’re willingly interposing symbols, like we do with the sandbox and google-perftools do with malloc and mmap; some other times, you’re implementing a specific interface, which might be that of a plugin but might also be required by a library, like the libwrap interface for allow/deny symbols. In some cases, plugins will not interfere one with the other because they get loaded with the RTLD_LOCAL option that tells the runtime loader that the symbols are not to be exported.

This makes the whole work pretty cumbersome: some packages will have to be ignored altogether, others will require specific rules, others will have to be fixed upstream, and some are actually issues with bundled libraries. The whole work is long, tedious and will not bring huge improvements all around, but it’s a nice polishing action that any upstream should really consider to show that they do care about the quality of their code.

But if I were to expect all of them to start actually making use of hidden symbols and the like, I’d probably be waiting forever. For now, I guess I’ll have to start with making use of these ideas on the projects I contribute to, like lscube. It might end up getting used more often.

Exit mobile version