A special kind of bundling

I know it was a very long time since I last posted about bundled libraries, and a long time since I actually worked on symbol collisions which is the original reason why I started working on Ruby-Elf — even though you probably wouldn’t tell nowadays, given how many more tools I implemented over the same library.

Since the tinderbox was idling, due to the recent changes in distfiles storage I decided to update the database of symbols. This actually discovered more than a few bugs in my code, for which I should probably write a separate blog post. In the mean time I’m just going to ask here what I already asked on my streams over to identi.ca and Twitter: if you have access to an HP-UX machine, could you please check if there is an elf.h header, with a license permissive enough that I can look at it? I could fetch the entries from GNU binutils, but I’d rather not, since it’ll be mixing and matching code licensed under GPL-2 (only) and GPL-3 — although arguably constant names shouldn’t be copyrightable.

The Ruby-Elf code will be pushed tomorrow, as today gitorious wasn’t very collaborative, and I’ll probably follow with a blog post on the topic anyway.

Once I was able to get the output of the harvest and analyse script, I found an interesting, albeit worrisome, surprise. A long list of packages use gnulib’s MD5 code. The problem is that gnulib is not your average utility library: it isn’t installed and linked to, it is imported into the sources of the project using it. The original reason to design it this way was that it would provide replacement functions for the GNU extension interfaces, or even standard interfaces, that aren’t present on some systems, so that you wouldn’t have to stick to a twenty year old standard when you could have reduced the code logic by using modern interfaces.

What happens now? Well, gnulib carries not only replacement code for functions that are implemented in glibc abut not on other systems, but also a long list of interfaces that are not implemented in glibc either! And as it happens, even an MD5 implementation. Such an implementation is replicated at least 115 times into the tinderbox system, standing to the visible symbols — there might be a lot more, for when you hide the symbols or build a non-exported executable, my tools are not going to find them.

This use of gnulib is unlikely to go away anytime soon… unfortunately the more packages use gnulib, the more a security bug in gnulib would easily impact the distribution as a whole for a very long time. People, can we stop using gnulib like it was glib? Please? Just create a libgnutils or something, and make gnulib look for that before using its own modules, so that there is a chance to use a shared, common library instead… especially since some of the users of gnulib are libraries themselves, which cause the symbol collisions problem that is the original reason why I’m looking at this code…

Sigh! Rant out….