This Time Self-Hosted

A special kind of bundling

I know it has been a very long time since I last posted about bundled libraries, and a long time since I actually worked on symbol collisions, which is the original reason why I started working on Ruby-Elf, even though you probably couldn’t tell nowadays, given how many more tools I have implemented on top of the same library.

Since the tinderbox was idling due to the recent changes in distfiles storage, I decided to update the database of symbols. This actually uncovered more than a few bugs in my code, for which I should probably write a separate blog post. In the meantime I’m just going to ask here what I already asked on my identi.ca and Twitter streams: if you have access to an HP-UX machine, could you please check if there is an elf.h header, with a license permissive enough that I can look at it? I could fetch the entries from GNU binutils, but I’d rather not, since that would mean mixing and matching code licensed under GPL-2 (only) and GPL-3, although arguably constant names shouldn’t be copyrightable.

The Ruby-Elf code will be pushed tomorrow, as gitorious wasn’t very cooperative today, and I’ll probably follow up with a blog post on the topic anyway.

Once I was able to get the output of the harvest and analyse script, I found an interesting, albeit worrisome, surprise. A long list of packages use gnulib’s MD5 code. The problem is that gnulib is not your average utility library: it isn’t installed and linked to, it is imported into the sources of the project using it. The original reason to design it this way was that it would provide replacement functions for GNU extension interfaces, or even standard interfaces, that aren’t present on some systems, so that you wouldn’t have to stick to a twenty-year-old standard when you could simplify the code logic by using modern interfaces.

What happens now? Well, gnulib carries not only replacement code for functions that are implemented in glibc but not on other systems, but also a long list of interfaces that are not implemented in glibc either! And as it happens, even an MD5 implementation. Such an implementation is replicated at least 115 times across the tinderbox system, going by the visible symbols; there might be a lot more, since when the symbols are hidden or an executable is built without exporting them, my tools are not going to find them.
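To make the idea of counting copies by their visible symbols concrete, here is a minimal sketch in Python. It is not the Ruby-Elf harvest and analyse code, only an illustration under a few assumptions: the symbol listings are text in the three-column style printed by `nm` (address, type letter, name), only the type letters T, D, B and R are treated as defined global symbols, and the file paths and sample listings are made up. The helper names `parse_nm_output` and `find_collisions` are hypothetical.

```python
from collections import defaultdict

# Type letters assumed to mark defined, globally visible code or data.
DEFINED_GLOBAL_TYPES = set("TDBR")


def parse_nm_output(text):
    """Return the set of defined global symbol names in nm-style output."""
    symbols = set()
    for line in text.splitlines():
        fields = line.split()
        # Defined symbols have three fields: address, type, name.
        # Undefined ones ("U malloc") have only two and are skipped.
        if len(fields) == 3 and fields[1] in DEFINED_GLOBAL_TYPES:
            symbols.add(fields[2])
    return symbols


def find_collisions(symbols_by_file):
    """Map each symbol defined in more than one file to its sorted file list."""
    owners = defaultdict(set)
    for path, symbols in symbols_by_file.items():
        for name in symbols:
            owners[name].add(path)
    return {
        name: sorted(paths)
        for name, paths in owners.items()
        if len(paths) > 1
    }


# Hypothetical sample listings, invented for illustration only.
SAMPLE = {
    "/usr/bin/foo": (
        "0000000000001139 T md5_init_ctx\n"
        "0000000000001200 T main\n"
        "                 U malloc\n"
    ),
    "/usr/lib/libbar.so": (
        "0000000000002100 T md5_init_ctx\n"
        "0000000000002200 T bar_open\n"
    ),
}

if __name__ == "__main__":
    parsed = {path: parse_nm_output(text) for path, text in SAMPLE.items()}
    for name, paths in sorted(find_collisions(parsed).items()):
        print(name, "is defined in", ", ".join(paths))
```

As in the text, a scan like this only sees symbols that are still visible in the binary; a hidden or stripped copy of the same code would not show up in the listing at all.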

This use of gnulib is unlikely to go away anytime soon… unfortunately, the more packages use gnulib, the more easily a security bug in gnulib would impact the distribution as a whole, and for a very long time. People, can we stop using gnulib like it was glib? Please? Just create a libgnutils or something, and make gnulib look for that before using its own modules, so that there is a chance to use a shared, common library instead… especially since some of the users of gnulib are libraries themselves, which causes the symbol collisions problem that is the original reason why I’m looking at this code…

Sigh! Rant out….

Comments 3
  1. I agree, but I think that ranting on a blog is not going to change anything. You need to talk to gnulib upstream about transitioning to a shared library interface.

  2. I doubt they’ll ever go that route, it’s just not what they intend gnulib to be. The problem is that they came a long way from what gnulib was supposed to do, and not all of it is good. The best suggestion I can give is to make sure that software uses the libraries it should be using instead of relying on gnulib modules; in the case of MD5, it should be trivial to make the various projects use libgcrypt, if they are cooperative. In some cases (such as GCC) that is likely not possible, but there are still many, many packages with that problem… I’ll see about publishing a list and hope to contact the various upstreams. As for the uselessness of the rant… I disagree: this _will_ show up in Google search results, and hopefully it’ll dissuade people from further (ab)using gnulib…

  3. Does anyone know if a compat library for things like asprintf() and the strl family of functions exists already? I made one up ( “libstrl”:http://ohnopub.net/libstrl , “dev-libs/libstrl”:http://packages.gentoo.org/… , “strl.h”:http://ohnopub.net/~ohnobin… ) myself a year ago because I was annoyed about this same problem. And because I don’t want to learn how to integrate gnulib into a project ;-). Maybe I should ask: is my approach with this compat library correct? I directly export the symbol named strlcpy, for example, instead of using macros to rewrite it to strl_strlcpy. Would something like this be beneficial, or would it just create more problems? I assume that if the libc doesn’t have a function anyway, and for reentrant functions like the ones in this library, using the original symbol name shouldn’t be a big deal (if my implementation is any good, that is ;-)).
