How _not_ to fix glibc 2.10 function collisions

Following my previous "how not to fix" post, I'd like to also explain how not to fix the other kind of glibc 2.10 failure: function collisions.

With the new version of glibc, support for the latest POSIX C library specification was added; this means that, among other things, a few functions that were previously only available as GNU extensions are now standardised and thus visible by default, unless a strictly older POSIX version is requested.

The functions that were previously available as GNU extensions were usually hidden unless the _GNU_SOURCE feature selection macro was defined; in autotools terms, that meant using the AC_USE_SYSTEM_EXTENSIONS macro to request them explicitly. Now these are visible by default; this wouldn't be a problem if some packages didn't decide to either reimplement them, or give their own functions the same names.
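To see what changed in practice, here's a minimal sketch (not taken from any particular package):

/* Before glibc 2.10, <string.h> only declared strndup() when
 * _GNU_SOURCE was defined; now that the function is in POSIX.1-2008,
 * the declaration is visible by default. */
#include <string.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  char *prefix = strndup("hello world", 5);
  if (prefix != NULL) {
    puts(prefix); /* prints "hello" */
    free(prefix);
  }
  return 0;
}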

Most commonly the colliding function is getline(); the reason is that the name is really too generic, and I would probably curse the POSIX committee for accepting it into the C library under that name; I already cursed GNU for adding it to glibc under that name. Under the name getline() there are tons of functions, with wildly different interfaces, that try to get lines from all kinds of media. The solution for these is to rename them to something different so that the collision is avoided.
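The rename itself is trivial; for instance (the myapp_ prefix here is just an example, any project-specific prefix will do):

#include <stdio.h> /* FILE */

/* before: collides with the POSIX getline() now declared in <stdio.h>
char *getline(FILE *stream);
*/

/* after: a project-prefixed name, no collision possible */
char *myapp_getline(FILE *stream);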

More interesting is, instead, the software that, wanting something like strndup(), decides to create its own version because some systems lack that function. In this case, renaming the function, like I've seen one user propose today, is crazy. The system already provides the function; use that!

This can be done quite easily with autotools-based packages (and the same applies to other build systems, like CMake, that work on the basis of discovering what the system provides):

# in configure.ac
AC_USE_SYSTEM_EXTENSIONS
AC_CHECK_FUNCS([strndup])

/* in a private header */
#include "config.h"
#include <stddef.h> /* size_t */

#ifndef HAVE_STRNDUP
char *strndup(const char *str, size_t len);
#endif

/* in an implementation file */
#include "config.h"

#ifndef HAVE_STRNDUP
#include <stdlib.h>
#include <string.h>

char *strndup(const char *str, size_t len)
{
  /* copy at most len bytes, always NUL-terminating the result */
  size_t n = 0;
  char *copy;

  while (n < len && str[n] != '\0')
    n++;

  copy = malloc(n + 1);
  if (copy != NULL) {
    memcpy(copy, str, n);
    copy[n] = '\0';
  }

  return copy;
}
#endif

When building on any recent glibc (2.7+ at least, I'd say), this code will use the system-provided function, without adding further duplicate, useless code; when building on systems where the function is not (yet) available, like FreeBSD 7, the custom function will be used instead.

Of course it takes slightly more time than renaming the function, but we’re here to fix stuff in the right way, aren’t we?

For A Parallel World. Theory lesson n.1: avoiding pointless autoconf checks

In my “For A Parallel World” series I have, up to now, only dealt with practical problems related to actual parallel builds. Today I wish to try something different, writing about some good practices to follow when working with autotools.

Once again I want to repeat that no, I don’t think autotools are completely flawed, or that they are way too complex for normal people to use; I certainly don’t agree that CMake is “just superior”, like some good programmer said recently (although I admit that after some stuff I’ve seen I’d gladly take CMake over some custom-built makefiles). I do think that there are too many bad examples of autotools usage, I do think that autotools could very well use better documentation and better tutorials, and I do think that a lot of people have been misusing autotools to the point that the tool gets blamed for how it was used.

You certainly know the problem I’m referring to right now, about parallel builds and autoconf: the lengthy, serialised execution of autoconf checks. It’s a very boring part of a build, but it’s unfortunately usually necessary. What is the problem, then? A lot of packages execute more tests than they actually need, which bores you to death since you’re waiting on checks whose results are never going to be used.

The causes of this are quite varied: legacy systems being supported, overzealousness of the developer with respect to missing headers or functions, autoscan-like tools, boilerplate checks coming from a bastardised build system (like the one forced upon each KDE 3 package), and of course poor knowledge of autoconf. In addition, libtool 1.5 was very bad in this regard and checked for C++ and Fortran compilers, features, linkers and so on even when they were not going to be used; luckily 2.2 fixes this, and upstream projects are slowly migrating to the new version, which takes much less time to run.

I’m currently looking for a way to scan a source tree to identify possibly overzealous checks in configure, so that I can help reduce the pointless checks myself; in the meantime I want to give some indications that might help people write better autotools-based build systems for their projects, at least.

My first suggestion is to require a standard base: I’m not referring to stuff like the Linux Standard Base, I’m referring to requiring a standard base for the language. In the case of C, which is mostly what autotools are used for (okay, there’s also C++, but that’s not a topic I want to deal with right now), you can ensure that some things are present by requiring a C99-compliant compiler. You’re going to cut out some compilers, but C99 nowadays is pretty well supported on any operating system I’ve ever dealt with, and even under Linux you can choose between three C99 compilers: GCC, Sun Studio Express and the Intel C Compiler. As you can guess, as long as you stick to C99 features, the compatibility between these three compilers is almost perfect (there are some implementation-dependent things that vary, but if you avoid those it’s quite good).

But the important part here is that, by requiring C99 explicitly, you’re also requiring the standard headers that come with it, which you then don’t need to check for: stuff like locale.h, stdio.h, stdbool.h, stdlib.h, …; they have to be present for C99 to be supported. Even better, you can require POSIX.1-2001 and _XOPEN_SOURCE 600, so that you have a wide enough feature set to rely upon. It’s very easy: -D_POSIX_C_SOURCE=200112L and -D_XOPEN_SOURCE=600, together with a C99-compatible compiler (like gcc -std=c99 or sunc99), and you can rely on the presence of functions like nanosleep() or gethostname(); you won’t have to check for them.
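To make this concrete, here's a minimal sketch of a program that uses nanosleep() and gethostname() with no configure checks at all, relying only on the standard feature test macros:

/* build with: cc -std=c99 -D_POSIX_C_SOURCE=200112L -D_XOPEN_SOURCE=600 demo.c */
#include <stdbool.h>  /* guaranteed by C99 */
#include <stdio.h>    /* guaranteed by C99 */
#include <time.h>     /* nanosleep() guaranteed by POSIX.1-2001 */
#include <unistd.h>   /* gethostname() guaranteed by POSIX.1-2001 */

int main(void)
{
  char host[256];
  /* C99 designated initialiser; no compiler feature check needed either */
  struct timespec delay = { .tv_sec = 0, .tv_nsec = 100000000 };

  bool have_host = (gethostname(host, sizeof host) == 0);
  if (have_host)
    printf("running on %s\n", host);

  nanosleep(&delay, NULL);
  return have_host ? 0 : 1;
}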

Now of course, to support legacy systems you cannot rely on these standards, which are pretty new and not well supported, if at all, by the older compilers and operating systems you might be interested in. Well, guess what? A good way to deal with this, rather than checking everything with autotools and then dealing with all the issues one by one, is to assume things are available and give legacy operating systems a chance by having a library supply the missing parts. Such a library can implement replacements or stubs for the missing functions and headers; users of the legacy operating systems can then simply provide the library alongside the project itself.
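One way to wire this up with stock autoconf machinery is AC_REPLACE_FUNCS, which compiles a replacement source file only on systems lacking the function; a sketch, where myprog and the file layout are hypothetical:

dnl in configure.ac: compile strndup.c only if the system lacks strndup()
AC_REPLACE_FUNCS([strndup])

# in Makefile.am: link the replacement objects (an empty set on modern systems)
bin_PROGRAMS = myprog
myprog_SOURCES = main.c
myprog_LDADD = $(LIBOBJS)

The replacement strndup.c then carries just the implementation, without even an #ifndef guard, since it only gets compiled at all when the check fails.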

If you don’t like this approach, which in my opinion is quite nice and clean, you can rely fully on an external library instead, such as glib, to provide some platform-independent support (like named integer types, byteswap macros, string functions). Again, this comes down to requiring things to be available, rather than adapting (maybe too much) the software to the platforms it supports.
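Just to show the flavour of it, a sketch of the same kind of code leaning on glib:

#include <glib.h>

int main(void)
{
  /* string functions: no HAVE_STRNDUP check, g_strndup() is always there */
  gchar *prefix = g_strndup("hello world", 5);

  /* named integer types and byteswap macros, portable wherever glib runs */
  guint32 swapped = GUINT32_SWAP_LE_BE(0xdeadbeef);
  gint64  big     = G_GINT64_CONSTANT(1234567890123);

  g_print("%s %08x %" G_GINT64_FORMAT "\n", prefix, swapped, big);
  g_free(prefix);
  return 0;
}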

Sometimes these approaches can be a bit of an overkill, though, since you might not need full C99 but can accept C89 just fine, with a few touches. For this reason you might just assume that the functions you need are present, but accept that they might not use the exact name you expect (for instance, there are a few functions whose names differ between POSIX and Windows), or you might want to look for a known replacement function provided by an extension library (which might as well be glib itself!).

In these cases you can very well rely on the checks that come with autotools, but even then, I’d suggest you be careful with what you write. One common problem, for instance, is overchecking headers. Let’s say you have to find at least one header that declares the standard integer types (int64_t and similar); in general you can expect that one of inttypes.h, stdint.h or sys/types.h will be present and will have the definitions you need. The simplest test to find them is to use something like this:

dnl in configure.ac
AC_CHECK_HEADERS([inttypes.h stdint.h sys/types.h])

/* in a common header file */
#if defined(HAVE_INTTYPES_H)
# include <inttypes.h>
#elif defined(HAVE_STDINT_H)
# include <stdint.h>
#elif defined(HAVE_SYS_TYPES_H)
# include <sys/types.h>
#else
# error "Missing standard integer types"
#endif

While the code in the header is quite good, since only the first header found is actually used, the example code in configure.ac is overchecking: it checks all three of them even though just the first hit is needed. Indeed, if you check the output of a configure script using that, you’ll get this:

..snip..
checking for inttypes.h... (cached) yes
checking for stdint.h... (cached) yes
checking for sys/types.h... (cached) yes
..snip..

(the tests come out cached because autoconf itself already checks for these headers by default; that’s overchecking within autoconf itself, and should probably be fixed).

Instead of the former code, a slightly different variant can be used:

AC_CHECK_HEADERS([inttypes.h stdint.h sys/types.h], [break;])

With this, the checks will stop at the first header that is found. It might not sound like much of a difference, but if you pile them up, well, it’s the usual point: little drops make a sea. This gets particularly useful when packages rename their include files, or decide to move from a bare basename to a packagename/basename layout (think FFmpeg); you can test for the new one and, if that doesn’t hit, check the old one, but avoid checking twice when the new one already hit.
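For instance, when FFmpeg moved its headers from flat names to per-library directories, something like this checks the new location first and only falls back to the old one when needed:

dnl check the new per-library location first, fall back to the old flat name
AC_CHECK_HEADERS([libavcodec/avcodec.h ffmpeg/avcodec.h], [break;])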

The same approach can be used with AC_CHECK_FUNCS, so that you only check for the first function in a series of possible replacements, and go with that one.
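For example, to handle a function whose name differs between POSIX and Windows (a sketch; _stricmp is the Windows spelling of strcasecmp):

dnl in configure.ac: stop at the first spelling that exists
AC_CHECK_FUNCS([strcasecmp _stricmp], [break;])

/* in a common header file */
#if !defined(HAVE_STRCASECMP) && defined(HAVE__STRICMP)
# define strcasecmp _stricmp
#endif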

But the most important thing is to make sure that all the checks you run during configure are actually useful and used. It’s not uncommon for software to check for a series of headers or functions to define the HAVE macros, but then never use those macros. That’s tremendously unfortunate and should be avoided at all costs; always check your software for this, especially when you make changes that might render checks pointless.

Do you maintain autotools-based software? Then please take a look at your configure.ac and make sure you’re not running pointless checks; all your users will be happy if you can reduce their build time!