Autotools Mythbuster: A practical case of TMT (Too Many Tests)

I have written already about the fact that using autoscan is simply a bad idea not too long ago. On the other hand, today I found one very practical case where using (an old version of) autoscan have create a very practical problem. While this post is not going to talk about the pambase problems there is a connection to my previous posts: it is related to the cross-compilation of Linux-PAM.

It is an unfortunately known problem that Linux-PAM ebuild fails cross-compilation — and there are a number of workaround that have never been applied in the ebuilds. The reason are relatively simple: I have insisted for someone else who had the issue at hand to send them upstream. Finally somebody did, and Thorsten fixed the issues with the famous latest release — or so is theory. A quick check shows me that the latest PAM version is also not working as intended.

Looking around the build log it seems like the problem is that the configure script for Linux-PAM, using the original AC_PROG_LEX macro, is unable to identify the correct (f)lex library to link against. Again, the problem is obvious: the cross-building wrappers that we provide in the crossdev package are causing dependencies present in DEPEND but not RDEPEND to be merged just on the root file system. Portage allows us to set the behaviour to merge them instead on the cross-root; but even that is wrong since we need flex to be present both as a (CBUILD) tool, and as a (CHOST) library. I’ve asked Zac to provide a “both” solution, we’ll see how we go from there.

Unfortunately a further problem happens when you try to cross-compile flex: it fails with undefined references to the rpl_malloc symbol. You can look it up and you’ll see that it’s definitely not a problem limited to flex. I know what I’m dealing with when finding these mistakes, but I guess it doesn’t hurt to explain them a bit further.

The rpl_malloc and rpl_realloc are two replacement symbols, #define’d during the configure phase by the AC_FUNC_MALLOC autoconf macro. They are used to replace the standard functions with two custom ones that can be conditioned; the fact that they are left dangling is, as I’ll show in a moment, pretty much a signal of overtesting.

Rather than simply checking for the presence of the malloc() function (can you really expect the function to be missing on any non-embedded system at all?), the AC_FUNC_MALLOC macro (together with its sibling AC_FUNC_REALLOC) checks for the presence of a glibc-compatible malloc() function. That “glibc-compatible” note simply means a function that will not return NULL when passed a length argument of 0 (which is the behaviour found in glibc and a number of other systems). It is a corner-case condition and most of the software I know is not relying at all on this happening, but sometimes it has been useful to test for, otherwise the macro wouldn’t be there.

Of course the sheer fact that nobody implemented the compatibility replacement functions in the flex source tree should make it safe to assume that there is no need for the behaviour to be present.

Looking at the original configure code really tells more of a story:

# Checks for typedefs, structures, and compiler characteristics.

AC_HEADER_STDBOOL
AC_C_CONST
AC_TYPE_SIZE_T

# Checks for library functions.

AC_FUNC_FORK
AC_FUNC_MALLOC
AC_FUNC_REALLOC
AC_CHECK_FUNCS([dup2 isascii memset pow regcomp setlocale strchr strtol])

The two comments are the usual ones that you’d find in a configure.in script created by autoscan; it’s also one of the few cases where you actually find a check for size_t, as most software assumes its presence anyway. More importantly, if you look at the long list of AC_CHECK_FUNCS macro call arguments, and then compare with the actual source code of the project, you can understand that the boilerplate checks are there, but their results are never checked for. That’s true for all the functions in there.

What do we make of it? Well, first of all, it shows that at least part of the autotools-based buildsystem in flex is not really well thought off (and you can add to that some quite idiotic stuff in the Makefile.am to express object file dependencies, which then required a patch that produced half-broken results). But it also shows that in all likeliness, the check for malloc() is there just because, and not because there is need for the malloc(0) behaviour.

A quick fix, consisting of removing useless checks, and rebuilding autotools with more modern autotools versions, and you have a perfectly cross-compilable, workaround-free flex ebuilds. Voilà!

So why am I blogging about this at all? Well, I’ll be updating the Guide later on today; in the mean time I wanted to let you know about this trickery because I sincerely have a problem with having to add tons of workaround for cross-building: it’s time to tackle them down, and quite a lot of those are related to this stupid macro usage, so if you find one, please remember this post and get to fix it properly!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s