Boosting my morale? I wish!

Let’s take a deep breath. You probably remember I’m running a tinderbox which is testing some common base system packages before they are unmasked (and thus unleashed on users); in particular I use it for testing new releases of GCC (4.7) and GLIBC (2.16).

It didn’t take long after starting GLIBC 2.16 testing to find out that the previously-latest version of Boost (1.49) was not going to work with it. The problem is that both of them try to provide the same definition, TIME_UTC (which probably comes from C11/C++11). Unfortunately, since replacing that definition is an API breakage, the fix can’t be applied to the older Boost versions, and packages using it need to be fixed as well. Furthermore, the new 1.50 version has also broken the temporary compatibility introduced in 1.48 for their filesystem module’s API. This boils down to a world of pain for maintainers of packages using Boost (which includes yours truly; luckily none of them is directly maintained by me, just proxied).

So I had to add one extra package to the list, and ran the reverse dependencies; on the positive side, it didn’t take long to file the bugs, although there are still a few issues with older Boost versions not being supported yet. This brought up a few issues though…

The first problem is that the way Boost builds itself, and its tests, is obnoxious: it’s totally serial, with no parallelisation at all! The result is that running the whole testsuite takes over eight hours on Excelsior! The big issue is that each test takes some 10-20 times longer to build than to run (lovely language, C++), so a parallel build of the tests, even if the tests were then executed in series, would have a huge impact, and would also likely make the tests meaningful to run regularly. As they stand, the (so-called) maintainer of the package has admitted to not running them on revision bumps, but only on full new versions.

The second problem is how Boost versions are discovered. The main issue is that Boost, instead of using proper sonames to keep binary compatibility, embeds its major/minor version pair in the library name, although most distributions symlink the preferred version to the unversioned name (in Gentoo this is handled through the eselect boost tool). This is not extremely far from what most distributions do with Berkeley DB, but it causes problems when you have to find which version you should link to, especially when you consider that sometimes the unversioned name is not there at all.

So both CMake and Autotools (actually, the Autoconf Archive) provide macros that try a few different libraries. The former does it almost properly, starting from the highest version and going in descending order, but it uses a pre-defined list of versions to try! Which means that most packages using CMake will try 1.49 first, as they don’t know that 1.50 is out yet! If no known version is found, it falls back to the unversioned library, which makes it behave quite differently depending on whether you have one or more than one version installed!
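To make the failure mode concrete, here is a pure-shell sketch of that search logic (this is not CMake’s actual code; the library names and the version list are invented for the example):

```shell
libdir=$(mktemp -d)
touch "$libdir/libboost_thread-1_50.so"   # 1.50 is installed...

find_boost() {
  for v in 1_49 1_48 1_47; do             # ...but the module only knows these
    [ -e "$libdir/libboost_thread-$v.so" ] && { echo "libboost_thread-$v.so"; return; }
  done
  # Fallback: the unversioned symlink, if the distribution provides one.
  [ -e "$libdir/libboost_thread.so" ] && { echo "libboost_thread.so"; return; }
  echo "not found"
}

find_boost    # prints "not found" despite Boost being installed
```

With 1.50 installed and no unversioned symlink, the search finds nothing at all; with the symlink present, it silently picks whatever the symlink happens to point to.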

As for the macros from the Autoconf Archive, they are quite nasty: on one side they really aren’t portable at all, as they use GNU sed syntax and they use uname (which makes them totally useless during cross-compilation); but most worrisome of all, they use ls to find which Boost libraries are available and then take the first one that is usable. This means that if you have 1.50, 1.49 and 1.44 installed, they’ll use the oldest! Like CMake, they try the unversioned library last. In this case, though, I was able to improve the macros by reversing the check order, which makes them work correctly on most distributions out there.
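A sketch of the problem and of how small the fix is (again, the file names are made up; the real macros glue this together with sed and uname):

```shell
libdir=$(mktemp -d)
touch "$libdir/libboost_thread-1_44.so" \
      "$libdir/libboost_thread-1_49.so" \
      "$libdir/libboost_thread-1_50.so"

# What the original macro effectively did: take the first name ls returns,
# i.e. the lexicographically smallest one -- the OLDEST version.
oldest=$(ls "$libdir" | head -n1)

# The fix: reverse the order, so the newest version is tried first.
newest=$(ls "$libdir" | sort -r | head -n1)

echo "$oldest"   # libboost_thread-1_44.so
echo "$newest"   # libboost_thread-1_50.so
```

Plain lexicographic reversal is enough here because Boost pads its version components; a true version sort would be more robust, but this is the shape of the change.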

What is even funnier about the AX macros (which were created for libtorrent, and are used by Gource, which I proxy-maintain for Enrico) is that, due to the way they are implemented, it is feasible that they end up using the old libraries with the new headers (that was the case for me here with 1.49/1.50: it didn’t fail to compile, just to link). As long as the interfaces used have different names and the linker errors out, all is fine. But if you have interfaces that are source-compatible and linker-compatible, but with different vtables, you have a crash waiting to happen.

Oh well…

Why autoconf updates are never really clean

I’m sure a lot of users have noticed that each time autoconf is updated, a helluva lot of packages fail to build for a while. This is probably one of the reasons why lots of people dislike autotools in the first place. I would like to let people know that it’s not entirely autoconf’s fault if that happens, and actually, it’s often not autoconf’s fault at all!

I have already written one post about phantom macros due to recent changes, but that wasn’t really anybody’s fault, in the sense that the semantics of the two macros changed, with warning, between autoconf versions. On the other hand, I have also ranted about the way KDE 3, in its latest version, is badly broken by the update. Since the problem with KDE3 is far from isolated, I’ll rant a bit more about it here and try to explain why it’s a bad thing.

First of all, let me describe the problem with KDE3 so that you understand I’m not coming up with stuff just to badmouth them. I have already written in the past, ranted and so on, about the fact that KDE3’s build system was not autotools, but rather autotools-based. Indeed, the admin/ subdirectory that is used by almost all the KDE3 packages is a KDE invention; the files in it as well. Unfortunately it doesn’t look like the KDE developers learnt anything from it, and they seem to be doing something very similar with CMake as well. I feel sorry for our KDE team now.

Now, of course there have been reasons why KDE created such a build system that reminds me of Frankenstein: on one side, they needed to wire up support for Qt’s uic and moc tools; on the other, they wanted the sub-module monolithic setup that is sought after by a limited number of binary distributions and hated to the guts by almost all the source-based distributions.

I started hating this idea for two separate reasons: the first is that we couldn’t update automake: only 1.9 works, and we’re now at 1.11; the newer versions changed behaviour enough that there is no chance the custom code works. The second reason is that the generated configure files were overly long, checking the same things over and over, and in very slow ways (compiling test programs rather than using pkg-config). One of the tests I always found braindead was the check, done by every KDE3-based package, of whether libqt-mt required libjpeg at link time: a workaround for a known broken libqt-mt library!

Now, with autoconf 2.64, the whole build system broke down. Why’s that? Very simple: if you try to rebuild the autotools files for kdelibs, like Gentoo does, you end up with a failure because the macro AH_CHECK_HEADERS is not found. That macro has been renamed in 2.64 to _AH_CHECK_HEADERS, since it’s an internal macro, not something that configure scripts should be using directly. Indeed, this macro call is in the KDE-custom KDE_CHECK_HEADERS, which seems to deal with C and C++ language differences in the checks for headers (funnily, this wasn’t enough to avoid language mistakes). This wouldn’t be bad in itself, as extending macros is what autoconf is all about; but using internal macros to do that is really a mistake.
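The pattern is easy to audit for. Here is a sketch of grepping a build system for autoconf-internal macro calls; the acinclude.m4 content is a mock-up of the KDE-style macro, not KDE’s actual code:

```shell
workdir=$(mktemp -d)
# Mock-up of a KDE_CHECK_HEADERS-style macro calling into autoconf internals:
cat > "$workdir/acinclude.m4" <<'EOF'
AC_DEFUN([KDE_CHECK_HEADERS],
  [AH_CHECK_HEADERS([$1])
   AC_LANG_PUSH([C++])
   AC_CHECK_HEADERS([$1])
   AC_LANG_POP([C++])])
EOF

# Quick audit: flag calls to internals, which autoconf may rename without
# notice between releases (AH_CHECK_HEADERS became _AH_CHECK_HEADERS in 2.64):
grep -n 'AH_CHECK_HEADERS' "$workdir/acinclude.m4"
```

The public AC_CHECK_HEADERS with AC_LANG_PUSH/POP already covers the C-versus-C++ case, so the call to the internal macro buys nothing and breaks on upgrade.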

Now, if it was just KDE making this mistake, the problem would be solved already: KDE 4 migrated to CMake, so in the worst case it would fail when the next CMake change is done that breaks their CMake-based build system, and KDE 3 is going away, which will solve the autoconf 2.64 problem altogether. Unfortunately, KDE was not the only project making this mistake; even worse, projects with exactly one package made this mistake, and that’s quite a bit of a problem.

When you have to maintain a build system for a whole set of packages, like KDE has to, mistakes like using internal macros are somewhat to be expected and shouldn’t be considered strange or out of place. When you do that for a single package, then you really should stop writing build systems, since you’re definitely overcomplicating things without any good reason.

Some of the signs that your build system is overcomplicated:

  • it still looks like it was generated by autoscan; you have not removed any of the checks it added, nor have you added conditionals in the code to act upon those checks;
  • you’re doing lots of special-casing on the host and target definitions; you don’t even try to find the stuff on the system, but decide whether it’s there or not by checking the host; that’s not the autoconf way;
  • you replace all the standard autoconf macros with your own, having NIH_PROG_CC for instance; you are trying to be smarter than autoconf, but you most likely are not.

Autotools Come Home

This article was originally published on the Axant Technical Blog.

Through our experience as Gentoo developers, Luca and I have worked with a wide range of build systems; while there are obvious degrees of goodness and badness in the build system world, we express our preference for autotools over most of the custom build systems, and especially over CMake-based build systems, which seem to be riding high on the tide thanks to KDE over the last two years.

I have recently written about my views on build systems, in which I explain why I dislike CMake and why I don’t mind it when it replaces a very bad custom build system. The one reason I gave for using CMake is autotools’ lack of support for the Microsoft Visual C++ compiler, which is needed by some types of projects under Windows (GCC still lacks way too many features); this is starting to become a moot point.

Indeed, if you look at the NEWS file for the latest release, 1.11 (unleashed yesterday), there is this note:

  • The `depcomp' and `compile' scripts now work with MSVC under MSYS.

This means that when running configure scripts under MSYS (which means having most of the POSIX/GNU tools available under the Windows terminal prompt), it’s possible to use the Microsoft compiler, thanks to the compile wrapper script. Of course this does not mean the features are on par with CMake yet, mostly because all the configure scripts I’ve seen up to now expect GCC or compatible compilers, which means that more complex tests, and especially macro archives, will be needed before it can replace Visual Studio project files. Also, since CMake has a fairly standard way to handle options and extra dependencies, it can provide a GUI to select them, where autotools are still tremendously fragmented in that regard.

Additionally, one of the most-recreated and probably useless features, the Linux-kernel-style quiet-but-not-entirely build output, is now implemented directly in automake through the silent-rules option. While I don’t see much point in calling that a killer feature, I’m sure there are people who are interested in seeing it.

While many people seem to think that autotools are dead and that they should disappear, there is actually fairly active development behind them, and the whole thing is likely going to progress and improve over the next months. Maybe I should find the time to try making the compile wrapper script work with Borland’s compiler too, of which I have a license; it would be one feature that CMake is missing.

At any rate, I’ll probably extend my autotools guide for automake 1.11, together with a few extras, in the next few days. And maybe I can continue my Autotools Mythbuster series that I’ve been writing on my blog for a while.

The long-awaited build systems post (part 1, maybe)

I call this “long-awaited” because I promised Donnie I would write about this a few months ago, but I haven’t had time to work on it in quite a while. Since today is supposed to be a holiday, yet I’m still working on a few things, I’m going to try filling the hole.

I have been criticising CMake quite a bit on my blog, and I have been trying with all my strength to show that autotools aren’t as bad as they are made to appear, since I think they are still the best build system framework we have available at the moment. But in all this, I’m afraid the wrong message was somehow sent, so I want to clear it up a bit.

First of all, CMake isn’t as bad as, say, scons or imake; it’s also not as bad as qmake under certain circumstances. I don’t think that CMake is bad in absolute terms; I just think it’s bad as a “universal” solution. Which, I have to admit, autotools are bad for too, to an extent. So let me explain what my problem is.

For a long time, the autotools framework was almost the de-facto standard for lots of Free Software packages; switching away from it to CMake just because “it’s cool”, or because it seems easier (it does seem easier, mostly because the autotools documentation and examples are full of particularly bad code), is not the right thing to do, since it increases the work for packagers, especially because for some things CMake isn’t yet particularly polished.

I blame KDE for the CMake tide, not so much because I don’t think it was the right choice for them, but rather because they seem to pinpoint the reasons for the change on autotools’ defects when they are, actually, making a pragmatic choice for a completely different reason: supporting the Microsoft Visual C++ compiler. As I have already expressed more than a couple of times, the autotools-based build system in KDE 3 is a real mess, a bastardised version of autotools. Blaming autotools in general for that mess is uncalled for.

It’s also not only about Windows support; you can build software under Windows with just autotools, if you use Cygwin or MSYS with GCC; what you cannot do is build with Microsoft’s compiler. Since GCC objectively still lacks some features needed or highly desired under Windows, I can understand that some projects do need Microsoft’s compiler to work. I’m not sure how true that is for KDE, but it’s their choice to want Microsoft’s compiler. And CMake allows them to do that. More power to them.

But in general, I’m very happy if a project whose build system is custom-made, or based on scons or imake, gets ported to CMake, even if not to autotools, since that means having a somewhat standard build system, which is still better than the other options. And I’m totally fine if a project like libarchive gets a dual build system to build on Unix and Windows with the “best” build system framework available on each.

Still, I think CMake has a few weak spots that should be taken care of sooner rather than later, and which are shared with autotools (which is what I usually point out when people say that it’s always and only better than autotools, while it’s actually making similar mistakes).

The first is the fact that they seem to have moved (or people claim they moved) from an actual build system to a “framework to build build systems”, which is more or less what autoconf can basically be said to be, and what scons has always been. This is particularly bad because it ensures that there is no standard way to build a package without checking the definition files for that particular release: scons provides no standard options for flags handling, feature switching and the like; autotools can be messed up, since different packages may use the same variable names with different meanings. If CMake were to provide just a framework, it would have the same exact problem. From what I read when I tried to learn CMake, this was supposed to be limited, but the results now don’t seem to be as good.

The second problem is slightly tied to the one above, and relates to the “macro hell”. One of the worst issues with autoconf is that, beside the set of macros provided with autotools itself, there are basically no standard macros. Sure, there is the Autoconf Macro Archive, but even I fail at using it (I had some problems before with its license handling; I should probably try it again), and the result is that you end up copying, forking and modifying the same macros over a series of projects. Some of the macros I wrote for xine are now used in lscube and also in PulseAudio.

CMake provides a set of modules to identify the presence of various libraries and other software packages; but instead of treating them as an official repository of such macros, I’ve been told that they are “just examples” of how to write them. And some of them are very bad examples. I already ranted about the way the FindRuby module was broken, and how the issue was ignored until a KDE developer submitted his own version. Unfortunately there are still modules that are just as broken. The CMake developers should really look into avoiding the “macro hell” problem of autotools by integrating the idea of a macro archive with CMake itself, maybe having an official “CMake modules” package installed alongside it to provide the package-search macros, which could be updated independently of the actual CMake package.

I have lots of reservations about CMake, but I still think it’s a better alternative than many others. I also have quite a few problems with autotools, and I know they are far from perfect. Both build systems have their areas of usefulness; I don’t think either can be an absolute replacement for the other, and that is probably my only problem with all this fuss over CMake. Also, I do have an idea of what kind of build system framework could hopefully replace both of them, and many others, but I still haven’t found anything that comes near it; I’ll leave that description for part two of this post, if I can find the time.

Sorry Sput, but Quassel has to go from my systems

I’m going to get rid of Quassel in the next few days unless something drastically changes, but since I really think that Sput has been doing a hell of a good job, I’d like to point out what the problems are in my opinion.

There’s nothing wrong with the idea (I love it) nor with the UI (it’s not bad at all); having it be cross-platform also helps a lot. What I really feel is a problem, though, is the creeping in of dependencies in it. Which is not Sput’s fault for the most part, but it is a good example of why I think Qt and KDE development is getting farther and farther from what I liked about it in the past.

With KDE, the last straw was when I noticed that to install Umbrello I had to install Akonadi, which in turn required me to install MySQL. I don’t use MySQL myself; I used it for a couple of web development jobs, but I’d really like for it to stay stopped since I don’t need it on a daily basis. On the other hand, I have a running PostgreSQL instance that I use for my actual work, like the symbol collision analysis. I doubt Umbrello would have required me to start MySQL or Akonadi to run; the problem was with the build system. Just like the KDE guys bastardised autotools into one of the most overcomplex build systems that man was able to create in the KDE 3 series, they have made CMake even worse than it is as released by Kitware (which, on the other hand, somehow seems to have made it a bit less obnoxious; not that I like it any better, but if one really needs to build under Windows, it can be dealt with better than some custom build systems I’ve seen).

So the new KDE4 build system seems to pick up the concept of shared checks from KDE3, which basically turns out to be a huge amount of checks that are unneeded by most of the software but executed by all of it, just because actually splitting the “modules” into per-application releases, like GNOME already does, is just too difficult for SuSE, sorry, KDE developers.

This time the dependency creep hit Quassel badly. Recent releases of Quassel added a dependency on qt-webkit to show a preview of a link when it is posted in IRC. While I think this is a bad idea (because, for instance, if there were a security issue in qt-webkit, it would be tremendously easy to get users to load the offending page), and it still has implementation issues when the link points to a big binary file rather than a webpage or an image, it can be considered a useful feature, so I never complained about it.

Today, after setting up the new disks, the update proposed by portage contained an interesting request to install qt-phonon. Which I don’t intend to install at all! The whole idea of having to install Phonon for an application like Quassel is just outside my range of acceptable doings.

I was the first one to complain that GNOME required/requires GStreamer, but thanks to Lennart’s efforts we now have an easy way to play system sounds without needing GStreamer; KDE, on the other hand, is still sticking with huge amounts of layers and complex engines to do the easiest of tasks. I’m not saying that the ideas behind Solid and the like are entirely wrong, but it does feel wrong for them to be KDE-only, just like it feels wrong for other technologies to be GNOME-only. Lennart’s libcanberra shows that there is space for desktop-agnostic technologies implementing the basic features needed by all desktops; it just requires work and coordination.

So now I’m starting up Quassel to check on my messages, and then I’ll log out of it, after installing X-Chat or something.

Prank calls and threats

Disclaimer: please take this post with a grain of salt and consider it tongue-in-cheek. While the situation from which it sprouts is very real, the rest is written in a joking tone. In particular, I wish to state beforehand that I’m not trying to tie anything to Ciaran and his group, or any other group for that matter.

As it turns out, last night some very funny guy thought it was a nice idea to call me and tell me I’ll die, obviously with the caller ID withheld. It’s almost certainly a prank call (as a good rationalist: 99.98% a prank call, 0.02% you never know), but with the cold and the meds in me, I didn’t have the quickness of response to say “you go first, and don’t spoil for me how it is”.

Just to cover all bases, I’m now considering who might actually want me dead. It turns out that, if we consider extreme personality cases like Hans Reiser’s, that might be quite a few people. I wouldn’t count Ciaran in the list though, since a) I respect him enough to trust that he wouldn’t do it anonymously if he wanted to, and b) Stephen is more the person to slander rather than threaten. Besides, that area has been quiet for long enough that I had almost forgotten about it.

The last time I was threatened was at the time of the XMMS removal, more than two years ago by now. I don’t think this is related to that at all. But staying on the multimedia side of the fence, I can see a possible issue in people disliking PulseAudio with no good reason (the link is to a positive post); but even though I do lend a hand to Lennart with autotools, I sincerely doubt my involvement is enough for people to want to get rid of me just for that.

It could have been some anti-Apple activist gone crazy over my previous post praising some of Apple’s products; I guess I should have started with a list of things I don’t like about Apple, or with a list of things I do for Free Software each day, which are not going to stop just because I can settle for now with Apple products. But chances are higher that if somebody wants me dead, it’s for dissing some project they like; it’s not like I haven’t criticised quite a few before, like CMake.

But if we expect this to be tied to something that happened recently, I shouldn’t rule out my criticism of Ruby 1.9, as well as my not-so-recent move from KDE to GNOME (I have to ask: why is it that if I move from KDE to GNOME, having been a KDE developer, it doesn’t even make a news site, while if Linus does it he gets on LWN? I dare say this is unjust!). These sound more likely for crazy guys, just because they might feel “betrayed” since I was on their side before and then turned away, while for what concerns XMMS, PulseAudio, Apple and CMake I haven’t changed my opinion (much).

Another option, if we follow what the mass media have shown of black-hat hackers (even outside our general Western culture), is that somebody got upset about my recent security-oriented work, either because I found some security issue that they tried to hide, or because another security analyst is upset that I found so many issues all at once.

All in all, I guess I could have enough reasons to worry if enough FOSS people suffered from the Reiser syndrome. Hopefully this is not the case. The good news is that nobody left me threats on the blog or via e-mail, so I really don’t think I have to worry about FOSS people. And please, if you think it would be fun to leave some now, just don’t, okay?

Building the whole portage

Since my blog post about forced --as-needed yesterday, I have started building the whole of portage in a chroot, to see how many packages break with forced --as-needed at build time. While build-time failures are not the only problem (for stuff like Python, Perl and Ruby packages, the failure may well be at runtime), build-time failures are probably the most common problem with --as-needed; if we also used --no-undefined, as Robert Wohlrab is trying to do (maybe a little too enthusiastically), most of the failures would be at build time, by the way.

As usual, testing one case also adds tests for other side cases; this time I’ve added further checks to tell me whether packages install files in /usr/man, /usr/info, /usr/locale, /usr/doc, /usr/X11R6, … and I have filed quite a few bugs about that already. But even without counting these problems, the run started telling me some interesting things that I might add to the --as-needed fixing guide when I get back to working on it (maybe even this very evening).
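The checks themselves are trivial; here is a sketch of the kind of test I mean, run against a package’s install image (the directory names are invented for the example):

```shell
image=$(mktemp -d)            # pretend this is the package's install image
mkdir -p "$image/usr/man"     # a package installing into a deprecated path

# Flag any of the deprecated/forbidden install locations that exist:
bad=""
for d in usr/man usr/info usr/locale usr/doc usr/X11R6; do
  [ -d "$image/$d" ] && bad="$bad /$d"
done
[ -n "$bad" ] && echo "QA notice: files installed in:$bad"
```

In the real setup this kind of check runs automatically after each package’s install phase, which is how the bugs above got filed in bulk.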

I already knew that most of the failures I’d be receiving would be related to packages that lack a proper build system and have thus ignored LDFLAGS up to now (--as-needed included), but there are a few notes that get really interesting here: custom ./configure scripts seem to almost always ignore LDFLAGS and yet fail to properly link packages; a few glib-based packages fail to link the main executable against libgthread, failing to find g_thread_init(); and a lot of packages link the wrong OpenSSL library (they link libssl when they should link libcrypto).

This last note, about the OpenSSL libraries, is also a very nice and useful example to show how --as-needed helps users in two main ways. Let’s go over a scenario where a package links in libssl instead of libcrypto (since libssl requires libcrypto, the symbols are satisfied if the link is done without --as-needed).

First point: ABI changes. If libssl changed its ABI (it happens sometimes, you know…) but libcrypto kept the same one, the program would require a useless rebuild: it’s not affected by libssl’s ABI, but by libcrypto’s.

Second point, maybe even more relevant at runtime: when executing the program, the libraries in its NEEDED entries are loaded, recursively. While libssl is not too big, it would still require loading one further library that the program doesn’t need, since libcrypto is the one actually needed. I sincerely don’t know whether libssl has any constructor functions, but when this extra load happens with libraries that have many more dependencies, or constructor functions, it’s quite a big hit for no good reason.
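To make the runtime cost concrete, here is a toy model of the loader’s recursive NEEDED resolution (pure shell, with a made-up dependency table; this has nothing to do with the real dynamic loader):

```shell
# Hypothetical NEEDED tables for a program that uses only libcrypto symbols:
needed() {
  case "$1" in
    prog_asneeded)   echo "libcrypto" ;;  # --as-needed recorded the real dependency
    prog_overlinked) echo "libssl" ;;     # without it, libssl got recorded instead
    libssl)          echo "libcrypto" ;;  # libssl itself depends on libcrypto
    *)               : ;;
  esac
}

load() {   # print every library the loader would map in, recursively
  for lib in $(needed "$1"); do
    echo "$lib"
    load "$lib"
  done
}

load prog_asneeded     # libcrypto only
load prog_overlinked   # libssl, then libcrypto: one extra library mapped in
```

Multiply that extra library by the dozens of overlinked NEEDED entries a typical desktop binary used to carry, and the startup cost stops being theoretical.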

At any rate, I wish to thank again all the people who contributed to pay for Yamato; as you can see, the horsepower in it is being put to good use (although I’m still just at app-portage). And just so I don’t have to stress it with the same work over and over again, I can tell you that some of the checks I add to my build chroots are being added to portage, thanks to Zac, who’s always blazing fast at adding deeper QA and repoman warnings, so that further mistakes don’t creep into new code… one hopes, at least.

Oh, and before people start expecting cmake to be perfect with --as-needed, since no package using it has been reported as failing with --as-needed… well, the truth is that I can’t build any package using cmake in that chroot, since cmake itself fails to build because of xmlrpc-c. And don’t get me started again on why a build system has to use XML-RPC without giving the user a chance to say “not even in your dreams”.

Good developers don’t necessarily create good build systems

It’s very common for users to expect that good developers will be good at every part of the development process: if they wrote good software that works, the build system must be just as good, and the tests must be good, and the documentation should be good. Increasingly, the last part of this is understood not to be the case; there is a huge amount of very good software that just lacks documentation (cough FFmpeg cough). In truth, none of these parts is always the case.

Being a good developer is a complex matter; one might be a very good coder, and thus write code that just works or that is very optimised, but write it in a way that is so obfuscated that nobody can deal with it but the original coder. One might be a very good interface designer, and write software that is perfectly user friendly, so that the users just love to work with it, but have a backend that just doesn’t do what it should. One might have very clear documentation, but the software is just too badly written to be useful.

Build systems are just the same: writing a proper build system is a complex matter. By proper, I mean a build system that allows the user to do just about anything they might expect to be able to do with a build system. This does not only mean building the software, but also installing it, choosing a different installation prefix, cross-compiling (if it makes sense), and choosing build options so that optional features can be enabled or disabled.

I have written many posts about how people try to replace autotools and, in my opinion, fail; I have already expressed my discontent with CMake, since it replaces autotools with something just as complex, which only seems easier because KDE people are being instructed in how to use it (just as they were instructed in wrongly using autotools before). I have already said that in my opinion the only reason for which it makes sense for KDE to use CMake is supporting the Microsoft C++ compiler under Windows. But at least CMake, after a not-so-long time, is shaping up to be an acceptable build system; I have to give Kitware credit for that. SCons and similar ones failed in a cosmic way.

But what I think is worse is when developers are not even trying to replace autotools, but just don’t seem to understand them at all. I said recently that I’m going to look into Ragel for both LScube and xine. Ragel is a very nice piece of software; its code is very complex and the results are very good. I’m sure the author is a very good developer. But he’s not a very good build system maintainer.

The build system used by Ragel, both in releases and in trunk, uses autoconf for some basic discovery and to allow the user to choose the prefix, but then it uses custom-tailored makefiles, as well as a custom-made config.h file. So what is the problem with them? Well, there are many, some of which are being worked around in the ebuild. For instance, the install phase does not support a DESTDIR variable, which is fundamental for almost any distribution to install the package into a temporary directory before packaging it (or, in the case of Gentoo, merging it onto the live filesystem); it also uses the CFLAGS variable for compiling C++ sources, which is not nice at all either.
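For reference, DESTDIR is nothing exotic: it’s just a prefix prepended at install time so the files land in a staging directory instead of the live system. A minimal sketch of what a well-behaved install rule boils down to (the paths and file names are invented for the example):

```shell
prefix=/usr/local
DESTDIR=$(mktemp -d)    # the temporary image a packager installs into
builddir=$(mktemp -d)   # stand-in for the build tree
echo '#!/bin/sh' > "$builddir/ragel"   # stand-in for the built binary

# What `make DESTDIR=... install` should boil down to: every destination
# path is prefixed with $(DESTDIR), which defaults to empty.
mkdir -p "$DESTDIR$prefix/bin"
install -m 755 "$builddir/ragel" "$DESTDIR$prefix/bin/ragel"

ls "$DESTDIR$prefix/bin"   # the binary is staged; the live system is untouched
```

Automake-generated makefiles do exactly this for free, which is one more reason to prefer them over hand-written install rules.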

You can guess that all these problems would be solved very nicely by using automake instead of custom-tailored makefiles. But it’s not just this: when you check out ragel from the Subversion repository, it needs to be bootstrapped; part of the sources are built using ragel itself, but to enable or disable the regeneration of these, it’s not just a matter of checking whether they are missing: you have to edit the file by hand to enable or disable the checks for the tools needed (ragel, kelbt and gperf). Again, not really a good idea.

I’ve sent a few patches to ragel-users to update the build system; a few update the current build system to at least support some basic features used by packagers, but then I got quite tired of it and decided to tear out the old build system and replace it with proper autotools. Why? Because I found out that the old build system didn’t support out-of-tree builds either. And of course releases were being packaged manually, while autotools provide a nice make dist and especially a make distcheck target to take care of the source packaging step automatically.
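For the curious, the amount of build system code needed to get all of this (DESTDIR, out-of-tree builds, make dist and make distcheck) from automake is tiny. A minimal sketch, with placeholder project name, version and source file:

```
## configure.ac -- minimal sketch; name and version are placeholders
AC_INIT([ragel], [0.1])
AM_INIT_AUTOMAKE([foreign])
AC_PROG_CXX
AC_CONFIG_HEADERS([config.h])
AC_CONFIG_FILES([Makefile])
AC_OUTPUT

## Makefile.am -- automake generates DESTDIR-aware install rules,
## supports out-of-tree builds, and provides dist/distcheck for free
bin_PROGRAMS = ragel
ragel_SOURCES = main.cpp
```

From there, autoreconf -i && ./configure && make distcheck builds the project, installs it into a scratch prefix and verifies the resulting tarball, with no hand-written install rules at all.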

I’m waiting to see if the maintainer(s) get in touch with me and whether they intend to merge the new build system. I sincerely hope so, and if they open up the repository of kelbt (hopefully with Git), I’d be glad to rework the build system of that project too. Hopefully, with the new build system, building ragel on OpenSolaris will be a piece of cake (I’ll write more about what I’m doing with OpenSolaris in the coming days).

One more reason not to trust CMake

So everybody says that CMake is great because it’s faster. Of course CMake achieves speed with a different approach from the one autotools take: rather than discovering features, it applies knowledge. Hey, it’s as valid a method as any other, if you actually know what you are doing, and if you can keep up with variants and all the rest. Another thing it does is avoid the re-linking during the install phase.

Let me try to explain why re-linking exists: when you build a project using libtool, there might be binaries (executables and/or libraries) that depend on shared libraries being built in the same source tree. When you run the executables from the source tree, you want those in-tree libraries to be used. When you install, since you might be installing just a subtree of the original software, libtool tries to guess whether you just installed the library or not (often making mistakes) and, if not, it re-links the target, that is, recreates it from scratch so that it links to the system library. For packages built from ebuilds, because of the use of DESTDIR, we almost always hit the relinking stage. Given that GNU ld is slow (and IMHO should be improved, rather than replaced by gold, but that’s material for another post), it’s a very wasteful process indeed, and libtool should be fixed not to perform that stage every time.

One of the things the relinking stage is supposed to take care of is replacing the rpath entries. An rpath entry tells the runtime linker where to find the dependent libraries outside of the usual library paths (that is, the paths configured under /etc/ and LD_LIBRARY_PATH). It’s used for non-standard install directories (for instance for internal libraries that should never be linked against directly) or during in-tree execution of software, so that the just-built libraries are preferred over the ones already on the system.

So to make the install phase faster in CMake, they decided, with the 2.6 series, to avoid the relinking by messing with the rpath entries directly. It would all be fine and nice if they did it correctly, of course.

I reported earlier a bug about CMake creating insecure runpaths in executables starting from version 2.6. Don’t panic: if you’re using Portage, it’s all fine, because the scanelf run that reports the problem also fixes it already. In that bug you can find a link to a discussion from April. The problem was known before 2.6.0 final was released, yet it was not addressed.

So it seems like someone (Alex) used chrpath first. That’s a good choice: there’s a tool out there that does what you need, so use it. At worst you use it wrong, and the mistake can be fixed.

But no, that’s not good enough for Kitware, of course, and Brad King decided to replace it with a built-in ELF parser (and editor). Guess what? Mr. King does not know ELF that well, and expected an empty rpath to behave like no rpath at all.

Try these simple commands:

% echo "echo Mr. King does not know ELF" > test-cmake-rpath
% chmod +x test-cmake-rpath
% PATH= test-cmake-rpath

An empty PATH, or an empty entry in it, is treated as the current working directory. Which means that the ELF files generated by CMake 2.6 would load any library in the current working directory that is named after one of the NEEDED entries of the binary itself or of its dependencies. There are a few attack vectors exploiting this; not all of them are exactly easy to apply, and most don’t lead to root vulnerabilities, but it’s still not good.

Now, of course, a mistake, or missing knowledge about the particular meaning of a value in an ELF file, is nothing major. I myself didn’t know about the PATH= behaviour until a few months ago, but I did at least know that an empty rpath was not good.

What is the problem then? The problem is that messing with an ELF attribute like rpath, without knowing ELF files, without knowing the behaviour of the runtime linker, and, even more importantly, without asking any of the QA teams of any of the distributions out there (Gentoo is certainly not the only one that dislikes insecure rpaths), is just not something that earns my trust. At all.

And even worse: if the original implementation used chrpath, why not leave it at that? Given you don’t know enough about ELF files, it sounds like a very good idea. It’s not like chrpath is a tremendously exotic tool for distributions to have around.

For your information, this is how chrpath behaves, and how difficult it is to actually misuse:

flame@enterprise mytmpfs % gcc -Wl,-rpath,/tmp hellow.c -o hellow 
flame@enterprise mytmpfs % scanelf -r hellow 
ET_EXEC /tmp hellow 
flame@enterprise mytmpfs % cp hellow hellow-2
flame@enterprise mytmpfs % chrpath -d hellow-2   
flame@enterprise mytmpfs % scanelf -r hellow-2
ET_EXEC   -   hellow-2 
flame@enterprise mytmpfs % cp hellow hellow-3
flame@enterprise mytmpfs % chrpath -r '' hellow-3
hellow-3: RPATH=/tmp
hellow-3: new RPATH: 
flame@enterprise mytmpfs % scanelf -r hellow-3
ET_EXEC  hellow-3 

And this is the easy fix:

flame@enterprise mytmpfs % scanelf -Xr hellow-3
ET_EXEC   -   hellow-3 

Okay, at the end of the day, what can we do about this problem? Well, in Gentoo we should disable this behaviour in CMake; it’ll make it a bit slower, but safer. Even though scanelf is covering our butts, we’re still patching up something that someone else keeps screwing up, and the vulnerability remains open when users build without Portage.

And indeed, if you are building something with CMake 2.6 outside of Portage, you might also want to fix the rpaths of the installed executables and libraries, by issuing scanelf -RXr $path_to_the_installed_tree. Possibly after each time you rebuild your stuff.

To finish, a nice note that shows just how much the people handling CMake in KDE care: KDE trunk will require CMake 2.6 on August 4th. Never mind that there is an open security issue related to the code it builds.

Oh, the irony; and they say I don’t give enough arguments for why I don’t like CMake!

I’m running Gnome

As it turns out, I’m starting to dislike the way the KDE project is proceeding, and I don’t mean the Gentoo KDE project, but KDE as a whole.

I dislike the way KDE 4 is being developed, with a focus on eye candy rather than on features. This is easily shown by the Oxygen style: not only does it take up a huge amount of screen real estate with widgets that remind me of Keramik (and if you remember, one thing that made a huge number of users happy was the switch from Keramik to Plastik as the default style in KDE 3.3), but it’s also tremendously slow. And I’m sure of this, it’s not just an impression: as soon as I switch Qt to use Oxygen, it takes five seconds for Quassel to draw the list of buffers; with QtCurve, it takes just one second. I don’t know if this is because Enterprise is using XAA rather than EXA, but it certainly doesn’t look like something the default theme should do.

And no, I shouldn’t be expected to use a computer that’s less than a year old, with a hyper-powered gaming video card, just to be able to use KDE.

But this is just one of the issues I have with KDE recently. There are some policies I really, really, dislike in KDE. The first is one I already mentioned quite often and it’s the move to CMake. The only “good” reason to move to CMake is to be able to build under Windows using Microsoft’s Visual C++ compiler; yet instead of just saying “we needed cmake because it’s necessary to build for Windows” I see so many devs saying “cmake is just better than everything else out there”. Bullshit.

The other policy I dislike regards the way KDE is developed and released as a single, huge, monolithic thing. One of the things that made KDE difficult to package in Gentoo (and other source-based distributions) was the fact that, by default, the sources have to be built as those huge amorphous packages. And if the autotools-based build system of KDE sucked so much, it was also because of that.

But even leaving aside the way the releases are made, it’s just not possible for everything to fit into a single release cycle. There are projects that are more mature and projects that are less so. Forcing all of them into a single release cycle makes it difficult to provide timely bugfixes for the mature projects, and makes it impossible for the not-so-mature projects to be tested incrementally. The last straw, for me, was learning that Konversation in KDE 4 will probably lose IRC-over-SSL support because KSSL was removed from the base libraries.

And now KDE 4.1 is on the verge of release, and Kopete still segfaults as soon as you connect to Jabber. Yet when I tried (multiple times) to gather information in #kopete about the possible cause (so I could at least try to debug it myself), I got no feedback at all; maybe it’s because I run Gentoo, although the same happens on (K)Ubuntu. Yeah, not the kind of people I like to deal with.

I’m not saying that I think Gnome is perfect in its policies and other things. I dislike the fact that it’s still more Linux-/Solaris-centric than cross-platform; but from what I read, KDE 4 was a setback in that respect too. And Gnome’s release method does look a lot more sane.

I started using Linux with KDE 2. I moved to Gnome while KDE 3 was being worked on. I came back to KDE just a bit before the 3.3 release. Now I’m going to try Gnome for a while, and if I like it, I’ll think more than twice before going back to KDE. Sure, I liked KDE 3 better than I liked Gnome back then, but it’s just not feasible for me to have to switch DE every time they want to make a new release.

Besides, since I last used it, Gnome seems much more mature and nicer to deal with.