Compilers’ rant

Be warned that this blog’s style is in form of a rant, because I’ve spent the past twelve hours fighting with multiple compilers trying to make sense of them while trying to get the best out of my unpaper fork thanks to the different analysis.

Let’s start with a few more notes about the Pathscale compiler I already slightly ranted about for my work on Ruby-Elf. I know I didn’t do the right thing when I posted that stuff as I should have reported the issues upstream directly, but I didn’t have much time, I was already swamped with other tasks, and going through a very bad personal moment, so I quickly written up my feelings without doing proper reports. I have to thank Pathscale people for accepting the critiques anyway, as Måns reported me that a couple of the issues I noted, in particular the use of --as-needed and the __PIC__ definition were taken care of (sorta, see in a moment).

First problem with the Pathscale compiler: by mistake I have been using the C++ compiler to compile C code; rather than screaming at me, it passed through properly, with one little difference: a static constant gets mis-emitted and this is not a minor issue at all, even though I am using the wrong compiler! Instead of having the right content, the constant is emitted as an empty, zeroed-out array of characters of the right size. I only noticed because of Ruby-elf’s cowstats reporting what should have been a constant into the .bss section. This is probably the most worrisome bug I have seen with Pathscale yet!

Of course its impact is theoretically limited by the fact that I was using the wrong compiler, but since the code should be written in a way to be both valid C and C+, I’m afraid the same bug might exist for some properly-written C+ code.. I hope it might get fixed soon.

The killer feature for Pathscale’s compiler is supposedly optimisation, though, and in that it looks like it is doing quite a nice job, indeed I can see from the emitted assembler that it is finding more semantics to the code than GCC seems to, even though it requires -O3 -march=barcelona to make something useful out of it — and in that case you give up debugging information as the debug sections may reference symbols that were dropped, and the linker will be unable to produce a final executable. This is hit and miss of course, as it depends on whether the optimiser will drop those symbols, but it makes difficult to use -ggdb at all in these cases.

Speaking about optimisations, as I said in my other post, GCC’s missed optimisation is still missed by Pathscale even with full optimisation (-O3) turned on, and with the latest sources. And is also still not fixed the wrong placement of static constants that I ranted about in that post.

Finally, for what concerns the __PIC__ definition that Måns referred as being fixed, well, it isn’t really as fixed as one would expect. Yes, using -fPIC now implies defining __PIC__ and __pic__ as GCC does, but there are two more issues:

* While this does not apply to x86 and amd64 (but just for m68k, PowerPC and Sparc), GCC supports two modes for emission of position-independent code, one that is limited by the architecture’s global offset table maximum size, and the other that overrides such maximum size (I never investigated how it does that, probably through some indirect tables). The two options are enabled through -fpic (or -fpie) and -fPIC (-fPIE) and define the macros as 1 and 2, respectively; Path64 does only ever define them to 1.
* With GCC, using
-fPIE@ – that is used to emit Position Independent Executables – or the alternative -fpie of course, implies the use of -fPIC, which in turn means that the two macros noted above are defined; at the same time, two more are defined, __pie__ and __PIE__ with the same values as described in the previous paragraph. Path64 defines none of these four macros when building PIE.

But enough rant about Pathscale, before they feel I’m singling them out (which I’m not). Let’s rant a bit about Clang as well, the only compiler up to now that properly dropped write-only unit-static variables. I had very high expectations for what concerns improving unpaper through its suggestions but.. it turns out it cannot really create any executable, at least that’s what autoconf tells me:

configure:2534: clang -O2 -ggdb -Wall -Wextra -pipe -v   conftest.c  >&5
clang version 2.9 (tags/RELEASE_29/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
 "/usr/bin/clang" -cc1 -triple x86_64-pc-linux-gnu -emit-obj -disable-free -disable-llvm-verifier -main-file-name conftest.c -mrelocation-model static -mdisable-fp-elim -masm-verbose -mconstructor-aliases -munwind-tables -target-cpu x86-64 -target-linker-version 2.21.53.0.2.20110804 -momit-leaf-frame-pointer -v -g -resource-dir /usr/bin/../lib/clang/2.9 -O2 -Wall -Wextra -ferror-limit 19 -fmessage-length 0 -fgnu-runtime -fdiagnostics-show-option -o /tmp/cc-N4cHx6.o -x c conftest.c
clang -cc1 version 2.9 based upon llvm 2.9 hosted on x86_64-pc-linux-gnu
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /usr/bin/../lib/clang/2.9/include
 /usr/include
 /usr/lib/gcc/x86_64-pc-linux-gnu/4.6.1/include
End of search list.
 "/usr/bin/ld" --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o a.out /usr/lib/../lib64/crt1.o /usr/lib/../lib64/crti.o crtbegin.o -L -L/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/../../.. /tmp/cc-N4cHx6.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed crtend.o /usr/lib/../lib64/crtn.o
/usr/bin/ld: cannot find crtbegin.o: No such file or directory
/usr/bin/ld: cannot find -lgcc
/usr/bin/ld: cannot find -lgcc_s
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:2538: $? = 1
configure:2576: result: no

What’s going on? Well, Clang doesn’t provide its own crtbegin.o file for the C runtime prologue (while Path64 does), so it relies on the one provided by GCC, which has to be on the system somewhere. Unfortunately, to identify where this file is… they try hitting and missing.

% strace -e stat clang test.c -o test |& grep crtbegin.o
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.2/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.5.2/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.1/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.5.1/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.5/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.5/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.4.5/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.4.5/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.4.4/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.4.4/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.4.3/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.4.3/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.4/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.4/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.3.4/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.3.4/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.3.3/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.3.3/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.3.2/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.3.2/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.3/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.3/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.2.4/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.2.4/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.2.3/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.2.3/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.2.2/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.2.2/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.2.1/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.2.1/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.2/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.2/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/crtbegin.o", 0x7fffc937f170)     = -1 ENOENT (No such file or directory)
stat("/../../../../lib64/crtbegin.o", 0x7fffc937f170) = -1 ENOENT (No such file or directory)
stat("/lib/../lib64/crtbegin.o", 0x7fffc937f170) = -1 ENOENT (No such file or directory)
stat("/usr/lib/../lib64/crtbegin.o", 0x7fffc937f170) = -1 ENOENT (No such file or directory)
stat("/../../../crtbegin.o", 0x7fffc937f170) = -1 ENOENT (No such file or directory)

Yes you can see that it has a hardcoded list of GCC versions that it looks for, from higher to lower, until it falls back to some generic paths (which don’t make that much sense to me to be honest, but nevermind). This means that on my system, that only has GCC 4.6.1 installed, you can’t use clang. This was reported last week and while a patch is available, a real solution is still not there: we shouldn’t be patching and bumping clang each time a new micro version of GCC is released that upstream didn’t list already!

Sigh. While GCC sure has its shortcomings, this is not really looking promising either.

3 thoughts on “Compilers’ rant

  1. I stumbled across this issue while trying to code stuff for a summer job that I took at a start-up. Since I was relying on Clang solely for correctness testing, I didn’t have any reason to have Clang build actual binaries until a major change I made to my code broke g++.I wrote a patch for Clang rather quickly after using Google to learn about the issue. Unfortunately, I realized when I was going to file a bug report that one already existed, but mine covered more GCC versions, so I posted it to the existing report. I also realized that writing software for Gentoo that relied on Clang would be difficult for the next developer to maintain.I spent the next few hours working out how to make g++ happy, which basically required that I change protected inheritance to public inheritance. Apparently, if you have a bunch of classes inherit from each other (C inherits from B, which inherits from A) and they use templates, you can’t use protected inheritance to be able to access member functions of the parent classes in GCC.With that said, I do not expect Clang to be a viable production compiler on Gentoo until a few things happen. One, the Clang ebuild needs to stop relying on GCC-specific components and instead rely on substitutes (e.g. libc++). Second, Clang needs to be keyworded stable, which is needed for a production environment like the system for which I am writing code. Three, Gentoo needs to be modified so that ebuilds call generic cc, c++, etcetera commands and users are able to eselect system compilers for them.

    Like

  2. Richard, Clang is by itself probably not ready to be the production compiler for a long list of reasons, the same goes for path64.On the other hand, you’re wrong on the third point: Gentoo should _not_ use @cc@ and @c++@ at all! Instead we’re doing the best we can to use @$CC@ and @$CXX@, so they can be configured in @make.conf@ just the same as the flags.Unfortunately, though, a number of packages assume gcc flags are available, and those are rarely supported by other compilers.

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s