Compilers’ rant

Be warned that this blog’s style is in form of a rant, because I’ve spent the past twelve hours fighting with multiple compilers trying to make sense of them while trying to get the best out of my unpaper fork thanks to the different analysis.

Let’s start with a few more notes about the Pathscale compiler I already slightly ranted about for my work on Ruby-Elf. I know I didn’t do the right thing when I posted that stuff as I should have reported the issues upstream directly, but I didn’t have much time, I was already swamped with other tasks, and going through a very bad personal moment, so I quickly written up my feelings without doing proper reports. I have to thank Pathscale people for accepting the critiques anyway, as Måns reported me that a couple of the issues I noted, in particular the use of --as-needed and the __PIC__ definition were taken care of (sorta, see in a moment).

First problem with the Pathscale compiler: by mistake I have been using the C++ compiler to compile C code; rather than screaming at me, it passed through properly, with one little difference: a static constant gets mis-emitted and this is not a minor issue at all, even though I am using the wrong compiler! Instead of having the right content, the constant is emitted as an empty, zeroed-out array of characters of the right size. I only noticed because of Ruby-elf’s cowstats reporting what should have been a constant into the .bss section. This is probably the most worrisome bug I have seen with Pathscale yet!

Of course its impact is theoretically limited by the fact that I was using the wrong compiler, but since the code should be written in a way to be both valid C and C+, I’m afraid the same bug might exist for some properly-written C+ code.. I hope it might get fixed soon.

The killer feature for Pathscale’s compiler is supposedly optimisation, though, and in that it looks like it is doing quite a nice job, indeed I can see from the emitted assembler that it is finding more semantics to the code than GCC seems to, even though it requires -O3 -march=barcelona to make something useful out of it — and in that case you give up debugging information as the debug sections may reference symbols that were dropped, and the linker will be unable to produce a final executable. This is hit and miss of course, as it depends on whether the optimiser will drop those symbols, but it makes difficult to use -ggdb at all in these cases.

Speaking about optimisations, as I said in my other post, GCC’s missed optimisation is still missed by Pathscale even with full optimisation (-O3) turned on, and with the latest sources. And is also still not fixed the wrong placement of static constants that I ranted about in that post.

Finally, for what concerns the __PIC__ definition that Måns referred as being fixed, well, it isn’t really as fixed as one would expect. Yes, using -fPIC now implies defining __PIC__ and __pic__ as GCC does, but there are two more issues:

  • While this does not apply to x86 and amd64 (but just for m68k, PowerPC and Sparc), GCC supports two modes for emission of position-independent code, one that is limited by the architecture’s global offset table maximum size, and the other that overrides such maximum size (I never investigated how it does that, probably through some indirect tables). The two options are enabled through -fpic (or -fpie) and -fPIC (-fPIE) and define the macros as 1 and 2, respectively; Path64 does only ever define them to 1.
  • With GCC, using -fPIE – that is used to emit Position Independent Executables – or the alternative -fpie of course, implies the use of -fPIC, which in turn means that the two macros noted above are defined; at the same time, two more are defined, __pie__ and __PIE__ with the same values as described in the previous paragraph. Path64 defines none of these four macros when building PIE.

But enough rant about Pathscale, before they feel I’m singling them out (which I’m not). Let’s rant a bit about Clang as well, the only compiler up to now that properly dropped write-only unit-static variables. I had very high expectations for what concerns improving unpaper through its suggestions but.. it turns out it cannot really create any executable, at least that’s what autoconf tells me:

configure:2534: clang -O2 -ggdb -Wall -Wextra -pipe -v   conftest.c  >&5
clang version 2.9 (tags/RELEASE_29/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
 "/usr/bin/clang" -cc1 -triple x86_64-pc-linux-gnu -emit-obj -disable-free -disable-llvm-verifier -main-file-name conftest.c -mrelocation-model static -mdisable-fp-elim -masm-verbose -mconstructor-aliases -munwind-tables -target-cpu x86-64 -target-linker-version 2.21.53.0.2.20110804 -momit-leaf-frame-pointer -v -g -resource-dir /usr/bin/../lib/clang/2.9 -O2 -Wall -Wextra -ferror-limit 19 -fmessage-length 0 -fgnu-runtime -fdiagnostics-show-option -o /tmp/cc-N4cHx6.o -x c conftest.c
clang -cc1 version 2.9 based upon llvm 2.9 hosted on x86_64-pc-linux-gnu
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /usr/bin/../lib/clang/2.9/include
 /usr/include
 /usr/lib/gcc/x86_64-pc-linux-gnu/4.6.1/include
End of search list.
 "/usr/bin/ld" --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o a.out /usr/lib/../lib64/crt1.o /usr/lib/../lib64/crti.o crtbegin.o -L -L/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/../../.. /tmp/cc-N4cHx6.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed crtend.o /usr/lib/../lib64/crtn.o
/usr/bin/ld: cannot find crtbegin.o: No such file or directory
/usr/bin/ld: cannot find -lgcc
/usr/bin/ld: cannot find -lgcc_s
clang: error: linker command failed with exit code 1 (use -v to see invocation)
configure:2538: $? = 1
configure:2576: result: no

What’s going on? Well, Clang doesn’t provide its own crtbegin.o file for the C runtime prologue (while Path64 does), so it relies on the one provided by GCC, which has to be on the system somewhere. Unfortunately, to identify where this file is… they try hitting and missing.

% strace -e stat clang test.c -o test |& grep crtbegin.o
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.2/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.5.2/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.1/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.5.1/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.5/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.5/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.4.5/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.4.5/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.4.4/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.4.4/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.4.3/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.4.3/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.4/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.4/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.3.4/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.3.4/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.3.3/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.3.3/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.3.2/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.3.2/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.3/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.3/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.2.4/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.2.4/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.2.3/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.2.3/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.2.2/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.2.2/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.2.1/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.2.1/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib/gcc/x86_64-pc-linux-gnu/4.2/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/usr/lib64/gcc/x86_64-pc-linux-gnu/4.2/crtbegin.o", 0x7fffc937eff0) = -1 ENOENT (No such file or directory)
stat("/crtbegin.o", 0x7fffc937f170)     = -1 ENOENT (No such file or directory)
stat("/../../../../lib64/crtbegin.o", 0x7fffc937f170) = -1 ENOENT (No such file or directory)
stat("/lib/../lib64/crtbegin.o", 0x7fffc937f170) = -1 ENOENT (No such file or directory)
stat("/usr/lib/../lib64/crtbegin.o", 0x7fffc937f170) = -1 ENOENT (No such file or directory)
stat("/../../../crtbegin.o", 0x7fffc937f170) = -1 ENOENT (No such file or directory)

Yes you can see that it has a hardcoded list of GCC versions that it looks for, from higher to lower, until it falls back to some generic paths (which don’t make that much sense to me to be honest, but nevermind). This means that on my system, that only has GCC 4.6.1 installed, you can’t use clang. This was reported last week and while a patch is available, a real solution is still not there: we shouldn’t be patching and bumping clang each time a new micro version of GCC is released that upstream didn’t list already!

Sigh. While GCC sure has its shortcomings, this is not really looking promising either.

Apple’s HFS+, open-source tools, and LLVM

The title of this post seems a bit messed up, but it’ll make sense at the end. It’s half a recount of my personal hardware trouble and half a recount of my fighting with Apple’s software, and not of the kind my reader hate to read about I guess.

I recently had to tear apart my Seagate FreeAgent Xtreme external HDD. The reasons? Well, beside leaving me without a connection while using it (with Yamato) on eSATA, and forcing me to use either Firewire or USB (both much slower — and I did pay it to use eSATA!), yesterday it decided it didn’t want to let me access anything via either of the three connections, not even after a number of power cycles (waiting for it to cool down as well); this was probably related to the fact that I tried to use it again as eSATA, connected to the new laptop to try copying an already set-up partition from the local drive to make space for (sigh) Windows 7.

Luckily, there was no data worth spending time on, in that partition, just a few GNOME settings I could recreate in a matter of minutes anyway.

Since the Oxford Electronics-based bridge on the device decided not to help me out to get my data back, I decided to break it up, with the help of a Youtube video (don’t say that Youtube isn’t helpful!), and took the drive itself out, which is obviously a Seagate 7200.11 1TB drive, quite a sturdy one to look at it. No I won’t add it at the 7th disk drive to Yamato, mostly because I fear it wouldn’t be able to start up anymore if I did so.

Thankfully, I bought a Nilox-branded “bay” a month or so ago, when I gave away what remained of Enterprise to a friend of mine (the only task that Enterprise was still doing was saving data out of SATA disks when people brought me laptops or PCs that fried up. My choice for that bay was due to the fact that it allows you to plug in both 3.5” and 2.5” SATA disks without having to screw them anywhere. It does look a lot like something out of the Dollhouse set, to be honest, but that doesn’t matter now.

I plugged it in, and started downloading the data; I can’t be sure it is all fine, so I deleted lots and lots of stuff I won’t be safe about for a while. Then I shivered, fearing the disk itself was bad, and that I had no way to check it out… thankfully, the bay uses Sunplus electronics in it, and – lo and behold! – smartmontools has a driver for the Sunplus USB bridge! A SMART test later, and the disk turns out to feel better than any other disk I ever used. Wow. Well, it’s expected as I never compiled on it.

Anyway, what can I do with a 1TB SATA disk I cannot plug into any computer as it is? Well, actually one thing I can do: backup storage. Not the kind of rolling backup I’m currently doing with rsnapshot and the WD MyBook Studio II in eSATA (anything else is just too slow to backup virtual machines), but rather a fixed backup of stuff I don’t expect to be looking at or using anytime soon. But to be on the safe side, I wanted to have it available in a format I can access, on the go, with the Mac as well as from Linux; and vfat is obviously not a good choice.

The choice is, for the Nth time, HFS+. Since Apple has published quite a bit of specs on the matter, the support in Linux is decent, albeit far from being perfect (I still haven’t finished my NFS export patch, it does not support ACLs or extended attributes, and so on). It’s way too unreliable for rsnapshot (with hardlinking) but should work acceptably well for the storage.

The only reason I have not to use it for something I want to rely on, as it is, is that the tools for filesystem creationa nd check (mkfs and fsck) are quite a bit old. I’m not referring to “hfsutils” or “hfsplusutils” both of which are written from scratch and have a number of problems, including but not limited to, shitty 64-bit code. I’m referring to the diskdev_cmds package in Gentoo which is a straight port of Apple’s own code, which is released as FLOSS under the APSL2 license.

Yes, I call that FLOSS! You may hate Apple as much as you wish, but even FSF considers APSL2 a Free Software license albeit with problems; on the other hand they explicitly state this (emphasis mine):

For this reason, we recommend you do not release new software using this license; but it is ok to use and improve software which other people release under this license.

Anyway, I went to Apple’s releases for 10.6.3 software (interestingly they haven’t published yet 10.6.4 which was released just the other day), and downloaded diskdev_cmds, and the xnu package that contains their basic kernel interfaces, and I started working on an autotools build system to make it possible to easily port the code in the future (thanks to git and branching).

The first obstacle, beside the includes obviously changing, was that Apple decided to make good use of a feature they implemented as part of Snow Leopard’s “Grand Central Dispatch”, their “easy” multi-threading implementation (somewhat similar to the concept of OpenMP): “blocks”. Anonymous functions for the C language, an extension they worked in LLVM. So GCC straight is unable to build the new diskdev_cmds. I could either go to fetch an older diskdev_cmds tarball, from Leopard rather than Snow Leopard, where GCD was not implemented yet, or I could up the ante and try to get it working with some other tools. Guess what?

In Gentoo we already have LLVM around, and the clang frontend as well. I decided to write an Autoconf check for blocks support, and rely on clang for the build. Unfortunately it also needs Apple’s own libclosure, that provides some interfaces to work with blocks. And the basis for the GDC interface. It actually resonated a bit when Snow Leopard was presented because Apple released it for Windows as well, with the sources under MIT license (very liberal). Unfortunately you cannot find it in the page I linked above but you have to look at 10.6.2 page for whatever reason.

I first attempted to merge this straight in the diskdev_cmds sources, but then I guessed that it makes more sense to try porting it alone, and make it available, maybe somebody will find some good use for it. Unfortunately the task is not as trivial as it looks. The package needs two very simple functions for “atomic compare and swap” which OS X provides as part of its base library, and so does Windows. On Linux, equivalent functions are provided by HP’s libatomic_ops (you probably have it around because of PulseAudio).

Unfortunately, libatomic_ops does not build, as it is, with clang/LLVM; there is a mistake in the code, or the way it’s parsed; it’s not something unexpected given that inline assembler is a lot compiler-dependent. In this case it’s a size problem: it uses a constraint for integer types (32-bit) but a temporary (and same-sized input) of type unsigned character (8-bit). The second stop is again libatomic_ops’s problem: while it provides an equivalent interface to do atomic compare and swap for long types, it doesn’t do so for int types; that means it works fine on x86 (and other 32-bit architectures where both types are 32-bit) but it won’t do for x86-64 and other 64-bit architectures. Guess what the libclosure code needs?

Now of course it would be possible to lift the atomic operations out of the xnu code, or just write them straight, as libatomic_ops already provides them all, just not correctly-sized for x86-64 but the problem remains that you then have to add a number of functions for the various architecture rather than having a generic interface; xnu provides functions only for x86/x86-64 and PPC (since that’s what Apple uses/used).

And where has this left me now? Well, nowhere far, mostly with a sour feeling about libatomic_ops inability to provide a common, decent interface (for those who wonder, they do provide char-sized inlines for compare and swap for most architecture, and even the int-sized alternatives that I was longing for… but only for IA-64. You wouldn’t believe that until you remembered that the whole library is maintained by HP.

If I could take the time off without risking trouble, I would most likely try to get better HFS+ support in Linux, if only to make it easier and less troublesome for OSX users to migrate to Linux at one point or another. The specs are almost all out there, the code as well. Unfortunately I’m no expert in filesystems and I lack the time to invest on the matter.