Apple’s HFS+, open-source tools, and LLVM

The title of this post seems a bit messed up, but it’ll make sense by the end. It’s half an account of my personal hardware trouble and half an account of my fight with Apple’s software, though not, I guess, of the kind my readers hate to read about.

I recently had to tear apart my Seagate FreeAgent Xtreme external HDD. The reasons? Well, besides dropping the connection while I was using it (with Yamato) over eSATA, and forcing me to use either Firewire or USB (both much slower, and I did pay for it to use eSATA!), yesterday it decided it didn’t want to let me access anything via any of the three connections, not even after a number of power cycles (waiting for it to cool down as well). This was probably related to the fact that I tried to use it over eSATA again, connected to the new laptop, to copy an already set-up partition off the local drive and make space for (sigh) Windows 7.

Luckily, there was no data worth spending time on, in that partition, just a few GNOME settings I could recreate in a matter of minutes anyway.

Since the Oxford Electronics-based bridge on the device decided not to help me get my data back, I decided to break it open, with the help of a YouTube video (don’t say that YouTube isn’t helpful!), and took out the drive itself, which is obviously a Seagate 7200.11 1TB drive, quite a sturdy one by the look of it. No, I won’t add it as the 7th disk drive in Yamato, mostly because I fear it wouldn’t be able to start up anymore if I did so.

Thankfully, I bought a Nilox-branded “bay” a month or so ago, when I gave away what remained of Enterprise to a friend of mine (the only task that Enterprise was still doing was saving data out of SATA disks when people brought me laptops or PCs that had fried). I chose that bay because it allows you to plug in both 3.5” and 2.5” SATA disks without having to screw them anywhere. It does look a lot like something out of the Dollhouse set, to be honest, but that doesn’t matter now.

I plugged it in, and started copying the data off; I can’t be sure all of it is intact, so I deleted lots and lots of stuff I wouldn’t be able to trust for a while. Then I shivered, fearing the disk itself was bad, and that I had no way to check it out… thankfully, the bay uses Sunplus electronics, and – lo and behold! – smartmontools has a driver for the Sunplus USB bridge! One SMART test later, the disk turns out to be in better shape than any other disk I have ever used. Wow. Well, that’s to be expected, as I never compiled on it.

Anyway, what can I do with a 1TB SATA disk I cannot plug into any computer as it is? Well, actually there is one thing I can do: backup storage. Not the kind of rolling backup I’m currently doing with rsnapshot and the WD MyBook Studio II over eSATA (anything else is just too slow to back up virtual machines), but rather a fixed backup of stuff I don’t expect to be looking at or using anytime soon. But to be on the safe side, I wanted to have it in a format I can access on the go, from the Mac as well as from Linux; and vfat is obviously not a good choice.

The choice is, for the Nth time, HFS+. Since Apple has published quite a bit of specs on the matter, the support in Linux is decent, albeit far from being perfect (I still haven’t finished my NFS export patch, it does not support ACLs or extended attributes, and so on). It’s way too unreliable for rsnapshot (with hardlinking) but should work acceptably well for the storage.

The only reason I have not to use it, as it is, for something I want to rely on is that the tools for filesystem creation and checking (mkfs and fsck) are quite old. I’m not referring to “hfsutils” or “hfsplusutils”, both of which are written from scratch and have a number of problems, including, but not limited to, shitty 64-bit code. I’m referring to the diskdev_cmds package in Gentoo, which is a straight port of Apple’s own code, released as FLOSS under the APSL2 license.

Yes, I call that FLOSS! You may hate Apple as much as you wish, but even the FSF considers APSL2 a Free Software license, albeit one with problems; on the other hand they explicitly state this (emphasis mine):

For this reason, we recommend you do not release new software using this license; but it is ok to use and improve software which other people release under this license.

Anyway, I went to Apple’s source releases for 10.6.3 (interestingly, they haven’t yet published those for 10.6.4, which was released just the other day), and downloaded diskdev_cmds and the xnu package that contains their basic kernel interfaces. Then I started working on an autotools build system, to make it possible to easily port the code in the future (thanks to git and branching).

The first obstacle, besides the includes obviously changing, was that Apple decided to make good use of a feature they implemented as part of Snow Leopard’s “Grand Central Dispatch”, their “easy” multi-threading implementation (somewhat similar to the concept of OpenMP): “blocks”. These are anonymous functions for the C language, an extension they worked into LLVM. So stock GCC is unable to build the new diskdev_cmds. I could either go and fetch an older diskdev_cmds tarball, from Leopard rather than Snow Leopard, where GCD was not implemented yet, or I could up the ante and try to get it working with some other tools. Guess what?

In Gentoo we already have LLVM around, and the clang frontend as well. I decided to write an Autoconf check for blocks support, and rely on clang for the build. Unfortunately it also needs Apple’s own libclosure, which provides some interfaces to work with blocks and is the basis for the GCD interface. It actually made a bit of noise when Snow Leopard was presented, because Apple released it for Windows as well, with the sources under the MIT license (very liberal). Unfortunately you cannot find it on the page I linked above; you have to look at the 10.6.2 page, for whatever reason.
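For readers who have never met blocks, here is a minimal sketch of what the syntax looks like. It is not taken from diskdev_cmds: the names apply_all and scaled_sum are made up for illustration. The block branch only compiles with clang’s -fblocks option, which defines the __BLOCKS__ macro, and on Linux it also has to be linked against a blocks runtime such as the one libclosure provides. The plain C branch is just a fallback I added so that the same file still builds with stock GCC.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __BLOCKS__
/* Hypothetical helper: calls the block `fn` on every element and sums
 * the results. `int (^fn)(int)` is a block taking and returning int. */
static int apply_all(const int *v, size_t n, int (^fn)(int))
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += fn(v[i]);
    return sum;
}

int scaled_sum(const int *v, size_t n, int factor)
{
    assert(v != NULL || n == 0);
    /* The block literal captures `factor` from the enclosing scope
     * by value; this is what a plain function pointer cannot do. */
    return apply_all(v, n, ^(int x) { return x * factor; });
}
#else
/* Fallback for compilers without blocks support (e.g. stock GCC). */
int scaled_sum(const int *v, size_t n, int factor)
{
    assert(v != NULL || n == 0);
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += v[i] * factor;
    return sum;
}
#endif
```

Either branch gives the same result to the caller; the difference is only in whether the compiler accepts the ^ syntax, which is the kind of thing a configure-time check would have to find out by trying to compile a similar snippet.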

I first attempted to merge this straight into the diskdev_cmds sources, but then I figured that it makes more sense to port it on its own and make it available; maybe somebody will find some good use for it. Unfortunately the task is not as trivial as it looks. The package needs two very simple functions for “atomic compare and swap”, which OS X provides as part of its base library, and so does Windows. On Linux, equivalent functions are provided by HP’s libatomic_ops (you probably have it around because of PulseAudio).

Unfortunately, libatomic_ops does not build, as it is, with clang/LLVM; there is a mistake in the code, or in the way it’s parsed, which is not unexpected given that inline assembler is very compiler-dependent. In this case it’s a size problem: it uses a constraint for integer types (32-bit) but a temporary (and same-sized input) of type unsigned char (8-bit). The second stop is again a libatomic_ops problem: while it provides an equivalent interface to do atomic compare and swap for long types, it doesn’t do so for int types; that means it works fine on x86 (and other 32-bit architectures where both types are 32-bit) but it won’t do for x86-64 and other 64-bit architectures. Guess what the libclosure code needs?

Now, of course, it would be possible to lift the atomic operations out of the xnu code, or just write them straight, since libatomic_ops already provides them all, just not correctly sized for x86-64. But the problem remains that you then have to add a number of functions for the various architectures rather than having a generic interface; xnu provides functions only for x86/x86-64 and PPC (since that’s what Apple uses/used).
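To make the int-versus-long point concrete, here is a minimal sketch of the two differently sized compare-and-swap operations. It is an illustration only, not what the port uses: it swaps in the GCC-style __sync_bool_compare_and_swap builtin, which both GCC and clang accept on x86 and x86-64, instead of either libatomic_ops or the xnu code. The wrapper names cas_int, cas_long and atomic_increment_int are made up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical wrappers; the names come from neither libclosure nor
 * libatomic_ops. Each stores `desired` into *dst only if *dst still
 * holds `expected`, and reports whether the store happened. */
static inline bool cas_int(volatile int *dst, int expected, int desired)
{
    /* The builtin is generic over the pointed-to type, so this is a
     * 32-bit operation on x86-64 as well as on x86. */
    return __sync_bool_compare_and_swap(dst, expected, desired);
}

static inline bool cas_long(volatile long *dst, long expected, long desired)
{
    /* Same builtin, long-sized: 64-bit on x86-64, 32-bit on x86. */
    return __sync_bool_compare_and_swap(dst, expected, desired);
}

/* Typical use: a retry loop that atomically increments an int counter
 * and returns the new value. */
int atomic_increment_int(volatile int *counter)
{
    int old;
    do {
        old = *counter;
    } while (!cas_int(counter, old, old + 1));
    return old + 1;
}
```

On x86 the two wrappers end up being the same operation, which is why a long-only interface is enough there. On x86-64 they differ in width, and code that keeps its counters in int fields needs the first one.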

And where has this left me now? Well, not very far, mostly with a sour feeling about libatomic_ops’s inability to provide a common, decent interface (for those who wonder, they do provide char-sized inlines for compare and swap for most architectures, and even the int-sized alternatives that I was longing for… but only for IA-64). You wouldn’t believe that until you remembered that the whole library is maintained by HP.

If I could take the time off without risking trouble, I would most likely try to get better HFS+ support into Linux, if only to make it easier and less troublesome for OS X users to migrate to Linux at one point or another. The specs are almost all out there, and so is the code. Unfortunately I’m no expert in filesystems and I lack the time to invest in the matter.