A few more reasons why FatELF is not

Note that the tone of this blog post reflects the Flameeyes of 2010, and does not represent the way I would be phrasing it in 2020 as I edit typos away. Unfortunately WordPress still has no way to skip the social sharing of the link after editing. Sorry, Ryan!

It seems people keep expecting Ryan Gordon’s FatELF to solve all the problems. Today I was told I was being illogical by writing that it has no benefits. Well, I’d like to reiterate that I’m quite sure of what I’m saying here! And even if this is likely going to be a pointless exercise, I’ll try to convey once again why I think that even just by discussing it we’re wasting time and resources.

I have to say, most of the people claiming that FatELF is useful seem to be expert mirror-climbers: they have changed their minds so many times about how it should be used, where it should be used, and which benefits it has, that this post will jump from point to point quite confusingly. I’m sorry about that.

First of all, let me try to make this clear: FatELF is going to do nothing to make cross-arch development easier. If you want easier cross-arch or cross-OS development, you go with interpreted or byte-compiled languages such as Ruby, Python, Java, C#/.NET, or whatever else. FatELF focuses on ELF files, which are produced by the C, C++ and Fortran compilers and the like. I can’t speak for Fortran as it’s a language that I do not know, but C and C++ datatypes are very much specific to the architecture, the operating system, the compiler, heck, even the version of the libraries, both system and third-party! You cannot solve those problems with FatELF, as whatever benefits it has, they can only appear after the build. But at any rate, let’s proceed.
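To make the datatype point concrete, here is a minimal sketch of how the same C-like record ends up laid out under two common data models. The `layout` helper and the `RECORD` definition are hypothetical, and the sketch assumes natural alignment (each field aligned to its own size), which is a simplification of what real ABIs specify.

```python
# Sizes follow the common ILP32 and LP64 data models; real ABIs add more rules.
DATA_MODELS = {
    "ILP32": {"char": 1, "int": 4, "long": 4, "pointer": 4},
    "LP64": {"char": 1, "int": 4, "long": 8, "pointer": 8},
}


def layout(fields, model):
    """Return ({field: offset}, total_size), assuming alignment == size."""
    sizes = DATA_MODELS[model]
    offsets = {}
    offset = 0
    max_align = 1
    for name, ctype in fields:
        size = sizes[ctype]
        # Round the running offset up to the next multiple of the field size.
        offset = (offset + size - 1) // size * size
        offsets[name] = offset
        offset += size
        max_align = max(max_align, size)
    # Pad the tail so that arrays of the record stay aligned.
    total = (offset + max_align - 1) // max_align * max_align
    return offsets, total


# Hypothetical record: struct { char tag; long count; void *next; }
RECORD = [("tag", "char"), ("count", "long"), ("next", "pointer")]

for model in DATA_MODELS:
    print(model, layout(RECORD, model))
```

Under these assumptions the record takes 12 bytes on ILP32 and 24 on LP64, with every field after the first at a different offset, so anything built around it cannot be shared as-is between the two.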

FatELF supposedly makes builds easier, but it doesn’t. If you really think so, you have never tried building something as an Apple Universal Binary. Support for it in autoconf, and most likely in any other build system along those lines, simply sucks. The problem is that whatever result you get from a test on one architecture might not be the same on another. And Apple’s Universal Binary only encompasses one operating system, which was developed without thinking too much about compatibility with others and under the same tight controls, so the tests for APIs are going to be almost identical for all the arches. (You might not know this, but Linux’s syscall numbers are not the same across architectures; the reason is that they were designed to partly maintain compatibility with the operating systems, proprietary and not, that originated on each architecture and were mainstream at the time. So for instance on IA-64 the syscall numbers are compatible with HP-UX, while on SPARC they are compatible with Solaris, for the most part.)
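As a small illustration of the syscall remark, the sketch below tabulates a handful of Linux syscall numbers for three architectures and lists the ones that differ. The architectures are not the ones named above; they are simply ones whose numbers are easy to check against the kernel’s syscall tables, and the `differing_syscalls` helper is hypothetical.

```python
# A few Linux syscall numbers per architecture (from the kernel's syscall tables).
SYSCALL_NUMBERS = {
    "i386": {"exit": 1, "write": 4, "getpid": 20},
    "x86_64": {"exit": 60, "write": 1, "getpid": 39},
    "aarch64": {"exit": 93, "write": 64, "getpid": 172},
}


def differing_syscalls(table):
    """Return the syscall names whose number is not identical everywhere."""
    per_arch = list(table.values())
    common = set.intersection(*(set(numbers) for numbers in per_arch))
    return sorted(
        name for name in common
        if len({numbers[name] for numbers in per_arch}) > 1
    )


print(differing_syscalls(SYSCALL_NUMBERS))  # ['exit', 'getpid', 'write']
```

Any code that embeds these numbers, which includes the C library itself, is tied to one architecture.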

This does not even come close to considering the mess of the toolchain. Of course you could have a single toolchain patched to emit code for a number of architectures, but is that going to work at all? Given that I have actually worked as a consultant building cross-toolchains for embedded architectures, I can tell you that it’s difficult enough to get one working. Count the need for patches for specific architectures, and you might start to get part of the picture. While binutils already theoretically supports a “multitarget” build, which adds in a single build the support for all the architectures they have written code for, doing the same for gcc is going to be a huge mess. Now you could (as I suggested) write a cc frontend that takes care of compiling the code for multiple architectures at a time, but as I said above it’s not easy to ensure the tests are actually meaningful between architectures, let alone operating systems.
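For what it’s worth, the cc frontend idea could start out as something like the following rough sketch. It only builds the command lines and runs nothing; the target triplets, the `fat_cc_commands` name and the output naming scheme are all assumptions made for illustration, relying on the usual `<triplet>-gcc` naming of cross-compilers.

```python
import os

# Hypothetical target triplets; real names depend on how the toolchains were built.
TARGETS = ["x86_64-linux-gnu", "powerpc-linux-gnu"]


def fat_cc_commands(source, targets=TARGETS, flags=("-O2",)):
    """Return one compiler argv per target; nothing is executed here."""
    stem = os.path.splitext(os.path.basename(source))[0]
    commands = []
    for triplet in targets:
        # One object file per architecture, to be glued together afterwards.
        output = f"{stem}.{triplet}.o"
        commands.append([f"{triplet}-gcc", *flags, "-c", source, "-o", output])
    return commands


for argv in fat_cc_commands("src/foo.c"):
    print(" ".join(argv))
```

Note that this does nothing about the real problem: every configure-style test would still have to be run, and give meaningful results, once per target.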

FatELF cannot share data sections. One common mistake when thinking about FatELF is to assume that it only requires duplicating the executable sections (.text), but that’s not the case. Data sections (.data, .bss, .rodata) depend on the data types, which as I said above are architecture dependent, operating system dependent, and even library dependent. They are part of the ABI; each ELF you build for a number of target arches right now is going to have its own ABI, so the data sections are not shared. The best I can think of to reduce this problem is to make use of -fdata-sections and then merge sections with identical content; it’s feasible, but I’m sure that in the best of cases it is going to create a problem with caching of nearby objects, and in the worst it’s going to cause misalignment of data to be read in the same pass. D’uh!
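The “merge sections with identical content” idea can be sketched as below. The section names and contents are made up, and `build_sections` merely stands in for what a compiler invoked with -fdata-sections would emit for a little-endian and a big-endian target.

```python
import struct


def build_sections(byte_order):
    """Fake per-architecture data sections; byte_order is '<' or '>'."""
    return {
        # A C string is byte-for-byte the same everywhere.
        ".rodata.greeting": b"hello\x00",
        # A 32-bit integer constant depends on the byte order.
        ".data.answer": struct.pack(byte_order + "I", 42),
    }


def shareable(per_arch):
    """Names of sections whose bytes are identical on every architecture."""
    arches = list(per_arch.values())
    common = set.intersection(*(set(sections) for sections in arches))
    return sorted(
        name for name in common
        if len({sections[name] for sections in arches}) == 1
    )


per_arch = {"little": build_sections("<"), "big": build_sections(">")}
print(shareable(per_arch))  # ['.rodata.greeting']
```

Even in this toy case only the string survives the merge; anything holding integers, pointers or structures does not.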

Just so you know how variable the data sections are: even though they could just use #ifdef, both the Linux kernel and the GNU C Library (and most likely uClibc as well, even though I don’t have it around to double-check) install different sets of headers depending on the architecture; this should be an indication of how different the interfaces are between them.

Another important note here: as far as I could tell from the specifics that Ryan provided (I really can’t be arsed to look back at them right now), FatELF files are not interleaved, with their sections mixed together, but merged in a sort-of archive, with the loader/kernel deciding which part of it will be loaded as a normal ELF. The reason for this decision likely lies in one teeny-weeny detail: ELF files were designed to be mapped straight from disk to data structures in memory; for this reason, ELF files have classes and data orders. For instance x86-64 uses ELF of class 64 and order LSB (least-significant byte first, or little-endian) while PPC uses ELF of class 32 and order MSB (most-significant byte first, or big-endian). It’s not just a matter of the content of .text; it’s also pervasive in the index structures within the ELF file, and it is so for performance reasons. Having to swap all the data or deal with different sizes is not something you want to do in the kernel loader.
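The class and data order are recorded in the first bytes of every ELF file, in the `e_ident` array. A minimal sketch of reading them follows; the offsets and constant values come from the ELF specification, while the `describe_ident` and `describe_file` helpers are just illustrative names.

```python
ELF_MAGIC = b"\x7fELF"
EI_CLASS, EI_DATA = 4, 5  # indexes into e_ident
CLASSES = {1: "ELFCLASS32", 2: "ELFCLASS64"}
ORDERS = {1: "ELFDATA2LSB", 2: "ELFDATA2MSB"}


def describe_ident(ident):
    """Return (class, data order) from the 16-byte e_ident array."""
    if len(ident) < 16 or ident[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    return (
        CLASSES.get(ident[EI_CLASS], "unknown"),
        ORDERS.get(ident[EI_DATA], "unknown"),
    )


def describe_file(path):
    """Same as describe_ident, reading the header from a file on disk."""
    with open(path, "rb") as handle:
        return describe_ident(handle.read(16))


# Hand-built identification bytes for a 64-bit little-endian ELF.
print(describe_ident(b"\x7fELF\x02\x01\x01" + bytes(9)))
```

On a Linux system, something like `describe_file("/bin/ls")` should report the class and byte order of the host’s own binaries. Every other structure in the file (headers, symbol tables, relocations) has to be read according to those two values.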

When do you distribute a FatELF? This is a tricky question, because both Ryan and various supporters change opinion here more often than they change socks. Ryan said that he was working on getting an Ubuntu built entirely of FatELFs, implying that distributions would be using FatELF for their packaging. Then he said that it wasn’t something for packagers but for Independent Software Vendors (ISVs). Today I was told that FatELF simplifies distribution by shipping a single executable that can be “executed in place” rather than having to provide two of them.

Let’s be clear: distributors are very unlikely to provide FatELF binaries in their packages. Even though it might sound tempting to implement multilib with them, it’s going to be a mess just the same; it might reduce the size of the whole archive of binary packages, because you share the non-ELF files between architectures, but it’ll increase the traffic used to download them, and while disk space is getting cheaper and cheaper, network traffic is still a scarce commodity. Even more so, users won’t like having stuff installed for architectures that they don’t use, and will likely ask for a way to clean it up, at which point they’ll wonder why they are downloading it at all. Please note that while Apple did their best to convince people to use Universal Binary, a number of tools were produced to strip the alternative architectures out of executable files altogether.

Today I was told, as I said, that it is easier to distribute one executable file rather than two. But when do you find yourself doing that at all? Quite rarely; ISVs provide pre-defined packaging, usually in the form of binary packages for the particular distribution (this already makes it a multiple-file download). Most complex software will be in an archive anyway, because you don’t just build everything in but rather put it in different files: icons, images, … Even Java software, which actually uses archives as its main object format (JAR files are ZIP files), ships in archives installing separate data files, multiple JARs, wrapper scripts and so on and so forth. I was also told that you could strip the extra architectures at install time, but if you do so, you might as well decide which of multiple files to install, making it moot to use a fat binary at all.
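Deciding “which of multiple files to install” (or to run) takes very little code. Here is a minimal sketch of the selection step, assuming a hypothetical archive layout of `bin/<machine>/myapp`; the `pick_binary` helper and the `myapp` name are made up for the example.

```python
import platform


def pick_binary(available, machine=None, name="myapp"):
    """Return the path of the build matching this machine's architecture."""
    # platform.machine() reports the architecture name of the running system.
    machine = machine or platform.machine()
    if machine not in available:
        raise RuntimeError(f"no {name} build for {machine}")
    return f"bin/{machine}/{name}"


# Architectures shipped in the (hypothetical) archive.
shipped = {"x86_64", "ppc"}
print(pick_binary(shipped, machine="ppc"))  # bin/ppc/myapp
```

An installer would copy only that one file, and a launcher would exec it; either way nothing in the kernel, loader or toolchain needs to change.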

All in all, I have yet to see one use case that FatELF actually solves better than a few wrapper scripts and an archive. Sure, you can create some straw-man arguments where FatELF works and scripts don’t, such as the “execute in-place” idea above, but really, tell me: when was the last time you needed that? Please also remember that while “changes happen everytime”, we’re talking about changing a number of layers in a particularly invasive way:

  • the kernel;
  • the loader;
  • the C library (separated from the loader!);
  • the compiler, which almost has to be rewritten;
  • the linker, obviously;
  • the tools to handle the files;
  • all the libraries that change API among architectures.

Even if all of this became stock, there’s a huge marginal cost here, and it’s not going to happen anytime soon. And even if it did, how much time is it going to take before it gets mainstream enough to be used by ISVs? There are some that still support RHEL3.

There are “smaller benefits”, as, again, I was told before, and those are not nothing. Maybe that’s the case, but the question is “is it worth it?”. Once in the kernel it is not going to take much work at runtime, but is it worth the marginal cost of implementing all that stuff and maintaining it? I most definitely don’t think so. I guess the only reason why Apple coped with that is that they had most of the logic already developed and lying around from when they transitioned from M68K to PowerPC.

I’m sorry to burst your bubbles, but FatELF was an ill-conceived idea that is not going to gain traction for a very good reason: it makes no sense! All of the use cases I have read up to now are straw men that resemble either what OSX does or what Windows does. But Linux is neither. Now, let’s move on, please?