Note that the tone of this blog post reflects the Flameeyes of 2010, and does not represent the way I would phrase it in 2020 as I edit typos away. Unfortunately WordPress still has no way to skip the social sharing of the link after editing. Sorry, Ryan!
Seems like people keep on expecting Ryan Gordon’s FatELF to solve all the problems. Today I was told I was being illogical by writing that it has no benefits. Well, I’d like to reiterate that I’m quite sure of what I’m saying here! And even if this is likely going to be a pointless expedient, I’ll try to convey once again why I think that even just by discussing it we’re wasting time and resources.
I have to say, most of the people claiming that FatELF is useful seem to be expert mirror-climbers: they have changed their minds so many times about how it should be used, where it should be used, and which benefits it has, that this post will jump from point to point quite confusingly. I’m sorry about that.
First of all, let me try to make this clear: FatELF is going to do nothing to make cross-arch development easier. If you want easier cross-arch or cross-OS development, you go with interpreted or byte-compiled languages such as Ruby, Python, Java, C#/.NET, or whatever else. FatELF focuses on ELF files, which are produced by the C, C++ and Fortran compilers and the like. I can’t speak for Fortran as it’s a language that I do not know, but C and C++ data types are very much specific to the architecture, the operating system, the compiler, heck, even the version of the libraries, both system and third-party! You cannot solve those problems with FatELF: whatever benefits it has can only appear after the build. But at any rate, let’s proceed.
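Before proceeding, a quick illustration of how architecture-dependent those C types are. This is a minimal sketch of my own (not something from the FatELF proposal): compiled for a typical 32-bit (ILP32) target and for x86-64 (LP64), it reports different type sizes, so any structure built from those types has a different layout per architecture.

```c
#include <stdio.h>

/* Illustrative only: the sizes printed depend entirely on the target ABI.
 * On a common 32-bit Linux target, long and pointers are 4 bytes each;
 * on x86-64 they are 8 bytes, so struct sample has a different size and
 * layout depending on which architecture it was compiled for. */
struct sample {
    long counter;
    void *pointer;
};

int main(void) {
    printf("sizeof(long)          = %zu\n", sizeof(long));
    printf("sizeof(void *)        = %zu\n", sizeof(void *));
    printf("sizeof(struct sample) = %zu\n", sizeof(struct sample));
    return 0;
}
```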
FatELF supposedly makes builds easier, but it doesn’t. If you really think so, you have never tried building something as an Apple Universal Binary. Support in autoconf, and most likely in any other build system along those lines, simply sucks. The problem is that the result you get from a test on one architecture might not be the same on another. And Apple’s Universal Binary only encompasses a single operating system, one that has been developed without thinking too much about compatibility with others and under the same tight controls, where the tests for APIs are going to be almost identical for all the arches. (You might not know it, but Linux’s syscall numbers are not the same across architectures; the reason is that they were actually designed to partly maintain compatibility with the proprietary (and non-proprietary) operating systems that originated on each architecture and were mainstream at the time. So for instance on IA-64 the syscall numbers are compatible with HP-UX, while on SPARC they are, for the most part, compatible with Solaris.)
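To make the syscall point concrete, here is another tiny sketch of mine (again, not from the original post): the same source embeds a different system call number depending on the target it is compiled for, which is one of the reasons a built binary is tied to a single ABI.

```c
#include <stdio.h>
#include <sys/syscall.h>  /* SYS_getpid expands to an architecture-specific number */

int main(void) {
    /* The printed value differs per target: for example it is 20 on i386
     * and 39 on x86-64, because each architecture keeps its own table. */
    printf("SYS_getpid on this target: %d\n", (int)SYS_getpid);
    return 0;
}
```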
This does not even come close to considering the mess of the toolchain. Of course you could have a single toolchain patched to emit code for a number of architectures, but is that going to work at all? Given that I have actually worked as a consultant building cross-toolchains for embedded architectures, I can tell you that it’s difficult enough to get one working. Add the need for patches for specific architectures, and you might start to get part of the picture. While binutils already, theoretically, supports a “multitarget” build that enables, in a single build, all the architectures they have written code for, doing the same for gcc is going to be a huge mess. Now you could (as I suggested) write a cc frontend that takes care of compiling the code for multiple architectures at a time, but as I said above it’s not easy to ensure the tests are actually meaningful between architectures, let alone operating systems.
FatELF cannot share data sections. One common mistake when thinking about FatELF is to assume that it only requires duplication of executable sections (.text), but that’s not the case. Data sections (.data, .bss, .rodata) depend on the data types, which as I said above are architecture dependent, operating system dependent, and even library dependent. They are part of the ABI; each ELF you build for one of a number of target arches right now is going to have its own ABI, so the data sections are not shared. The best I can think of, to reduce this problem, is to make use of -fdata-sections and then merge sections with identical content; it’s feasible, but I’m sure that in the best of cases it’s going to create problems with the caching of nearby objects, and at worst it’s going to cause misalignment of data that has to be read in the same pass. D’uh!
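As a small illustration of why those sections differ (my own sketch, under the same ILP32/LP64 assumption as above): even a constant table ends up with different contents, padding and size per architecture, because the types it is built from change.

```c
#include <stdio.h>

/* This constant table ends up in a read-only data section, but its bytes
 * are not portable: the width of 'long', the padding inserted after
 * 'flag', and the byte order of every field all depend on the target ABI,
 * so a 32-bit little-endian build and a 64-bit big-endian build produce
 * different section contents from the very same source. */
static const struct entry {
    char flag;
    long value;
} table[] = {
    { 'a', 1L },
    { 'b', 2L },
};

int main(void) {
    printf("entries: %zu, total bytes: %zu\n",
           sizeof(table) / sizeof(table[0]), sizeof(table));
    return 0;
}
```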
Just so you know how variable the data sections are: even though they could just use #ifdef, both the Linux kernel and the GNU C Library (and most likely uClibc as well, even though I don’t have it around to double-check) install different sets of headers per architecture; this should be an indication of how different the interfaces are between them.
Another important note here: as far as I could tell from the specifics that Ryan provided (I really can’t be arsed to look them up again right now), FatELF files are not interleaved, mixing their sections, but merged into a sort-of archive, with the loader/kernel deciding which part of it will be loaded as a normal ELF. The reason for this decision likely lies in one tiny detail: ELF files were designed to be mapped straight from disk into data structures in memory; for this reason, ELF files have classes and data orders. For instance x86-64 uses ELF of class 64 and order LSB (least-significant byte first, or little-endian) while PPC uses ELF of class 32 and order MSB (most-significant byte first, or big-endian). It’s not just a matter of the content of .text: the class and byte order are pervasive in the index structures within the ELF file, and they are so for performance reasons. Having to swap all the data, or deal with different sizes, is not something you want to do in the kernel loader.
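For reference, this is roughly where that information lives; a minimal sketch of mine (not part of the FatELF patches) that reads the identification bytes at the start of an ELF file, where the class and byte order are declared before any architecture-specific structure has to be parsed:

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Dump the class and byte order of an ELF file. These live in the first
 * few identification bytes precisely so a loader knows how to interpret
 * everything that follows without any byte swapping. */
int main(int argc, char **argv) {
    unsigned char ident[EI_NIDENT];
    FILE *f;
    size_t got;

    if (argc < 2 || !(f = fopen(argv[1], "rb"))) {
        fprintf(stderr, "usage: %s <elf-file>\n", argv[0]);
        return 1;
    }
    got = fread(ident, 1, EI_NIDENT, f);
    fclose(f);
    if (got != EI_NIDENT || memcmp(ident, ELFMAG, SELFMAG) != 0) {
        fprintf(stderr, "not an ELF file\n");
        return 1;
    }

    printf("class: %s\n", ident[EI_CLASS] == ELFCLASS64 ? "ELF64" : "ELF32");
    printf("data:  %s\n", ident[EI_DATA] == ELFDATA2MSB
                              ? "MSB (big-endian)"
                              : "LSB (little-endian)");
    return 0;
}
```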
When do you distribute a FatELF? This is a tricky question, because both Ryan and his various supporters change opinion here more often than they change socks. Ryan said that he was working on getting an Ubuntu built entirely of FatELF binaries, which suggested that distributions would be using FatELF for their packaging. Then he said that it wasn’t something for packagers but for Independent Software Vendors (ISVs). Today I was told that FatELF simplifies distribution by providing a single executable that can be “executed in place”, rather than having to provide two of them.
Let’s be clear: distributors are very unlikely to provide FatELF binaries in their packages. Even though it might sound tempting to implement multilib with them, it’s going to be a mess just the same; it might reduce the size of the whole archive of binary packages, because the non-ELF files would be shared between architectures, but it’ll increase the traffic needed to download them, and while disk space keeps getting cheaper and cheaper, network traffic is still a rare commodity. Even more so, users won’t like having stuff installed for architectures they don’t use, and will likely ask for a way to clean it up, at which point they’ll wonder why they are downloading it at all. Please note that while Apple did their best to convince people to use Universal Binaries, a number of tools were written just to strip the alternative architectures out of the executable files altogether.
Today, as I said, I was offered the argument that it is easier to distribute one executable file rather than two. But when do you find yourself doing that at all? Quite rarely; ISVs provide pre-defined packaging, usually in the form of binary packages for a particular distribution (which already makes it a multiple-file download). Most complex software will be in an archive anyway, because you don’t just build everything into the executable but rather split it into different files: icons, images, … Even Java software, which actually uses archives as its main object format (JAR files are ZIP files), ships in archives installing separate data files, multiple JARs, wrapper scripts and so on and so forth. I was also told that you could strip the extra architectures at install time, but if you do so, you might as well decide which of multiple files to install, making it moot to use a fat binary at all.
All in all, I have yet to see one use case that is actually solved better by FatELF than by a few wrapper scripts and an archive. Sure, you can create some straw-man arguments where FatELF works and scripts don’t, such as the “execute in place” idea above, but really, tell me: when was the last time you needed that? Please also remember that while “changes happen all the time”, we’re talking about changing, in a particularly invasive way, a number of layers:
- the kernel;
- the loader;
- the C library (separated from the loader!);
- the compiler, which almost has to be rewritten;
- the linker, obviously;
- the tools to handle the files;
- all the libraries that change API among architectures.
Even if all of this became stock, there’s a huge marginal cost here, and it’s not going to happen anytime soon. And even if it did, how much time would it take before it became mainstream enough to be relied upon by ISVs? There are some that still support RHEL3.
There are “smaller benefits”, as, again, I was told before, and those are not nothing. Maybe that’s the case, but the question is “is it worth it?”. Once in the kernel, it’s not going to take much work at runtime, but is it worth the marginal cost of implementing all that stuff and maintaining it? I most definitely don’t think so. I guess the only reason Apple coped with it is that they had most of the logic already developed and lying around from when they transitioned from M68K to PowerPC.
I’m sorry to burst your bubble, but FatELF was an ill-conceived idea that is not going to gain traction, for a very good reason: it makes no sense! All of the use cases I have read up to now are straw men that resemble either what OSX does or what Windows does. But Linux is neither. Now, can we move on, please?
I agree. If you want to build a cross-platform application, use any of the many languages. C/C++ are here because they are fast and can go down to the bones, not for cross-platform execution.
This debate cannot be closed until we no longer have to ask users what their architecture is. Asking for such a pointless and dumb technical detail to make software work is just crude and unacceptable to many.

But the people who support FatELF don’t want a perfect solution that will correctly address all the problems and corner cases magically; they want a possibility, for the ones who worry about this and want to make the extra effort to fit this solution into their closed use case (e.g. distributing a game to 32/64-bit Ubuntu users in a single, indistinguishable way without the nightmares of scripting languages), to achieve a goal that, sadly, needs a rework that goes deep into the operating system.

But still, my goal, as a developer, is to satisfy my users in any possible way, with no compromise. Your goals may be different, your measure of quality may have other priorities, but I see no reason why a technological problem should be a burden for my users in any way.
uname -m and the if statement is all you need to know about scripting.
I’m fairly certain Diego has written more on the topic of FatELF than I have at this point. :)

I’ve resisted replying each time you’ve brought it up, but since you _do_ keep bringing it up, and take some unnecessary swipes at me every time (“drama queen,” indeed), I wanted to pop in and give a few clarifications, for what it’s worth.

- FatELF was _never_ intended to make building easier. Your claim about autoconf vs Universal Binaries has some merit, but to be fair, next-generation build tools handle this better, and Mac OS X’s popularity has helped clean up this problem across many open source packages. I would argue in the case of autoconf, the problem is autoconf. Perhaps this attitude sounds heretical, but it’s really just one more aggravation with that specific build tool in a long list of aggravations, which is why packages outside the official GNU Project seem to be migrating to CMake, SCons, etc.
- I wouldn’t describe the FatELF changes as invasive. The kernel (etc) continues to work as it did, taking one extra branch (and not many more lines of code) for a FatELF header. If it sees an ELF binary, everything works as normal. The kernel changes can work without the glibc changes (but you lose FatELF shared libraries and FatELF dlopen()), which can work without the binutils changes (but you lose the ability to link against FatELF libraries at build time), which can work without the compiler, etc. I can’t see a way where the changes could have been more minimal, or changes could be introduced more incrementally.
- You are correct, FatELF binaries don’t share data sections, for the exact reasons you listed: alignment, byte order, and mmap()ing. Not only are the standard ELF headers different between architectures, but you can’t ever rely on the actual binary’s data sections to match at all. I made a reasonable assumption that most binaries don’t have an enormous amount of data in them, and it wasn’t worth the effort to try and merge a few matching bits. The kernel and glibc changes to support something like that would _definitely_ qualify as invasive. That FatELF doesn’t share data between records isn’t a serious flaw in my mind. For what it’s worth, Apple’s Universal Binaries don’t, either.
- I have changed my socks, but not my stance: you ship FatELF binaries when it makes sense to do so. I gave several examples and was pretty clear in saying that what works for one person doesn’t work for everyone… this is the Linux way, after all. The VMware virtual machine with two Ubuntu installs glued together was meant to be a proof-of-concept on a large scale. You _could_ ship an entire distro with every binary being a FatELF (after all, this is precisely what Mac OS X does), but _should_ you? I would say, no, probably not. But maybe we should ship all the libraries that Ubuntu currently ships as ia32 “compatibility” packages this way, at a minimum. Maybe the guy with the USB stick full of software (an example laughed down on the kernel mailing list, despite the popularity of repackaging SINGLE ARCH Windows software as “portable apps” that can run without installation) doesn’t care about _any_ of his system-wide Linux software being FatELF, so long as he can launch his own FatELF apps from whatever workstation he sits down at.
- Apple provides a utility to strip unnecessary arches from Universal Binaries (and so does FatELF). The hardcore can certainly use that. Apple, however, didn’t have to try hard to “convince” people to use them. People just did because it wasn’t a big deal and everything just worked. If we’re seriously talking about straw men, I just simply don’t see this resistance to slightly bigger files. ELF binaries are simply not the bulk of the average hard drive’s contents. Hard drives are getting bigger, cheaper and faster all the time, and so is bandwidth. There are exceptions to the rules, of course, but third world countries transmitting RPM packages by carrier pigeon should complain about the bloat of OpenOffice in general before they complain about OpenOffice having two ELF binaries in the file.
- How long would it take to get mainstream? I don’t know if I have a good answer for that. Before ISVs could count on FatELF support? At least a year or two if it went into the kernel right now. But how long before you could count on anything we come to count on? (say, dbus? a new version of gtk+? System calls for thread affinity?) Time solves all these; it’s not a reason to NOT do something. The good thing about FatELF is that it doesn’t rely on a complete cutover, like a.out to ELF more or less did. If a distro starts using it to solve some compatibility problems, then it is useful as soon as they start using it and it doesn’t need anything external to catch up. Immediate win.
- You (and lots of other people) keep referring to FatELF as a code maintenance nightmare, like it mangles multiple kernel subsystems. It doesn’t. It’s a small, clean solution, with more documentation than code, and I’m confident that anyone could become intimately familiar with it in minutes if I were to be hit by a bus. You would spend an order of magnitude more time learning the ELF code in the kernel or glibc than you would spend learning the FatELF portions of it.
- Where do people use execute-in-place? I use it for installers for downloadable games that I ship. It’s not my fault that Linux distros couldn’t agree on a single package format, and I’m not rolling .debs, .rpms, whatever else plus a tarball to catch the rest of the distros. That’s an enormous amount of developer time on my part to make sure that works well for each project, and multiple confusing downloads for end users. If we did a fully “fat” system like my proof-of-concept virtual machine, that’d be a whole lot of execute-in-place. If you have some software on a USB drive or a network share, there’s some more.
- The shell script thing keeps coming up. Sure, you can launch your app with a shell script and choose the right binary, but, um, you’re launching a WHOLE SHELL INTERPRETER to run two lines of code. Also, it can’t predict that it should launch the “i686” binary on an “x86_64” machine, and if it knows about that, what happens when someone launches a CPU named “asdjas923r1” in the distant future that has an x86 compatibility layer? The shell script fails, even though the machine could have run the existing binary. Also, the shell script won’t let you dlopen() the right files at runtime. Also, do you really want a shell script for every binary if you wanted to ship a complete “Fat” distro? The shell script idea is ghetto, and it only “works” (and I use that term loosely) for third party software.
- You keep using the term “straw man” for “ideas that aren’t personally interesting to me” … there are perfectly legitimate scenarios where a shell script isn’t a workable solution (and many more where it’s simply not an _elegant_ solution), but you hand-wave these away as “straw man” arguments. This is true for several other points, too. Your existing Linux system works great for you, but it doesn’t mean it works perfectly for everyone. To suggest that no one has presented a scenario where fat binaries are useful, when Apple is quite successfully selling an entire OS of fat binaries for 129 dollars a copy (and, in the case of the iPhone 4 and iPad, selling a fat binary system literally as fast as they can produce them) is simply willful ignorance.
- Apple coped with fat binaries because they had a need to, of course, but what we now refer to as “Universal Binaries” was baked into the OS back in the early days of NextStep… and having it available to them has saved their asses on _three_ occasions so far (PPC to PPC64, x86, and x86-64… this isn’t even counting Mac Classic 68k/ppc, NextStep’s multi-arch support, or some unknown but no less inevitable iPhone CPU change in the future). Their needs and problems are not exactly aligned with Linux’s, of course, but surely you can see some overlap?

Finally, I’d like to thank you for understanding this tech more than most people (both for and against FatELF) bothered to, and say that I’m sorry if people are harassing you about your opinion; you’ve said as much in several blog posts and tweets now. You aren’t voicing outrageous opinions that haven’t been said by many others, and I think it’s unfortunate if people are shouting you down without considering your points. It’s a feeling I’ve experienced myself.

–ryan.
icculus@icculus.org
Thank you for answering this time, Ryan!

I guess the reason why I keep bringing up the subject is that I do understand the need to polish quite a few things that you touch on with FatELF – especially those that people expect to magically improve, and that you know as well as I do won’t – but I don’t think this solution will work that well.

My reason for describing the changes as invasive is not so much the kernel changes, but rather the fact that you have to change *all* the layers to accomplish something. You say that Apple has a tool to split the files (yes, I knew that); I’ll add another thing that you also know, that @lipo@ also lets you stitch two together… I guess if I were to think about how to tackle the problem, I would have ended up with just a @lipo@-clone and a special loader (or even a MISC-based loader) in the kernel.

The big problem in my opinion still lies largely in the build; whether you like it or not, autoconf is not going away just because SCons and CMake have a number of defects. I know autoconf well enough both to make "good use":https://autotools.info/ of it and to know its vulnerabilities — this is one of them, and not even the worst. Plus making GCC behave is going to hurt, a lot. I guess this is one big problem with the kind of people who seem to enjoy your idea: they expect it to “magically” solve the cross-build, cross-testing problem.

And while I accept that you have a clear-cut good use for FatELF – the games – most of the _other_ people seem to make things up as they go to justify wanting this around. The latest I heard was a system administrator wanting to upload and execute a command over a heterogeneous network of boxes… I guess even you would agree that this is a job for scripting languages rather than universal binaries. I counted you in the “changed opinion” camp because of the conflicting statements about distributions.

I have to say that I agree you shouldn’t be providing packages for the various distributions; I’d argue that it is the distributions that should take care of making your software available to users… of course it’s not as easy, but as far as I can tell, Gentoo should be set up quite well in your regard, am I wrong? More upstreams should take the right steps to provide eventual changes for the distributions to pick up properly; this was the reason for my "three-article series":http://lwn.net/Articles/274763 two years ago. That’s an area we have to improve on both sides of the fence, and it would itself probably reduce the need for users to download your packages directly. Even better would be to add a URL scheme that allows firing up the package manager for the distribution, something like @yum://$repo/package?install@ — I have to say I don’t know if there is anything at all like that for other distributions; I’m quite sure there is not for Gentoo.

But I still have doubts regarding the execute-in-place usage that is the core of your idea, so let me try to articulate them properly:

- As you said, ELF files are but a percentage of a system’s disk usage, as they are now. Also, the data that you _could_ share between any two of them is very limited, and it’s not worth messing up the format any more than it is (and Tux help us, the ELF format, especially as extended by GNU, is a mess). On the other hand, applications usually ship with a number of data files… I’m quite sure your games do as well: textures, levels, models, … does that allow you to have a single file to run the game at all? Or does it rather call for a more complex install/unpack phase, which could take care of running an arch-independent script that decides which ELF files to install?
- Actually, more interesting than their Universal Binaries, I would look at Apple’s application Bundles; you have a pre-defined directory structure that data is poured into… they do execute those in place, but they don’t have a single file to load; the reason why I said that I’m surprised Apple went with fat binaries at all is that, were it up to me, I would have just changed the code that launches the bundle to load a different binary, depending on the architecture, from within that structure — I can guess that Terminal users might have been among the reasons why they didn’t go with that option, and another would be the fact that they had already developed and successfully used fat binaries at other times, as you also said. But wouldn’t it make more sense for your usage to have GNOME and KDE applications handle similar bundles? Then the code to select the architecture to launch on any host would lie within a single userland application, rather than having to be replicated in scripts (see the sketch at the end of this comment). I’ll hand-wave the @dlopen()@ problems right now; I could dig into them, but it’s honestly a bit too long for a comment.

A whole new page should be dedicated to discussing the problem within the specific realm of x86 and amd64… I still don’t think FatELF is the proper solution, but I agree we have an absolute mess there; the “compatibility layers” are thin as hell in the best of cases, or simply hacks, as is the case for Gentoo. Solaris (on SPARC, at least) has a *much* more complete approach, unsurprisingly. On the other hand, I know that at least a few people expect x86 to disappear from the face of the earth in the next year or so, and have been expecting that for the past five years… while I would rejoice at the notion, I disagree with its feasibility. To be honest, the reason why I’m enticed into thinking about FatELF (but still turning it down) is most likely this, as I know how difficult this is to pull off properly.

What it comes down to, in the end, is the marginal cost. Moving toward FatELF requires too much work, in my opinion, at least right now. And it does not solve a number of other problems that we’ve been having both in distributions and for final users: ABI stability, multi-architecture testing, build system defects, … maybe there’ll come a time when we can actually have a new file format, built from scratch, that supports this, but I think this is the wrong moment, and the wrong method.
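As a footnote, here is a minimal sketch of mine of where that selection logic could live (the @bin/<machine>/@ bundle layout is invented for the example; this is not an existing GNOME or KDE facility): one small userland helper picks the per-architecture binary inside a bundle-like directory and execs it.

```c
#include <stdio.h>
#include <unistd.h>
#include <sys/utsname.h>

/* Hypothetical bundle launcher: pick the per-architecture binary shipped
 * inside a bundle-like directory (e.g. Game.bundle/bin/x86_64/game) and
 * exec it. The directory layout is made up for this sketch; the point is
 * that the architecture choice lives in one helper instead of being
 * replicated in per-application shell scripts or in a fat binary. */
int main(int argc, char **argv) {
    struct utsname info;
    char path[4096];

    if (argc < 3) {
        fprintf(stderr, "usage: %s <bundle-dir> <program> [args...]\n", argv[0]);
        return 1;
    }
    if (uname(&info) != 0) {
        perror("uname");
        return 1;
    }

    snprintf(path, sizeof(path), "%s/bin/%s/%s", argv[1], info.machine, argv[2]);
    execv(path, &argv[2]); /* only returns on failure */
    perror(path);
    return 1;
}
```

Of course Ryan’s objection still applies: @uname@ alone won’t tell an x86-64 host that it could also run an i686 build, so a real helper would need a small fallback table; but that knowledge would then sit in one place rather than in every wrapper script.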
As I understand it, a main advantage of FatELF is for third parties: it makes it easier for users to get the right download.

It may sound strange, but it *is* challenging for many users to decide whether they need the 64-bit or the 32-bit version. Why? Because they don’t know if they have a 64-bit or a 32-bit system, and *they should not need to*!

In GNU/Linux the problem is far smaller than in Windows, because we do almost all things with a package manager which can decide for us what we need (just say „this program“) – or which automatically compiles the correct version. But as GNU/Linux moves more and more into the general consumer area, it is great to have the option to just offer one big download button.

I know this sounds evil: “sacrifice disk space and network traffic, so we can display one download button instead of two and have the program be startable with one click”, but that’s something which would offer quite some benefit to casual users.

If you distribute a game and 5% of your users leave without trying it, because they can’t figure out which version to use (or because the version they downloaded doesn’t run), then that’s a huge lossage, because you have 5% disgruntled potential users. And every single user who says “doesn’t work for me” is bad press.
As it turns out, many packages contain architecture-specific data files. I point to the .mo files that remove a second, large axis from the matrix of distributable packages: localizations. To work with FatELF, you’ve got four bad choices:

- Have the package-initialization code install the correct .mo files for the architecture when the package installs or first runs.
- Have program-initialization code select among the .mo files to enable the ones for the current architecture.
- Forget gettext and bind the language strings into the binaries. This means you’ll have L language versions of your package rather than A arch-specific versions. Recall that for most packages, L is much larger than A.
- Make your own modification to gettext to make it choose the correct .mo files at the intersection of the architecture version and the language version. You’ll also need a FatELF version of your new gettext libraries.

Most of these solutions multiply the number of .mo files in your package by A ✕ L.

I’m sure that I have exhausted neither all the other ways that packages might use architecture-specific non-executable binary files nor the ways that packages handle internationalization.

I’ll bet that the internationalization issue is not so big for Apple because of its distribution model. Like other commercial OS vendors, Apple sells its operating systems only in single-language versions. That means they already have to support L versions of their packages. Commercial application developers follow this same model. In GNU-land, by contrast, all L versions already come in each package. Depending on the package and distribution, the installer will install only the language versions the user wants to use.

And all of this trouble to run games? I really don’t understand what’s so bad about running the system shell when you start your program. Lots of programs run Bash as part of their startup.

Finally, my own bit of fluff. Aren’t elves supposed to be light, tiny, and lithe? Who ever heard of a fat elf?
Take a look at some of the high-profile packages like Firefox, Adobe Reader, Java, or Google Chrome. They’ve got a more difficult problem in that they have to support multiple executable-file architectures, not just ELF. Most of them guess at the architecture based on the browser’s user-agent string; all of them present (or display after one more click) the whole list of targets. Most of the time the user-agent string does a good-enough job of selecting the target.

There are plenty of commercial installers which run a little stub installer that downloads the larger pieces as needed. This installs only the correct parts.

Open-source package installers could also use the stub-installer approach. The installer that the user gets when clicking the download would be a Bash script plus a set of statically linked wget executables for different architectures. The script would examine the environment and choose both the correct wget executable and the set of installation archives to download. The script would also be able to play nice with the system’s package manager – a feat which many of the click-here installers don’t manage.

Now, no obese mythical beings and no expense (in time, bandwidth, and disk space) for extraneous content. Also – only one download button.
Mike, I cannot go into the (correct and good) details you provided right now, as I’m going to sleep in a moment. I would just like to add that the problem of localisation has actually been attacked already by both Microsoft and Apple… they have solved it, even if not in extremely great ways. In the case of Microsoft, though, the different-architecture point of view itself presents much less of a problem, as they are (mostly) mono-architectural.

Apple, though, does *not* sell OS X in different versions for each language; the DVD that ships with any Mac, or that you buy as an upgrade, ships with all the supported languages… and the same is true for Vista Ultimate (and some versions of Vista as distributed by vendors such as Dell), with a few limitations.

But the whole localisation support requires a different post, I guess…
Arne, if a user doesn’t know whether he’s running 64-bit or 32-bit, then he should only be using the package manager. Linux might not have all the security issues that Windows has, but it isn’t foolproof.