What could have been. A time travel story of x32 and FatELF

I have been toying with the idea of writing about this for a week or two now, and I discussed it with Luca as well, but my main issue this time has been coming up with a title. The original title I had in mind was a Marvel-style What if… “… x32 and FatELF arrived nine years ago”. But then Randall beat me to it with his hypothetical question answering site.

So let’s start with some background: in the past I criticised Ryan’s FatELF idea, and more recently Intel’s x32. But would I have criticised them the same way if the two of them had come up almost together nine years ago? I don’t think so, and this is the main issue with these kinds of ideas: you need perfect timing or they are not worth much. Both of them came out at the wrong time, in my opinion.

So here’s my scenario that didn’t happen and can’t happen now.

It’s 2003, AMD just launched their Opteron CPU, the first to sport the new x86-64 ISA. Intel, after admitting to the failure of their Itanic Itanium project, releases IA-32e following suit. At the same time, though, they decide that AMD’s route of pure 64-bit architecture for desktop and servers is going to take too long to be production-ready, especially on Linux, as even just discussing multilib support is making LSB show its frailty.

They thus decide to introduce a new ABI, called x32 (in contrast with x64 used by Sun and Microsoft to refer to AMD’s original ABI). At the same time they decide to hire Ryan Gordon to push forward a sketchy idea he proposed about supporting Apple-style fat binaries in Linux; the idea was originally ignored because nobody expected any use out of a technique used in 1994 to move from M68k to PowerPC and then left to die with no further use.

The combined energy of Intel and Ryan comes up with a plan on how to introduce the new ABI, and thus the new architecture, in an almost painless way. A new C library version is to be introduced in Linux, partially breaking compatibility with the current libc.so.6, but this is for the greater good.

The new version of the C library, glibc 3.0, will bring a libc.so.7 for the two architectures, introducing a few incompatibilities in the declarations. First of all, the optional largefile support is being dropped: both ABIs will use only 64-bit off_t, and to avoid the same kind of apocalyptic feeling of Y2K, they also decided to use 64-bit clock_t types.

These changes make a very slight dent in the usual x86 performance, but this is not really visible in the new x32 ABI. Most importantly, Intel wanted to avoid the huge work required to port to either IA-64 or AMD64 by creating an ILP32 ABI, where int, long and void * are all 32-bit. And here’s where Ryan’s idea comes to fruition.
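To make the ILP32 idea a bit more concrete, here is a minimal C sketch that classifies a data model from the sizes of int, long and pointers. The names `classify_data_model` and `enum data_model` are made up for illustration and are not part of glibc or any ABI document. `_FILE_OFFSET_BITS` is the real glibc switch for largefile support, the very option that the fictional glibc 3.0 would make unnecessary.

```c
#define _FILE_OFFSET_BITS 64   /* real glibc macro: request a 64-bit off_t on 32-bit x86 */
#include <assert.h>            /* static_assert (C11) */
#include <stddef.h>
#include <sys/types.h>

/* With largefile support enabled, off_t is 64-bit on x86, x32 and amd64 alike. */
static_assert(sizeof(off_t) == 8, "off_t is expected to be 64-bit here");

/* Hypothetical names for the data models discussed in the post. */
enum data_model { MODEL_ILP32, MODEL_LP64, MODEL_LLP64, MODEL_OTHER };

/* Classify a data model given the sizes (in bytes) of int, long and pointers. */
static enum data_model classify_data_model(size_t int_sz, size_t long_sz, size_t ptr_sz)
{
    if (int_sz == 4 && long_sz == 4 && ptr_sz == 4)
        return MODEL_ILP32;   /* classic x86, and x32 */
    if (int_sz == 4 && long_sz == 8 && ptr_sz == 8)
        return MODEL_LP64;    /* amd64 on Linux */
    if (int_sz == 4 && long_sz == 4 && ptr_sz == 8)
        return MODEL_LLP64;   /* 64-bit Windows */
    return MODEL_OTHER;
}
```

Calling `classify_data_model(sizeof(int), sizeof(long), sizeof(void *))` should give `MODEL_ILP32` for both an x86 and an x32 build, and `MODEL_LP64` for a regular amd64 Linux build, which is the whole point of the scenario: the first two look the same to most C source code.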

Source code written in C will compile identically between x86 and x32, and thanks to the changes, aligning the size of some primitive standard types, even more complex data types will be identical between the two. The new FatELF extended format introduced by glibc 3.0 leverages this: original x86 code will be emitted in the .text section of the ELF, while the new code will live in .text32; all the data, string and symbol tables are kept in a single copy only. The dynamic loader can then map one or the other section depending on whether the CPU supports 64-bit instructions and all the dependencies are available on the new ABI.
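The loader decision described above can be sketched in a few lines of C. This is purely an illustration of the fictional design: no real dynamic loader works this way, and `struct fat_dep`, `pick_code_section` and `section_name` are hypothetical names. Only the section names .text and .text32 and the libc.so.7 soname come from the scenario itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which copy of the code the (fictional) loader maps. */
enum code_section { CODE_X86, CODE_X32 };

/* Hypothetical description of one dependency of the binary being loaded. */
struct fat_dep {
    const char *soname;   /* e.g. "libc.so.7" */
    bool has_x32_code;    /* does this library also carry a .text32 section? */
};

/* Section names as used in the scenario. */
static const char *section_name(enum code_section s)
{
    return s == CODE_X32 ? ".text32" : ".text";
}

/*
 * Pick the x32 code only if the CPU can run 64-bit instructions and
 * every dependency provides x32 code too; otherwise fall back to x86.
 */
static enum code_section pick_code_section(bool cpu_has_64bit,
                                           const struct fat_dep *deps,
                                           size_t ndeps)
{
    assert(deps != NULL || ndeps == 0);

    if (!cpu_has_64bit)
        return CODE_X86;

    for (size_t i = 0; i < ndeps; i++) {
        if (!deps[i].has_x32_code)
            return CODE_X86;   /* one x86-only library keeps the whole process on x86 */
    }
    return CODE_X32;
}
```

The all-or-nothing check is the important design choice here: since the two ABIs are not binary compatible at the code level, a process cannot mix .text and .text32 code, so a single library without x32 code sends everything back to x86.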

Intel seems to have turned the tables under AMD’s nose with their idea, thanks to the vastly negative experience with the Itanium: the required changes to the compiler and loader are really minimal, and most of the software will just build on the new ABI without any extra change, thanks to maintaining most of the data sizes of the currently most widespread architectures (the only changes are off_t behaving like largefile is enabled by default, and then clock_t that got extended). Of course this still requires vast porting of assembly-ridden software such as libav and most interpreter-based software, but all of this can easily happen over time thanks to Ryan’s FatELF design.

Dissolve effect running

Yes, too bad that Intel took their dear time to enter the x86-64 market, and even longer to come up with x32, to the point where now most of the software is already ported, and supporting x32 means doing most of the work again. Plus, since they don’t plan on making a new version of the C library available on x86 with the same data sizes as x32, the idea of actually sharing the ELF data and overhead is out of the question (the symbol table as well, since x86 still has the open/open64 split, which in my fantasy is actually gone!). And Ryan’s own implementation of FatELF was a bit of an over-achiever, as it doesn’t actually share anything between one architecture and the other.

So unfortunately this is not something viable to implement now (it’s way too late), and it’s not something that was implemented then — and the result is a very messed up situation.

12 thoughts on “What could have been. A time travel story of x32 and FatELF”

  1. So, why do you think people are dealing with x32? Just to have fun, or what else forces people to move to the 32-bit way?


  2. As I said in my previous post, it really feels like Intel is trying to hide their bad design in the Atom, and Google probably has enough narrow-enough use cases where x32 makes a difference. If you notice, Intel’s own design admitted that x32 is a fit for a closed circuit system, and my refuting the validity of their benchmark is based on the idea of x32 having that kind of improvement on real world desktop and server systems.

     People have pointed out in LWN comments that there are certain (again, narrow) use cases for which pointer chasing represents a vast proportion of the time spent, but that’s not your average desktop. If anything, I think people right now are hyped by Intel’s boast of 42% improvement (over a very outlandish benchmark).

     After all, those “people” are currently composed of people enthusiastic about something they don’t even fully understand, together with people from Intel (H. J. Lu) and Google (Mike). I honestly don’t see so many people actually interested in supporting x32 any time soon on average usage patterns.


  3. In my experience, people choosing x86 on amd64 computers have always been misguided.

     At least x32 takes the good things from amd64 and still lets them have their “pointers take less space” thingy. But I really don’t want to support that shit.


  4. What I can say about this is that 2 Gigs of RAM was perfectly OK for my development desktop system, where I ran a browser with a lot of open tabs (including Flash), Eclipse, Skype and such. Then I switched to 64 bits and all the memory suddenly disappeared. I used all the memory available and very often swap became full (something like 1 Gig of swap space).

     So I could see that difference when I switched from 32 bits to 64 bits on the same machine. However, I haven’t made any specific calculations; I only noticed the total lack of free memory on my machine.


  5. Here are my personal calculations with 2 desktop applications which I have been able to get to work on x32:

     USER   PID   PPID  C START TT     TIME     RESMEM VIRTUAL PRI S COMMAND COMMAND
     devsk  568   32618 0 20:11 pts/15 00:00:00 39828  229516  19  S eog     eog
     root   32737 28261 0 20:10 pts/26 00:00:00 16920  31796   19  S eog     eog

     USER   PID   PPID  C START TT     TIME     RESMEM VIRTUAL PRI S COMMAND        COMMAND
     root   20268 19031 0 20:52 pts/26 00:00:00 14952  40012   19  S gnome-terminal gnome-terminal
     devsk  20637 32618 0 20:57 pts/15 00:00:00 36896  302612  19  S gnome-terminal gnome-terminal

     Those are substantial memory savings. Over the whole set of desktop apps, I will end up saving a lot of RAM. A 2GB system felt great before I jumped to amd64. Not anymore. This could solve that.


  6. _A 2GB system felt great before I jumped to amd64. Not anymore. This could solve that._

     So, instead of spending ~20$ for an extra 2GB, it’s better to introduce a new ABI for x86-64 Linux and break every possible compatibility with everything? I still prefer to buy more RAM; that solves the problem.


  7. @equilibrium: Yes, it is easier for all of us computer owners (~10^9) to spend ~20$ on extra 2GB ram, for a total of just 20,000,000,000$.


  8. equilibrium: I am tired of listening to the “upgrade your RAM” argument. There are people who cannot buy new laptops or PCs and are stuck with laptops and PCs which cannot take more than 2GB of RAM or even 1GB of RAM. There are too many people like that. You know, the non-geek, non-rich, normal joe folks who don’t buy the iPad every time a new version of it gets released.

     In today’s world of use-and-throw-six-months-cycle, the planet earth can use x32… :-)


  9. And I’m honestly tired of *people speaking without a clue* and expecting x32 to solve all the world’s problems. If you really expect x32 to cut the memory usage _that_ much you’re either naïve or an idiot, seriously.

     You know why your average amd64 desktop uses more RAM than an x86? Because you want to use Skype, or Firefox (bin), or whatever other 32-bit application, which means you’re loading most libraries _twice_ thanks to multilib. And it’s going to be the same because _x32 is not binary compatible with x86_.

     You know what “today’s world” needs? That people like you don’t end up in decision-making positions.


  10. Talk of flames… :)

      I just provided you numbers above. Did you forget to look at those? x32 reduces the memory footprint to half. And those processes don’t incur the multilib overhead you talk of. They are pure 64-bit and x32 processes with all native libraries as dependencies.

      But it seems like you have taken a position and you won’t budge from it even if a truck hit you with a load full of info. Anyways, good luck with all that anger management… :)


  11. Did you look at your own numbers?

      According to them, x32 is supposed to reduce virtual RAM usage more than 7x for an image viewer! 2x would be the theoretical maximum you could reasonably expect (ignoring serious mis-designs/bugs), and even that only in corner cases.

      For an image viewer, which should have a very, very tiny amount of data in pointer or long types, anything close to 2x memory savings doesn’t pass a basic sanity check (even RESMEM is significantly above that 2x); a 7x claim, for me, rings the “fanboy too blinded to see any contradictions” bell.

      At the very least, it means those numbers are worthless for this argument, unless you can come up with a proper explanation _and_ that explanation is linked to some intrinsic property of x32 (rather than some application or OS misdesign that can and needs to be fixed regardless of x32).


  12. Hi Diego (Flameeyes),

      Nice job taking the time to check on this at this early stage.

      However, knowing how defensive companies are, and also how techies/Linux users love to play with new stuff, I don’t think your posts will have that much effect at this point.

      People need to see and evaluate stuff themselves. So they will go forward with this anyway.

      I guess people want that one platform to finally get rid of 32 vs 64. Not sure if x32 will be it, so I will also wait and see.

      So a year (or two) from now, people will either prefer this architecture over the old ones or you’ll be able to tell them “I told you so”. Either way, most people need to make their own mistakes and learn the hard way.

      But it’s all good, it’s part of the journey, cheers!

