Shortcomings of our multilib/FHS filesystem layout

Flameeyes

20 years ago

Today I’m taking a day off from Gentoo to finish my slides. I really love LaTeX-beamer for this kind of work; I just don’t like doing the work itself. I’m not really the kind of person who creates colourful, graphics-heavy presentations; I’m more of an experience person. Anyway, luckily I’m almost done, and hopefully I can go to the course prepared.

For this reason, today I have nothing to write about changes in progress and similar stuff, but I do have something I have been thinking about a lot in the past months and that I should probably share with the readers of Planet Gentoo and of my blog.

For those who don’t know what multilib is: it’s a technique that allows running two (or more) different and incompatible kinds of userland under the same kernel, provided the kernel is able to run all of them. The classic example is Gentoo/AMD64, which by default comes with a multilib setup that provides both 64-bit and 32-bit libraries, allowing you to run 32-bit binaries, such as OpenOffice or VMware Player, even though the main system is 64-bit.

This is of course an advantage, but it comes with some costs, mainly the need to install everything twice, once for 64-bit and once for 32-bit. Right now Portage does not support so-called true multilib, so we have to use pre-built binary packages containing the 32-bit libraries (the app-emulation/emul-linux-x86-* packages), while glibc and gcc both have multilib-aware build systems, so they are built from source correctly.

But why do I talk of shortcomings? Because we have had a few problems over time; right now in particular we have some problems with Wine and the current eselect-compiler, problems that I actually found and reported quite some time ago and that unfortunately haven’t been solved yet. The cause is that the current support for 32-bit building in Portage is largely a hack that works only when correctly adjusted for the platform; it is not generic enough, in my opinion.

The reason for this, as I suggested in the title, also lies in the FHS, the Filesystem Hierarchy Standard, which we partially support. Following that standard, we have a single binary directory, one lib directory per ABI/userland (“lib” being the main ABI), and share and include directories that contain platform-independent data files and headers.

Yes, because in a perfect world the includes are just the same on all platforms, so one doesn’t have to think about them too much. Unfortunately we all know that’s not true. Start from glibc, which comes with headers that are completely unreadable by humans with all their preprocessor conditionals, yet still differ (quite a lot) from arch to arch; then there are linux-headers, which of course change between architectures as far as the asm directory is concerned. But this is not limited to those two system packages: there are also packages that install autoconf-generated config.h files, which are of course dependent on the userland.

The result is that we have to use a dirty trick, keeping three copies of every glibc header in Gentoo/AMD64: one for the x86 version, one for the x86_64 version, and one file that includes one or the other based on what the compiler defines. This is a mess of its own, and I think two include directories would somehow have been simpler.
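To make the trick concrete, here is a minimal Python sketch that generates the text of the third file, the one that picks the per-ABI copy. The directory names (`gentoo-multilib/x86`, `gentoo-multilib/amd64`) and the function name `dispatch_header` are assumptions made up for illustration, not a description of the real layout; `__i386__` and `__x86_64__` are the macros GCC predefines for those targets.

```python
# Hypothetical layout: per-ABI header copies live in subdirectories.
# The directory names are assumed for illustration only.
ABI_DIRS = {
    "__i386__": "gentoo-multilib/x86",
    "__x86_64__": "gentoo-multilib/amd64",
}


def dispatch_header(header: str) -> str:
    """Return the text of a wrapper header that includes the right per-ABI copy.

    The wrapper relies on macros the compiler predefines for its target
    (__i386__ or __x86_64__ with GCC).
    """
    lines = []
    for index, (macro, subdir) in enumerate(ABI_DIRS.items()):
        keyword = "#if" if index == 0 else "#elif"
        lines.append(f"{keyword} defined({macro})")
        lines.append(f"# include <{subdir}/{header}>")
    # Fail loudly when compiling for an ABI that has no header copy.
    lines.append("#else")
    lines.append('# error "no header copy for this ABI"')
    lines.append("#endif")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(dispatch_header("bits/wordsize.h"), end="")
```

Every installed header needs one such wrapper plus its two real copies, which is where the "three copies" come from.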

But it does not end here. On UNIX systems there’s classically a /usr/libexec directory containing the commands that are called by other software but don’t need to be visible to users, be they normal users or root (commands not visible to users but visible to root are in sbin). In Gentoo that directory is usually replaced by /usr/$libdir/misc, which means that on a multilib setup we have two distinct directories. This works in most cases, but there can be one big flaw in the logic.

Take CUPS, which usually installs all its executable files for filters, backends and similar in /usr/lib/cups. CUPS opens them as pipes; it does not care what they are: compiled binaries, interpreted scripts, Java programs. As long as the system can run them, CUPS does not care. If we were to follow the /usr/$libdir/misc reasoning, as we did for a while, one wouldn’t be able, for instance, to have a 64-bit CUPS working with a Canon iPixma backend (which needs to be compiled 32-bit, as Canon does not release some of their libraries as 64-bit), although that combination supposedly works correctly. For this reason, in the 1.2 series I was able to push for using /usr/libexec for it, even if it “broke” what the base-system team usually does.
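The "executed, not loaded" point can be shown with a small Python sketch. This is not how CUPS is implemented; `run_filter` is a hypothetical helper that only illustrates that a caller talking to a program over a pipe never needs to know what kind of executable sits on the other end.

```python
import subprocess
import sys


def run_filter(argv, data: bytes) -> bytes:
    """Run a helper program, feed it data on stdin, and return its stdout.

    The kernel decides how to start argv[0] (native binary, 32-bit binary,
    script with a shebang line, ...); the caller only sees the pipe.
    """
    result = subprocess.run(argv, input=data, stdout=subprocess.PIPE, check=True)
    return result.stdout


if __name__ == "__main__":
    # Stand-in for a real filter: a tiny interpreted program that
    # upper-cases whatever it reads from stdin.
    helper = [sys.executable, "-c",
              "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    print(run_filter(helper, b"hello"))
```

Nothing in the caller depends on the helper's ABI, which is why a single shared directory for such helpers is enough.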

We need it to share the executables between different ABIs, and this is right: the programs stored in /usr/$libdir/misc are executed, not loaded, so they can be used by different ABIs just the same. The reasoning given to me for keeping the misc directory separate was “it’s a step forward to have two ABIs completely installed on a system”, but that seems like a stupid thing to do to me. We don’t have, nor probably want, two distinct sets of bin directories, because we can exec them as we want. If we really want them to be differentiated, well, then we have to come up with something better than /usr/lib/bin and /usr/lib64/bin or /usr/bin and /usr/bin64.

As far as libexec is concerned, its content should always be available in the best ABI possible, so either the native one or the one that takes the least overhead to use. Think of a Gentoo/FreeBSD AMD64 box: /usr/libexec should have 64-bit FreeBSD binaries, but if for one binary (be it a generic command or a CUPS backend) those are not available, it can fall back to i386 FreeBSD binaries, and if those are not available either, it can install i386 Linux binaries. Of course, being able to use three different ABIs requires three different sets of libraries and three different sets of headers (as a lot of headers change between FreeBSD and Linux), but we can still have only one set of commands, as we can mix and match different ABIs as long as the loader is not invoked and a pipe is used instead.
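The fallback order for the Gentoo/FreeBSD example can be sketched in a few lines of Python. The ABI labels and the `best_abi` function are made up for illustration; nothing like this exists in Portage as far as this post is concerned.

```python
# Preference order for the Gentoo/FreeBSD AMD64 example: native first,
# then the alternatives, cheapest to run first. The labels are made up.
PREFERENCE = ["amd64-fbsd", "x86-fbsd", "x86-linux"]


def best_abi(available, preference=PREFERENCE):
    """Return the most preferred ABI a helper is available for, or None."""
    for abi in preference:
        if abi in available:
            return abi
    return None


if __name__ == "__main__":
    # Hypothetical helpers and the ABIs each one can be built or obtained for.
    helpers = {
        "generic-command": {"amd64-fbsd", "x86-fbsd"},
        "cups-backend": {"x86-linux"},
    }
    for name, abis in helpers.items():
        print(name, "->", best_abi(abis))
```

Each helper is chosen independently, so one /usr/libexec can end up holding a mix of ABIs, while libraries and headers still need one full set per ABI.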

A final alternative would be to have entirely different trees for the different ABIs, like /32bit/{bin,lib} and /64bit/{bin,lib}, with /usr/{32,64}bit stuff and all, but leaving /usr/libexec as a shared point. But I don’t think this is a good idea in the long run.

Anyway, I’d really like Portage to have full true multilib support, but that will probably require quite a lot of work the way things are. I proposed an alternative method that might be simpler to handle, but it requires rewriting almost all the support written so far, so I doubt it makes much sense after all 🙂

Oh well, back to work now that gdb has finally installed on the iBook (I need it for the course), and I can test that gdbserver works as intended. Too bad they (the people I work for) can’t use it, as they work with an embedded ARM whose environment was designed around a very old version of GDB that does not support TCP connections.
