The why and how of RPATH

This post is brought to you by a conversation with Fabio, which actually reminded me of an older conversation I had with someone else (exactly who, right now, escapes me) about the ability to inject RPATH values into already-built binaries. I’m sorry to have forgotten who asked me about that; I hope he or she won’t take it badly.

But before I get to the core of the problem, let me try to give a brief introduction to what we’re talking about here, because jumping straight into talking about injection is going to be too much for most of my readers, I’m sure. This whole topic also reconnects with multiple other smaller topics I discussed in the past on my blog, so you’ll see a few links here and there for that.

First of all, what the heck is RPATH? When using dynamic linking (shared libraries), the operating system needs to know where to look for the libraries an executable uses; to do so, each operating system has one or more PATH-like variables, plus possibly some configuration files, that are used to look up the libraries. On Windows, for instance, the same PATH variable that is used to find commands is used to load libraries, and the directory where the executable resides is searched first of all. On Unix, commands and libraries use distinct search paths, by design, and the executable’s directory is not searched; this is also because the two directories are quite distinct (/usr/bin vs /usr/lib, as an example). The GNU/Linux loader (and here the name is proper, as the loader comes out of GLIBC — almost identical behaviour is to be expected from uClibc, FreeBSD and other OSes, but glibc’s is the one I know for sure) differs extensively from the Sun loader; I say this because I’ll introduce the Sun system later.

In your average GNU/Linux system, including Gentoo, the paths where libraries are looked up are defined in the /etc/ld.so.conf file; prepended to that list, the LD_LIBRARY_PATH variable is used. (There is a second variable, LIBRARY_PATH, that tells the gcc frontend to the linker where to look for libraries to link against at build time, rather than to load — update: thanks to Fabio who pointed out that I had originally named the wrong variable; LDPATH is instead used by the env-update script to set the proper search paths in the ld.so.conf file.) All the executables will look in all these paths both for the libraries they link to directly and for non-absolute dlopen() calls. But what happens with private libraries — libraries that are shared only among a small number of executables coming from the same package?

The obvious choice is to install them normally, in the same path as the general libraries; this does not require playing with the search paths at all, but it causes two problems: the build-time linker will still find them at link time, which might not be what you want, and it increases the number of files present in a single directory (which means accessing its content slows down, little by little). The common alternative approach is installing them in a sub-directory that is specific to the package (automake already provides a pkglib installation class for this type of library). I already discussed and advocated this solution so that internal libraries are not “shown” to the rest of the software on the system.

Of course, adding the internal library directories to the global search path also means slowing down the loading of libraries, as more directories are searched when looking them up. To solve this issue, rpath was introduced first, and later refined into runpath. DT_RPATH is a .dynamic attribute that provides further search paths for the object it is listed in (it can be used both for executables and for shared objects); the newer DT_RUNPATH list is searched in-between the LD_LIBRARY_PATH environment variable and the paths from the /etc/ld.so.conf file, while the older DT_RPATH is consulted even before LD_LIBRARY_PATH. Among those paths there are two special cases: $ORIGIN is expanded to be the path of the directory where the object is found, while both empty and . values in the list refer to the current work directory (the so-called insecure rpath).
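For reference, this is a minimal sketch of mine (not part of any tool discussed here) of how those entries can be inspected programmatically with elfutils' libelf; error handling is kept to the bare minimum:

/* print-rpath.c: print the DT_RPATH/DT_RUNPATH entries of an ELF file.
 * Build with: gcc print-rpath.c -o print-rpath -lelf (elfutils' libelf) */
#include <err.h>
#include <fcntl.h>
#include <gelf.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
  if (argc != 2)
    errx(1, "usage: %s <elf-file>", argv[0]);
  if (elf_version(EV_CURRENT) == EV_NONE)
    errx(1, "libelf is out of date");

  int fd = open(argv[1], O_RDONLY);
  if (fd < 0)
    err(1, "open");
  Elf *elf = elf_begin(fd, ELF_C_READ, NULL);
  if (elf == NULL)
    errx(1, "elf_begin: %s", elf_errmsg(-1));

  Elf_Scn *scn = NULL;
  while ((scn = elf_nextscn(elf, scn)) != NULL) {
    GElf_Shdr shdr;
    if (gelf_getshdr(scn, &shdr) == NULL || shdr.sh_type != SHT_DYNAMIC)
      continue;

    Elf_Data *data = elf_getdata(scn, NULL);
    for (size_t i = 0; i < shdr.sh_size / shdr.sh_entsize; i++) {
      GElf_Dyn dyn;
      if (gelf_getdyn(data, i, &dyn) == NULL)
        break;
      if (dyn.d_tag == DT_RPATH || dyn.d_tag == DT_RUNPATH)
        /* sh_link of .dynamic points to its string table (.dynstr) */
        printf("%s: %s\n",
               dyn.d_tag == DT_RPATH ? "DT_RPATH" : "DT_RUNPATH",
               elf_strptr(elf, shdr.sh_link, dyn.d_un.d_val));
    }
  }

  elf_end(elf);
  close(fd);
  return 0;
}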

Now, while rpath is not a panacea, and it also slightly slows down the loading of the executable, it should have decent effects, especially by not requiring further symlinks to switch among almost-equivalent libraries whose ABI is not that stable. It also gets very useful when you want to build software you’re developing so that it loads your special libraries rather than the system ones, without relying on wrapper scripts like libtool does.

To create an RPATH entry, you simply tell it to the linker (not the compiler!); so for instance you could add to your ./configure call the string LDFLAGS=-Wl,-rpath,/my/software/base/my/lib/.libs to build and run against a specific version of your library. But what about an already-linked binary? The idea of using RPATHs also makes for nicer handling of binary packages and their dependencies, so there is an obvious advantage in having the RPATH editable after the final link took place… unfortunately this isn’t as easy. While there is a tool called chrpath that allows you to change an already-present RPATH, and especially to delete one (it comes in handy to resolve insecure rpath problems), it has two limitations: the new RPATH cannot be longer than the previous one, and you cannot add a new one from scratch.

The reason is that the .dynamic entries are fixed-sized; you can remove an RPATH by setting its entry’s type to NULL, so that the dynamic loader no longer acts on it; you can edit an already-present RPATH by changing the string it points to in the string table; but you can extend neither .dynamic nor the string table itself. This reduces the usefulness of RPATH for binary packages to almost nothing. Is there anything we can do to improve on this? Well, yes.

Ali Bahrami at Sun already solved this problem in June 2007, which means just over three years ago. They did it with a very simple trick that could be implemented entirely in the build-time linker, without having to change even a bit of the dynamic loader: they added padding!

The only new definition they had to add was DT_SUNW_STRPAD, a new entry in the .dynamic table that gives you the size of the padding space at the end of the string table; together with that, they added a number of extra DT_NULL entries to the same .dynamic table. Since all the entries in .dynamic are fixed-sized, a DT_NULL can become a DT_RPATH without a problem. Even if some broken software might expect the DT_NULL entries at the end of the table (which would be wrong anyway), you just need to keep them at the end. All ELF software should ignore the .dynamic entries it doesn’t understand, as long as they are in the reserved OS-specific ranges, at least.
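To make the idea concrete, here is a rough sketch of mine of what a post-link tool could do with that padding; the function and its parameters are hypothetical, and it assumes you already mmap()ed the file read-write and located both the .dynamic array and the dynamic string table (whose leftover padding a DT_SUNW_STRPAD-style entry would report):

#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch, 64-bit ELF assumed: store a new path in the string
 * table padding and turn the terminating DT_NULL into a DT_RUNPATH entry.
 * The entry right after it must be one of the spare DT_NULLs, so that the
 * .dynamic list stays properly terminated after the conversion. */
static bool add_runpath(Elf64_Dyn *dyn, size_t dyn_count,
                        char *strtab, size_t strtab_used, size_t strtab_pad,
                        const char *path) {
  size_t len = strlen(path) + 1;
  if (len > strtab_pad)
    return false;                  /* not enough padding for the string */

  size_t i = 0;
  while (i < dyn_count && dyn[i].d_tag != DT_NULL)
    i++;
  if (i + 1 >= dyn_count || dyn[i + 1].d_tag != DT_NULL)
    return false;                  /* no spare DT_NULL slot to give up */

  memcpy(strtab + strtab_used, path, len);  /* append the path to .dynstr */
  dyn[i].d_un.d_val = strtab_used;          /* offset of the new string */
  dyn[i].d_tag = DT_RUNPATH;                /* and only then flip the tag */
  return true;
}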

Unfortunately, as far as I know, there is no implementation of this idea in the GNU toolchain (GNU binutils is where ld comes from). It shouldn’t be hard to implement, as I said; it’s just a matter of emitting a defined amount of zeros at the end of the string table, and adding a new .dynamic tag with its size… the same rpath command from OpenSolaris would probably be usable on Linux after that. I considered porting this myself, but I have no direct need for it; if you develop proprietary software for Linux, or need this feature for deployment, though, you can contact me for a quote on implementing it. Or you can do it yourself, or find someone else.

And now the s★★t hits the (MySQL) fans…

My Italian readers might have read an “article” of mine dated back in 2005 (yes, it’s a PDF… at the time I had no blog, I only posted a couple of things around, and this one was written in LaTeX; one day I’ll probably replace it with a blog page instead). In that article I put into words my feeling that the heavy dependency of the Free Software community on MySQL would become a problem in the long run.

My reason for writing that was that the dual licensing would most likely have caused gripes and problems if patches weren’t accepted for release under the dual license, and that a fork of MySQL could easily become incompatible with MySQL itself, causing interoperability problems between versions. No, I didn’t write this last week and back-date it; there are witnesses who read it, and disagreed, back then. On the other hand, I feel like I nailed the problem quite well.

Indeed, with the acquisition of Sun by Oracle, quite a few people seem to worry about the destiny of MySQL, and at least one major fork has started.

Now, I have quite a few technical gripes with MySQL; partly because I had to fight hard with it when it was still at version 3 and it was quite limited compared to PostgreSQL, partly because I recently had to try fixing its definitely stupid build system based on autotools (calling that autotools is quite an offence, in my opinion). So my usual setup makes use of PostgreSQL instead (both this blog and xine’s Bugzilla use it). I unfortunately had to deal with MySQL in the recent past as well, but that’s fortunately gone now. So I’m not really worried myself about what’s going to happen.

I guess one thing that I should be glad about is that we’re no longer in 2005. Back at the time, most of the web applications in use were written very specifically to work on MySQL. If MySQL had been forked back then, like it’s happening now, we would probably have had a major problem on our hands. Nowadays, most of the applications, for good or bad, use ORM libraries that tend to be written in such a way that the underlying database is not called directly (I say “for good or bad” because sometimes I get to hate ORMs quite badly; but on the whole I can see why the abstraction is necessary — no, I don’t think that we should all be learning how a CPU works to write software; while that helps, it’s no longer strictly necessary).

Anyway, I hope that whatever the outcome, this is going to be a lesson for everybody out there never to rely on any specific piece of code: there is never anything telling you what upstream might end up doing… be it killing his wife or selling out to probably the biggest company working in the same software area.

Knowing your loader, or why you shouldn’t misuse env.d files

I’m not sure why it happens, but almost every time I’m working on something, I find something different that diverts my attention, providing me with more insight on what is going wrong with software all over. This is what causes my TODO list to grow continuously rather than stopping for a while.

Today, while I went on working on my dependency checker script (which for now does not even exist, but for which I added a few more functions to Ruby-Elf; you can check them out if you wish), I started noticing some interesting problems with the library search path of my tinderbox chroot.

The problem is that I have 47 entries in the /etc/ld.so.conf file, which means 47 paths that need to be scanned, plus two more entries created by env.d files through LD_LIBRARY_PATH, and yet I still find bundled libraries with the Portage bash trigger that my collision detection script misses. This made me understand there is more to it than I had been seeing up to now, and got me to dig deeper.

So I checked out some of the entries in the ld.so.conf file; some of them are due to multiple slots of the same library for different library versions, like QCA does. This is one decent way to handle the problem: putting two libraries differing just by soname in different directories, and passing the proper -L flag to get one or the other. A much saner alternative would be to ensure that two libraries with different APIs get different versioned names, just like glib 2.0 gets a libglib-2.0.so name, but that’s a different point altogether.

Some other entries are more worrisome. For instance, the cuda packages install all their libraries in /opt/cuda/lib, but the profiler also installs Qt4 libraries there, causing them to appear in the system library list, even though with lower priority than the copies installed by the Qt ebuilds. Similar things happen with the binary packages of Firefox and Thunderbird, since I have the two of them together with xulrunner, adding their library paths to the system library path.

Instead of doing this, there are a few packages that install their libraries outside the library path, and then use a launcher script whose main use is setting the LD_LIBRARY_PATH variable to make sure the loader can find their libraries. This is what, for instance, /usr/bin/kurso does. Indeed, if you start the kurso (closed-source) executable directly you’re presented with a failure:

# /opt/kurso/bin/kurso3
/opt/kurso/bin/kurso3: symbol lookup error: /opt/kurso/bin/kurso3: undefined symbol: initPAnsiStrings

The symbol in question is defined in the libborqt library as provided by Kurso itself, which is loaded through dlopen() (and thus appears in neither ldd nor scanelf -n output, just to remind you of this problem). I could now write a bit about the amount of software written using Kylix and the amount of copies of libborqt on the system, each with its own copy of libpng and zlib, but that’s beside the point for now. The problem is that it indeed does not work by just starting it directly, and thus a wrapper file is created.
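If you want to see why ldd and scanelf -n are blind to this, a minimal example of mine is enough; the library loaded here is libm just for illustration, but the mechanism is the same as with libborqt:

/* dlopen-demo.c: libm is loaded at runtime through dlopen(), so the
 * resulting executable has no NEEDED entry for it; ldd and scanelf -n
 * will only show libdl (and libc), exactly like the kurso/libborqt case.
 * Build with: gcc dlopen-demo.c -o dlopen-demo -ldl */
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
  void *handle = dlopen("libm.so.6", RTLD_NOW);
  if (handle == NULL) {
    fprintf(stderr, "dlopen: %s\n", dlerror());
    return 1;
  }

  double (*cosine)(double) = (double (*)(double))dlsym(handle, "cos");
  if (cosine == NULL) {
    fprintf(stderr, "dlsym: %s\n", dlerror());
    return 1;
  }

  printf("cos(0) = %f\n", cosine(0.0));
  dlclose(handle);
  return 0;
}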

These wrappers are a necessary evil for proprietary closed-source software, but they are quite obnoxious when I see them for Free Software that just gets built improperly; that’s the case both for the binary packages of Firefox and similar, and for the source packages, since my current /usr/bin/firefox contains this:

#!/bin/sh
export LD_LIBRARY_PATH="/usr/lib64/mozilla-firefox"
exec "/usr/lib64/mozilla-firefox"/firefox "$`"

This has a minor evil in it (it overwrites my current LD_LIBRARY_PATH settings) and a nastier one: the programs it calls in turn will also get the same value, unless it’s unset later on. In general this is not a tremendously bright idea, since we do have a method, the use of DT_RUNPATH in ELF files, that allows us to tell an executable to find its libraries in a directory that is not in the path. This is tremendously useful, and should really be used much more, since setting LD_LIBRARY_PATH also means skipping over the loader cache to find the libraries for all the processes involved, rather than just for the (direct) dependencies of a single binary. You can see that this has been a problem for Sun too.

For what it’s worth, the OpenSolaris linker has been providing special padding space for two years now to allow changing the runpath of executable files you cannot rebuild (like the kurso binary above). It’s too bad that we have to deal with software lacking such padding, because it would be quite useful here too.

Is this all? Not really, no, since you also get env.d entries for packages like quagga that install some libraries in a sub-directory of the standard library directory and then add that to the set of scanned directories. I haven’t investigated it much yet, since this blog post was intended as just a warning call, and I’ll probably try to work further on this to make it work better; but in general, if your package installs internal “core” libraries, which you rightly don’t want in the standard library path (to avoid polluting the search path), you should not add their directory to ld.so.conf through environment files, but should rather use the -rpath option during link to tell the executable to find the library in a different path.

For developers: if your package does some funny thing with its libraries and you’re not sure whether you’re handling it properly, may I just ask you to please tell me about it? I’d be glad to look into it, and eventually post a full explanation of how it works on the blog so that it can be used as a reference in the future. I also have a few very nasty issues to blog about regarding Firefox and xulrunner, but I’ll save those for another time.

The road to a Free Java is still very long

If you’ve been reading my blog for a while, you know I was really enthusiastic about Sun opening the JDK so that we could have a Free Java. This should have solved both the Java trap and the Java crap (the false sense of portability among systems), and I was really looking forward to it.

When Sun finally made the OpenJDK sources available, the Gentoo Java team was already in the field, and we were the first distribution packaging it, albeit in an experimental overlay that most users wouldn’t want to try. I also tried to do my best to improve the situation by submitting build system fixes to Sun itself to get it to build on my terms, which means system libraries, --as-needed support and so on. The hospital set me back so I couldn’t continue my work to have something working, so in the end I gave up. Too bad.

After I came home I discovered that the IcedTea idea seemed to work fine, and the project was really getting something done. Cool! But it wasn’t ready for prime time yet, so I decided to wait; I tried getting back on track last summer, but the hospital set me back again, so I decided not to stick around too much, being out of the loop.

But since I stopped using Konqueror (with the rest of KDE) and moved to Firefox, I’m missing Java functionality, since I’m on AMD64 and I don’t intend to use the 32-bit Firefox builds. So I decided to check out IcedTea6, based on OpenJDK 6 (that is, the codebase of the 1.6 series of the Sun JDK, which should be much more stable). IcedTea6 actually got releases out: 1.2, 1.3, and now 1.3.1. Even though Andrew seems to be optimistic, this is not working just yet.

First problem: while OpenJDK only bootstrapped with the Sun JDK 1.7 betas, IcedTea6 only bootstraps with IcedTea itself or with another GNU Classpath-based compiler, like gcj or cacao. Okay, so I merged gcj-jdk and used that one; IcedTea6 fails when I force --as-needed through the compiler specs. Since the build system is quite too much of a mess for me, I didn’t want to try fixing that just yet, I just wanted it to work, so I disabled that and restarted the build. With gcj-jdk it fails to build because of a problem with source/target settings. Okay, I can still use cacao.

The first problem with cacao is that gnu-classpath does not build with the nsplugin USE flag enabled, since it does not recognise xulrunner-1.9; there is a bug open for that but no solution just yet. I disable that one and cacao builds, although IcedTea6 then fails later on with an internal compiler error. Yuppie!

And this is not even the end of the odyssey; I want to get Freemind to work on Yamato, since I’ve tried it out and it really seems cool, but I can get it to work only on OS X for now: to build some of its dependencies on Gentoo I need a 1.4 JDK, but on AMD64 there is no Sun JDK 1.4, no Blackdown, just Kaffe… and Kaffe 1.1.7 at that (the latest is 1.1.9), which is not considered anything useful or usable (and indeed it fails to build the dependency here).

I think the road is still very long, and very tricky. And I need to get a Java minion to help me find out what the heck the problem is!

Ruby-Elf and multiple compilers

I’ve written about supporting multiple compilers; I’ve written about testing stuff on OpenSolaris and I have written about wanting to support Sun extensions in Ruby-Elf.

Today I wish to write about the way I’m currently losing my head trying to get the Ruby-Elf testsuite to apply to other compilers besides GCC. Since I’ve been implementing some new features for missingstatic upon request (by Emanuele “exg”), I decided to add some more tests, in particular considering the Sun and Intel compilers, which I decided to support for FFmpeg at least.

The new tests not only apply the already-present generic ELF tests (rewritten and improved, so that I can extend them much more quickly) to files built with ICC and Sun Studio under Linux/AMD64, but also add tests to check the nm(1)-like code on a catalogue of different symbols (a reduced sketch of such a catalogue follows the list below).

The results are interesting in my view:

  • Sun Studio does not generate .data.rel sections; it only generates a single .picdata section, which is not divided between read-only and read-write (which might have bad results with prelinking);
  • Sun Studio also emits uninitialised non-static TLS variables as common symbols rather than in .tbss (this sincerely sounds like a mistake to me!);
  • the Intel C Compiler enables optimisation by default;
  • it also optimises out unused static symbols even at -O0;
  • and even with __attribute__((used)) it optimises out static uninitialised variables (both TLS and non-TLS);
  • oh, and it adds a “.0” suffix to the name of unit-static data symbols (I guess to discern between them and function-static symbols, which usually have a code after them);
  • and last but not least: ICC does not emit a .data.rel section, nor a .picdata section: everything is emitted in the .data section. This means that if you’re building something with ICC and expect cowstats to work on it, you’re out of luck; but it’s not just that, it also means that prelinking will not help you at all to reduce memory usage, just a bit to reduce startup time.
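The catalogue itself boils down to a handful of C declarations along these lines; this is just a reduced sketch of mine, not the actual test file, and it assumes a compiler that understands GCC-style attributes:

/* Reduced sketch of a symbol catalogue to feed to gcc, icc and sunc99 and
 * then compare with nm(1) or readelf; not the actual Ruby-Elf test file. */

__thread int tls_uninit;                    /* non-static TLS, uninitialised */
static __thread int tls_static_uninit;      /* unit-static TLS, uninitialised */

int common_candidate;                       /* may become a common symbol */
static int static_uninit __attribute__((used));     /* .bss, unless dropped */
static int static_init __attribute__((used)) = 42;  /* .data */

/* a pointer needing a relocation: .data.rel (or .picdata) material when PIC */
static const char *relocated_ptr __attribute__((used)) = "hello";

int exported_function(void) {               /* regular .text symbol */
  return static_init + common_candidate;
}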

Fixing up some stuff for Sun Studio was easy, and now cowstats will work fine even on Sun Studio-compiled source code; taking care of ICC’s quirks, not so much, and it also meant wasting time.

On the other hand, there is one new feature in missingstatic: it now shows the nm(1)-like symbol code near the symbols that are identified as missing the static modifier; this way you can tell whether it’s a function, a constant, or a variable.

And of course, there are two man pages: missingstatic(1) and cowstats(1) (DocBook 5 rulez!) that describe the options and some of the workings of the two tools; hopefully I’ll write more documentation in the next weeks and that’ll help Ruby-Elf get accepted and used. Once I have enough documentation about it I might actually decide to release something. — I’m also considering the idea of routing --help to man like git commands do.

Ruby-Elf and Sun extensions

I’ve written in my post about OpenSolaris that I’m interested in extending Ruby-Elf to parse and access Sun-specific extensions, that is, the .SUNW_* sections of ELF files produced under OpenSolaris. Up to now I only knew the format, and not even that properly, of the .SUNW_cap section, which contains hardware and software capabilities for an object file or an executable, but I wasn’t sure how to interpret it.

Thanks to Roman, who sent me the link to the Sun Linker and Libraries Guide (I did know about it, but I lost the link quite a long time ago and then forgot it existed), I now know some more things about Sun-specific sections, and I’ve already started implementing support for them in Ruby-Elf (unfortunately I’m still looking for a way to properly test them; in particular, I’m not yet sure how I can check for the various hardware-specific extensions — also I have no idea how to test the SPARC-specific data, since my Ultra5 runs FreeBSD, not Solaris). Right at the moment I write this, Ruby-Elf can properly parse the capabilities section with its flags, and report them back. Hopefully with no mistakes, since only basic support is in the regression tests for now.

One thing I really want to implement in Ruby-Elf is versioning support, with the same API I’m currently using for GNU-style symbol versioning. This way it’ll be possible for Ruby-Elf-based tools to access both GNU and Sun versioning information as if it were a single thing. Too bad I haven’t looked up yet how to generate ELF files with Sun-style versioning support. Oh well, it’ll be one more thing I’ll have to learn. Together with a way to set visibility with Sun Studio, to test the extended visibility support they have in their ELF extensions.

In general, I think my decision to go with Ruby for this is very positive, mostly because it makes it much easier to support new stuff by just writing an extra class and hooking it up, without needing “major surgery” every time. It’s easy and quick to implement new stuff and new functions, even if the tools will require more time and more power to access the data (but with the recent changes I made to properly support OS-specific sections, I think Ruby-Elf is now much faster than it was before, and uses much less memory, as only the sections actually used are loaded). Maybe one day, once I consider it good enough, I’ll try to port it to some compiled language, using the Ruby version as a reference, but I don’t think it’s worth the hassle.

Anyway, if you’re interested in Ruby-Elf and would like to see it improve even further, so that it can report further optimisations and similar things (like, for instance, something I planned from the start: telling which shared objects listed in NEEDED lines are actually useless, without having to load the file through ld.so and the LD_* variables), I can ask you one thing and one thing only: a copy of Linkers and Loaders that I can consult. I tried preparing a copy out of the original freely available HTML files for the Reader, but it was quite nasty to look at, nastier than O’Reilly’s freely-available eBooks (which are bad already). It’s on my wishlist if you want.

Supporting more than one compiler

As I’ve written before, I’ve been working on FFmpeg to make it build with the Sun Studio Express compiler, under Linux and then under Solaris. Quite sincerely, while supporting multiple (free) operating systems, even niche Unixes (as Lennart likes to call them), is one of the things I spend a lot of time on, I have little reason to support multiple compilers. FFmpeg, on the other hand, tends to support compilers like the Intel C Compiler (probably because it sometimes produces better code than the GNU compiler, especially when it comes to MMX/SSE code — on the other hand it lacks some basic optimisations), so I decided to make sure I don’t create regressions when I do my magic.

Right now I have five different build trees for FFmpeg: three for Linux (GCC 4.3, ICC, Sun Studio Express), two for Solaris (GCC 4.2 and Sun Studio Express). Unfortunately, the only two trees that build entirely correctly are GCC and ICC under Linux. GCC under Solaris still needs fixes that are not available upstream yet, while Sun Studio Express has some problems with libdl under Linux (and I think the same applies to Solaris), and explodes entirely under Solaris.

While ICC still gives me some problems, Sun Studio has been giving me the worst headache since I started this task.

While Sun seems to strive for GCC compatibility, there are quite a few bugs in their compiler, like -shared not really being the same as -G (although the help output states so). Up to now the funniest bug (or at least the most absurdly idiotic behaviour) has been the way the compiler handles libdl under Linux. If a program uses the dlopen() function, sunc99 decides it’s better to silently link it to libdl, so that the build succeeds (while both icc and gcc fail, since there is an undefined symbol); but if you’re building a shared object (a library) that also uses the function, that is not linked against libdl. It reminded me of FreeBSD’s handling of -pthread (it links the threading library into executables but not into shared objects), and I guess it is done for the same reason (multiple implementations, maybe in the past). Unfortunately, since it’s done this way, configure will detect dlopen() as not requiring any library, but then later on libavformat will fail to build (if vhook or any of the external-library-loading codecs are enabled).
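To give an idea of how the misdetection happens, this is a reconstruction of mine of the kind of check a configure script runs (not the actual FFmpeg test), with the behaviour I described above noted in the comments:

/* conftest.c: does dlopen() need an extra library?
 *
 *   sunc99 conftest.c -o conftest
 *       links fine: sunc99 silently adds libdl for the executable,
 *       so configure concludes that no extra library is needed;
 *   sunc99 -Kpic -G conftest.c -o libconftest.so
 *       no libdl is added here, so the shared object is left with an
 *       undefined dlopen, and the real libraries fail later in the build.
 */
#include <dlfcn.h>

void *try_dlopen(const char *name) {
  return dlopen(name, RTLD_NOW);
}

int main(void) {
  return try_dlopen("libfoo.so") != NULL ? 0 : 1;
}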

I thus reported those two problems to Sun, although there are a few more that, touching some grey areas (in particular C99 inline functions), I’m not sure whether to treat as Sun bugs or not. This includes, for instance, the fact that static (C99) inline functions are emitted in object files even if not used (with their undefined symbols following them, causing quite a bit of a problem for linking).
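The kind of construct involved looks more or less like this (a sketch of mine, with a made-up external function name, not code from FFmpeg):

/* If the compiler emits log_debug() into the object file even though
 * nothing in this translation unit uses it, the object also carries an
 * undefined reference to av_log_internal() (a made-up name here), which
 * then has to be satisfied at link time for no good reason. */
void av_log_internal(const char *fmt, ...);   /* hypothetical external symbol */

static inline void log_debug(const char *msg) {
  av_log_internal("%s\n", msg);
}

int unrelated(void) {   /* no caller of log_debug() anywhere in this file */
  return 42;
}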

The one thing for which I find non-GCC compilers useful is taking a look at their warnings. While GCC is getting better at them, there are quite a few that are missing; both Sun Studio and ICC are much stricter with what they accept, and raise lots of warnings for things that GCC simply ignores (at least by default). For instance, ICC throws a lot of warnings about mixing enumerated types (enums) with other types (enumerated or integer), which gets quite interesting in some cases — in theory, I think the compiler should be able to better optimise variables if it knows they can only assume a reduced range of values. Also, Sun Studio, ICC, and the Borland and Microsoft compilers all warn when there is unreachable code in the sources; recently I discovered that GCC, while supporting that warning, disables it by default both with -Wall and -Wextra, to avoid false positives with debug code.

Unfortunately, not even with the three of them combined am I getting the warnings I was used to from Borland’s compiler. It would be very nice if CodeGear decided to release a Unix-style compiler for Linux (their command-line bcc for Windows has a syntax that autotools don’t accept; one would have to write a wrapper to get those to work). They already released free-as-in-soda compilers for Windows; it would be a nice addition to have a compiler based upon Borland’s experience under Linux, even if it were proprietary.

On the other hand, I wonder if Sun will ever open the sources of Sun Studio; they have been opening so many things that it wouldn’t be so impossible for them to open their compiler too. Even if they decided to go with CDDL (which would make it incompatible with GCC license), it could be a good way to learn more things about the way they build their code (and it might be especially useful for UltraSPARC). I guess we’ll have to wait and see about that.

It’s also quite sad that there isn’t any alternative open source compiler focusing, for instance, on issuing warnings rather than on optimising stuff away (although it’s true that most warnings do come out of optimisation passes).

So, what am I doing with OpenSolaris?

I’ve written more than once in the past weeks about my messing with OpenSolaris, but I haven’t explained very well why I’m doing that, and what exactly it is that I’m doing.

So the first thing I have to say is that since I started getting involved in lscube, I focused on getting the build system in shape so that it could be built more easily, especially with out-of-tree builds, which is what I usually do since I might have to try the build with multiple compilers (say hi to the Intel C Compiler and Sun Studio Express). But until now I had only tested it under Linux, which is quite a limitation.

While FreeBSD is tremendously reducing the gap it had with GNU/Linux (and I mean that in full here, since I intend Linux and glibc together), OpenSolaris has quite a few differences from it, which makes it an ideal candidate to check for possible GNUisms creeping into the codebase. Having the Sun Studio compiler available too makes it also much simpler to test with non-GCC compilers.

Since the OpenSolaris package manager sucks, I installed Gentoo Prefix, and moved all the tools I needed, including GCC and binutils, into that. This made it much easier to deal with installing the needed libraries and tools for the projects, although some needed some tweaking too. Unfortunately there seems to be a bug with GNU ld from binutils, but I’ll have to check whether it’s present in the default binutils version as well or if it’s just Gentoo patching something wrong.

While using OpenSolaris I think I launched quite a few nasty Etruscan curses toward some Sun developers for some debatable choices, the first being, as I’ve already extensively written about, the package manager. But there have been quite a few other issues with libraries and include files, and with the compiler itself.

Since feng requires FFmpeg to build, I’ve also spent quite a lot of time trying to get FFmpeg to build on OpenSolaris, first with GCC, then with Sun Studio, then again with GCC and a workaround involving PIC: the bug I noted above with binutils is that GNU ld doesn’t seem to be able to create a shared object out of objects not compiled with PIC, so -fPIC has to be forced on for them to build; otherwise the undefined symbols for some functions, like htonl(), become absolute symbols with value 0, which causes obvious linking errors.

Since I’ve been using OpenSolaris from a VirtualBox virtual machine (which is quite slow even though I’m using it without GNOME, using SSH to log in and jumping inside the Gentoo Prefix installation right away), I ended up first trying to build FFmpeg with the Sun Studio compiler taken from Donnie’s overlay under Linux, with Yamato building with 16 parallel processes. The problem here is that the Sun Studio compiler is quite a moving target, to the point that a Sun employee, Roman Shaposhnik, suggested on ffmpeg-devel that I try Sun Studio Express (which is, after all, what OpenSolaris has too), which should be more similar to GCC than the old Sun Studio 10 was. This is why dev-lang/sunstudioexpress is in Portage, if you didn’t guess it earlier.

Unfortunately, even with the latest version of the Sun Studio compiler, building FFmpeg has been quite some trouble. I ended up fighting quite a bit with the configure script, and not only with that; but luckily, most of the patches I have written have now been sent to ffmpeg-devel (some of them accepted, others I’ll have to rewrite or improve depending on what the FFmpeg developers think of them). The amount of work needed just to get one dependency to work is probably draining the advantage I had in using Gentoo Prefix for those dependencies that work out of the box on OpenSolaris.

(I’ll probably write about the FFmpeg changes more extensively, as they deserve a blog entry of their own; the drafts for the blog entries I have to write are starting to pile up just as much as the entries in my TODO list.)

While using OpenSolaris I also started understanding why so many people hate Solaris this much; a lot of commands spit out errors that don’t let you know at all what the real problem is. For instance, if I try to mount an NFS filesystem with a simple mount nfs://yamato.local/var/portage /imports/portage, I get an error telling me that the NFS path is invalid; the actual error here is that I need to add -o vers=2 to request NFSv2 (why it doesn’t seem to work with v3 is something I didn’t want to investigate just yet). Also, the OpenSolaris version I’m using, albeit described as a “Developer Edition”, lacks a lot of man pages for the library functions (although I admit that most of those which are present are very clear).

In addition to the porting I’ve written about, I’ve also taken the time to extend the testsuite of my ruby-elf, so that Solaris ELF files are better supported; it is interesting to note that the elf.h file from OpenSolaris contains quite a few more definitions about this. I haven’t yet looked at the man pages to see whether Sun provides any description of the Sun-specific sections, for which I’d also like to add further parser classes. It has been interesting, since neither the GNU nor the Sun linker sets the ELF ABI property to values different from SysV (even though both Linux and Sun/Solaris have an ABI value defined in the standard), and GNU and Sun sections have some overlapping values (like the sections used for symbol versioning: glibc and Solaris have different ways to handle those, but the section type ID used for both is the same; the only way to discern between the two is the section name).

In the end, to resolve the problem, I modified ruby-elf not to load all the sections at once, but just on request, so that by the time most sections are loaded, the string table containing the sections’ names is available. This makes it possible to know the name of a section, and thus to discern the extended sections by name rather than by ABI. Regression tests have been added so that the sections are loaded properly for different ELF types too. Unfortunately I haven’t been able to produce a static executable on Solaris with either the Sun Studio compiler or GCC, so the only tests for Solaris ELF executables are for dynamic executables. Nonetheless, the testsuite for ruby-elf (which is the only part of it to take up space: out of 3.0MB of space occupied by ruby-elf, 2.8MB are for the tests) has reached 72 different tests and 500 assertions!

OpenSolaris Granularity

If you have been following my blog for a long time, you know I already had to fight a couple of times with Solaris and VirtualBox to get a working Solaris virtual machine to test xine-lib and other software on.

I tried again yesterday to get one working: since Innotek was bought by Sun, VirtualBox support for Solaris has improved notably, to the point that they now emulate a different network card by default, one that works with Solaris (that had been the long-standing problem).

So I was able to install OpenSolaris, and thanks to Project Indiana I was able to check which packages were installed, to remove stuff I don’t need and add what I did need. Unfortunately I think the default granularity is a bit concerning. Compiz on a virtual machine?

The first thing I noticed is that updating a newly-installed system from the last released media requires downloading almost the size of the whole disk in updates: the disk is a simple 650MB CD image, and the updates were over 500MB. I suppose this is to be expected, but at that point, why not point to some updated media by default, considering that updating is far from trivial? Somehow I was unable to perform the update properly with the GUI package manager, and I had to use the command-line tools.

Also, removing superfluous packages is not an easy task, since the dependency tracking is not exactly the best out there: it’s not strange for a set of packages not to be removed because some of them are dependencies… of others in the very set being removed (this usually seems to be due to plug-ins; even after removing the plugins, it would still cache the broken dependency and disallow me from removing the packages).

And that’s not all, of course; for instance, finding the basic development tools in their package manager is a problem of its own: while looking for “automake” will find a package named SUNWgnu-automake, looking for “autoconf” will find nothing, since the package is called SUNWaconf. I still haven’t been able to find pkg-config, although the system installs .pc files just fine.

I guess my best bet would be to remove almost everything from their own package manager and try prefixed Portage instead, but I just haven’t had the will to look into that yet. I hope it would also help with the version of GCC that Sun provides (3.4.3).

I got interested in Solaris again since, after a merge of Firefox 3.0.2, I noticed cowstats throwing an error on an object file, and following up on that, I found out a couple of things:

  • cowstats didn’t manage unknown sections very well;
  • Firefox ships with some testcases for the Google-developed crash handler;
  • one of these testcases is an ELF ET_EXEC file (with a .o extension) built for Solaris, which reports a standard SysV ABI (rather than a Sun-specific one) but still contains Sun-specific sections;
  • readelf from binutils is not as solid as its counterpart from elfutils.

Now cowstats should handle these corner cases pretty well, but I want to enrich my testcases with some Solaris objects. Interestingly enough, in ruby-elf probably 80% of the size of an eventual tarball would be taken up by test data rather than actual code. I guess this is a side effect of TDD, but also exactly why TDD-based code is usually more solid (every time I find an error of mine in ruby-elf, I tend to write a test for it).

Anyway, bottom line: I think Project Indiana would have done better by adapting RPM to their needs rather than inventing the package manager they invented, since it doesn’t seem to have any feature that Fedora lacks, while it lacks quite a few other things.

Sub-optimal optimisations?

While writing Implications of pure and constant functions I’ve been testing some code that I was expecting to be optimised by GCC. I was surprised to find that a lot of my testcases were not optimised at all.

I’m sincerely not sure whether these are due to errors in GCC, to me expecting the compiler to be smarter than it can feasibly be right now, or to the “optimised” code being more expensive than the code that is actually being generated.

Take for instance this code:

int somepurefunction(char *str, int n)
  __attribute__((pure));

#define NUMTYPE1 12
#define NUMTYPE2 15
#define NUMTYPE3 12

int testfunction(char *param, int type) {
  switch(type) {
  case 1:
    return somepurefunction(param, NUMTYPE1);
  case 2:
    return somepurefunction(param, NUMTYPE2);
  case 3:
    return somepurefunction(param, NUMTYPE3);
  }

  return -1;
}

I was expecting the compiler to identify cases 1 and 3 as identical (by coincidence) and then merge them into a single branch. This would actually make debugging quite hard (as you wouldn’t be able to discern the two cases), but it’s a nice reduction in code, I think. Neither on x86_64 nor on Blackfin, with neither GCC 4.2 nor 4.3, are the two cases actually merged; the duplicated code is left in there.
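In other words, I was expecting the generated code to be equivalent to what you’d get by merging the branches by hand, something like this (same declarations as above):

int somepurefunction(char *str, int n)
  __attribute__((pure));

#define NUMTYPE1 12
#define NUMTYPE2 15
#define NUMTYPE3 12

int testfunction(char *param, int type) {
  switch(type) {
  case 1:
  case 3: /* NUMTYPE1 and NUMTYPE3 happen to share the same value */
    return somepurefunction(param, NUMTYPE1);
  case 2:
    return somepurefunction(param, NUMTYPE2);
  }

  return -1;
}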

Another piece of code that wasn’t optimised as I was expecting it to be is this:

unsigned long my_strlen(const char *str)
  __attribute__((pure));
char *strlcpy(char *dst, const char *str, unsigned long len);

char title[20];
#define TITLE_CODE 1
char artist[20];
#define ARTIST_CODE 2

#define MIN(a, b) ( a < b ? a : b )

static void set_title(const char *str) {
  strlcpy(title, str, MIN(sizeof(title), my_strlen(str)));
}

static void set_artist(const char *str) {
  strlcpy(artist, str, MIN(sizeof(artist), my_strlen(str)));
}

int set_metadata(const char *str, int code) {
  switch(code) {
  case TITLE_CODE:
    set_title(str);
    break;
  case ARTIST_CODE:
    set_artist(str);
    break;
  default:
    return -1;
  }

  return 0;
}

I was expecting here a single call to my_strlen(), as it’s a pure function, and in both branches it’s the first call. I know it’s probably complex code once unrolled, but still, GCC at least was better at this than Intel’s and Sun’s compilers!

Both Intel’s and Sun’s compilers, even at the -O3 level, emit four calls to my_strlen(), as they can’t even optimise the ternary operation! Actually, Sun’s compiler comes in last for optimisation, as it doesn’t even inline set_title() and set_artist().

Now, I haven’t tried IBM’s PowerPC compiler, as I don’t have a PowerPC box to develop on anymore (although I would think a bit about the YDL PowerStation, given enough job income in the next months — and given Gentoo being able to run on it), so I can’t say anything about that; but for these smaller cases, I think GCC is beating the proprietary compilers under Linux.

I could check Microsoft’s and CodeGear’s (formerly Borland’s) compilers, but that was a bit out of my particular scope at the moment.

While I did think a bit, before, about supporting non-GNU compilers for stuff like xine and unieject, I’m starting to think it’s not really worth the time spent on that at all, if this is the result of their compilations…