For more context on this entry, please refer to the previous one, which talks about arrays of strings and PIC.
As I said in the other post, prelink can reduce the number of dirty RSS pages caused by COW (Copy-on-Write) of PIC code. Prelink assigns every library a predefined load address, either truly unique or unique within the set of programs known to load that library. Since the loader no longer has to change the addresses loaded from the file, no COW is triggered at load time, and the same pages can be shared among many processes. This is how prelink saves (sometimes a lot of) memory.
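To see the load addresses the dynamic loader actually picked, you can peek at `/proc/self/maps` on Linux (a quick sketch; paths and addresses vary by system). Without prelink the addresses are typically chosen, and randomized, at load time; prelink's whole point is to make them stable so no fixup is needed:

```shell
#!/bin/sh
# Show where the C library got mapped for this very process (the grep itself).
# On a prelinked system these addresses match prelink's precomputed ones and
# stay identical across runs; otherwise they change from run to run.
grep 'lib' /proc/self/maps

# Run this twice and compare the first field (the start address) to see
# whether your system uses stable or randomized load addresses.
```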
Unfortunately there is one big problem, especially with modern software architectures: many programs now load plugins at runtime to provide functionality; the whole KDE architecture is based on this (KDE 4 included), as are xine and many, many others.
The problem is that prelink can't take plugins into account, as it doesn't know about them. For instance, it can't tell that amarok is able to load the xine engine, so amarokapp is not going to be prelinked against libxine.so. Likewise, it can't tell that libxine.so is able to load xineplug_decode_ff.so, which in turn depends on libavcodec.so. This means that, for instance when using the -m switch, it could assign libqt3-mt.so and libavcodec.so the same address, causing a performance hit rather than an improvement at runtime, when the linker has to relocate all the code of libavcodec.so, triggering a COW.
The same is true for almost all scripting languages that use C-compiled extensions (Perl, Ruby, Python): you can't tell which extensions the interpreter may load just by looking at its ELF file, which is all prelink does.
A possible way around this is to declare post-facto, that is after compilation, which shared objects the program can load. It could probably be done through a special .note section in the ELF file and a modified prelink, but I'm afraid it would be quite difficult to implement properly, especially in ebuilds. On the other hand, it might give quite a performance improvement: as I said, today's software architectures are often based on on-demand loading of code through plugins, so it could be quite interesting.
But as I understand it, having true PIC/PIE code has no or negligible impact on x86-64, so there is no real need for relocation…
It has a negligible impact on speed, mostly because the CPU has a nicer way to handle indirect references, and, more importantly, because x86-64 has enough registers that dedicating one to PIC does not hurt much. On x86, using PIC means sacrificing ebx for addressing, which is no small thing, as there are just a handful of registers on x86. But the extra memory pages are used on all architectures in these cases, as the relocation _has_ to be done. You might be confusing generic relocations with text relocations: text relocations are changes to the code sections (.text) which cause a COW of the whole machine-code section; those are bad, and happen when not using PIC. In fact, using PIC for libraries makes it possible to share more memory pages and thus use less memory in general. The big performance hit of PIC on x86 is due to the ebx register being used up.
Who cares about x86? It's on its way out. I mean, it still deserves some attention, but anything that requires bigger infrastructure changes is not worth it, IMHO.
The limitation I've shown here has nothing to do with x86; it's perfectly valid on AMD64 (which, incidentally, is what I use myself). And avoiding COW is a _great_ help on modern AMD64 too, without even starting to think about embedded systems.
What do you need COW for with (PC-)relative code on x86-64?
Not for the code, but for the .data.rel.ro sections; see “this entry”:https://blog.flameeyes.eu/2… for more information about that.