Fellow developer Zhang Le wrote about the new preserve-libs feature from Portage 2.2 that removes the need for revdep-rebuild.
As I wrote on the gentoo-dev mailing list when Marius asked for comments, there are a few problems with its implementation as it is, in my not-so-humble opinion. (Not-so-humble because I know exactly what I’m talking about, and I know it’s a problem.)
Let’s take a common scenario on a system where --as-needed was never used: updating a common library from ABI 0 to ABI 1 (so with a change of soname). This library might be, for instance, libexpat.
I don’t want to discuss here what an ABI is and what an ABI bump consists of. Let’s just say that when you make an ABI bump you either remove functions, or you change the meaning of some functions (like the parameters, the behaviour or other things like those).
In the first case the bump is annoying but not much of a problem: executables stop being loaded because symbols are undefined. With lazy binding, they might die in the middle of a run at the moment an undefined symbol is called, but that’s not our concern here.
The problem comes when a function with the exact same name changes meaning, parameters or return type. In this case, the executable might pass too much or too little data to the function, and the pointers might refer to something completely different, or might be truncated. In general, when you change the interface or the meaning of a function, executables built against the previous version and run with the new one will either crash or silently misbehave. Both are subtle issues that we should be looking out for, as they are hard to debug unless you know about them.
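To make the "same name, different meaning" case concrete, here is a toy model in Python. It is only an analogy for what happens at the binary level, not real linking: the function foo_set and both of its signatures are made up for the illustration. The caller lays out its arguments the way ABI 0 expects, and the callee reads the same bytes the way ABI 1 expects.

```python
import struct

# Hypothetical ABI 0 signature: foo_set(int32 width, int32 height)
def caller_built_for_abi0(width, height):
    """Lay out the arguments the way a caller compiled against ABI 0 would."""
    return struct.pack("<ii", width, height)

# Hypothetical ABI 1 signature: foo_set(int64 area), same name, new meaning
def callee_from_abi1(raw):
    """Read the very same bytes the way the ABI 1 implementation would."""
    (area,) = struct.unpack("<q", raw)
    return area

# The call "works" (nothing fails to load), but the value is garbage:
# the two 32-bit arguments are read back as a single 64-bit number.
misread = callee_from_abi1(caller_built_for_abi0(1, 2))
print(misread)
```

Nothing raises an error here, which is exactly the point: the mismatch shows up as a wrong value rather than as a program that refuses to start.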
So let’s return to our library changing ABI. Let’s say we have libfooA.so.0 and libfooA.so.1 installed: the first is preserved by preserve-libs, the second is the new one. libfooB.so.$anything links to libfooA as it uses it directly, so it will be in the set of packages to rebuild.
Introducing libfooC.so.$anything, which links to libfooB.so.$anything but, as destiny wishes, is also using libfooA.
At this point, before the ABI bump, we have libfooB depending on libfooA.so.0, and libfooC depending on libfooB and libfooA.so.0; after the bump, we decide to rebuild only libfooB, which means that libfooB now depends on libfooA.so.1 while libfooC is still depending on libfooA.so.0.
What this means is that, absent symbol versioning, the same symbol would have two (probably different) definitions, which will collide with each other, leading to subtle crashes, misbehaviour and other fun-to-debug problems.
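The situation can be sketched as a small dependency walk. This is a minimal sketch, not anything Portage does: the NEEDED table is a hand-written stand-in for the direct link dependencies in the example (with libfooB and libfooC written without their own soname versions for brevity), and the function names are made up. It collects everything that gets pulled in when libfooC is loaded and reports any library that shows up with more than one soname version.

```python
from collections import defaultdict

# Hand-written stand-in for the direct link dependencies after the
# partial rebuild: libfooB was rebuilt, libfooC was not.
NEEDED = {
    "libfooC": ["libfooB", "libfooA.so.0"],
    "libfooB": ["libfooA.so.1"],
    "libfooA.so.0": [],
    "libfooA.so.1": [],
}

def loaded_closure(root, needed):
    """Return every object pulled in, directly or not, when `root` is loaded."""
    seen, stack = set(), [root]
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(needed.get(obj, []))
    return seen

def soname_conflicts(root, needed):
    """Map library base name -> soname versions, where more than one is loaded."""
    by_base = defaultdict(set)
    for obj in loaded_closure(root, needed):
        base, sep, version = obj.partition(".so.")
        if sep:
            by_base[base].add(version)
    return {base: sorted(v) for base, v in by_base.items() if len(v) > 1}

conflicts = soname_conflicts("libfooC", NEEDED)
print(conflicts)
```

Loading libfooB alone is clean, since it only ever sees ABI 1; it is only through libfooC that both versions end up side by side.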
The problem is that the two ABIs of the library are both being loaded into the same process, which is a very bad thing, unless the symbols are versioned. On the other hand, symbol versioning is a bit of a mess, it’s not implemented by all operating systems, and I find it quite convoluted.
At the moment I don’t see anything in Portage that stops you from shooting yourself in the foot by doing a partial rebuild. I hope I’m mistaken, but if I’m not, please remember to always do a full rebuild, rather than a partial one. Otherwise, instead of programs not starting, you might have programs corrupting your data.
Wouldn’t virtually every user perform a full rebuild anyway? At which point the old library is removed and if anything was missed then the problem would be more obvious. I think the new behaviour in portage 2.2 is certainly a step in the right direction where we will hopefully have a lot less breakage resulting in crippled systems. At least we can then rebuild at our leisure instead of the night before giving a talk! It is good to discuss these issues, I am no expert in linking but I consider the new behaviour to be very beneficial.
A possible thing to do would be to move the preserved libs to a separate directory (which would still be in /etc/ld.so.conf) instead of the standard /lib and /usr/lib, so that we are sure that the linker would never link to them. FreeBSD’s ports does it by moving them to /usr/local/lib/compat/pkg, and it works quite well (at least I have never had a problem with it so far).
FreeBSD breaks ABI on every mini-change, which creates a HUUUUGE amount of libraries that have compatible interfaces but incompatible sonames. So I wouldn’t count them as a good example; the problems might not happen because the libraries _are_ compatible.
Besides, it’s not a problem of the linker using the old libraries, but a problem of two libraries being linked in beforehand.
Marcus, I _hope_ people do their full rebuild immediately. But I know Murphy.
But yeah I agree it’s a step forward anyway 🙂
How am I going to do a full rebuild? Which facility of Portage 2.2 informs me that a library had a soname bump, so that I need to rebuild it? I can no longer rely on revdep-rebuild for that, since the linking isn’t actually broken.
In other words, my immediate reaction will be to do nothing at all. After all, everything still works.
Then come the incremental updates. Portage doesn’t do deep updating by default, so my next emerge -u world will upgrade only packages directly mentioned in the world file, and dependencies if a higher version is required.
This means that it is very likely that programA (after the update) links directly against libraryB.so.2 and indirectly to libraryB.so.1 through libraryC, which wasn’t updated because programA’s new version doesn’t depend on a higher version of libraryC than the old version.
Also, if I must do a full rebuild on upgrading a library, what have I gained? OK, so a package might keep working until the complete rebuild is over, whereas without preserve-libs it will break during the update process. Big deal. The expat debacle was pretty much the only place where that mattered, and this new feature is the wrong solution to the problem.
So basically, I consider this all a mistake. It doesn’t remove the problem (a full rebuild is needed or things will break), it makes the obvious solution harder (revdep-rebuild won’t work anymore), and it makes the problem more subtle (things will break possibly weeks after the update, not immediately).
The expat problem should have been solved by being able to mark a library as build-critical. Libraries marked as such would force an automatic revdep-rebuild after a soname update, done by Portage itself, before continuing merges. Because, since kdeconfig and qtconfig at least, and I think even pkgconfig, depend on expat, the upgrade broke all builds anyway, spreading chaos.
@Sebastian: With portage-2.2 you get bugged after every upgrade and some other cases with something like this:
!!! existing preserved libs:
>>> package: media-libs/mesa-7.1_rc1
 * - /usr/lib64/libGLU.so.1.3
 * - /usr/lib64/libGLU.so.1.3.070003
Use emerge @preserved-rebuild to rebuild packages using these libraries
So that’s your facility of informing you.
The full rebuild is done through @preserved-rebuild, which is what Portage actually tells you to do after a bump. What I agree with, though, is that Portage should consider doing the rebuild by itself by default, even if this turns out to be non-deterministic.
So my instant reaction shouldn’t be for example “mv /usr/lib64/libGLU.so.1.3 /usr/lib64/old/* && revdep-rebuild” ? )
You could as well remove it as what you’re doing makes little to no sense…
It would be nice to have everything working all the time and in a safe way. But as I see it, there are only two ways to achieve that.
1. Symbol versioning, but as Flameeyes wrote: “On the other hand, symbol versioning is a bit of a mess, it’s not implemented by all operating systems, and I find it quite convoluted.”
2. After emerging libfooA.so.1 you have to rebuild everything depending on libfooA.so.0 and install it in one atomic step. The reason is that rebuilding one package after another has no safe package ordering. On the other hand, I don’t know how to achieve real atomicity. Therefore it might be best first to rebuild everything depending on libfooA.so.0 without merging it, then to remove libfooA.so.0 (breaking the system for a short time, but preventing “subtle crashes, misbehaviour and other fun-to-debug problems”), and finally to merge the packages previously rebuilt.
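The difference between the two rebuild strategies in point 2 can be shown with a toy model. This is only a sketch under loud assumptions: the package names, the dictionary mapping each package to the libfooA ABI it was built against, and both function names are invented for the illustration, and none of it reflects how Portage actually merges packages. It counts how many intermediate installed states mix users of the old and the new ABI.

```python
def abis_in_use(installed):
    """Set of libfooA ABI versions that installed packages are built against."""
    return set(installed.values())

def rebuild_one_by_one(installed, new_abi):
    """Merge each package right after rebuilding it; count mixed-ABI states."""
    installed = dict(installed)
    mixed_states = 0
    for pkg in sorted(installed):
        installed[pkg] = new_abi  # rebuilt and merged immediately
        if len(abis_in_use(installed)) > 1:
            mixed_states += 1     # old and new ABI users coexist
    return installed, mixed_states

def rebuild_staged(installed, new_abi):
    """Build everything first without merging, then merge in a single step."""
    staged = {pkg: new_abi for pkg in installed}  # built, not yet merged
    # The old library would be removed here: the system is briefly broken,
    # but never in a state where both ABIs are in use.
    merged = dict(staged)
    mixed_states = 0 if len(abis_in_use(merged)) == 1 else 1
    return merged, mixed_states

# Hypothetical starting point: everything built against ABI 0.
installed = {"libfooB": 0, "libfooC": 0, "programA": 0}
_, mixed_sequential = rebuild_one_by_one(installed, 1)
_, mixed_staged = rebuild_staged(installed, 1)
print(mixed_sequential, mixed_staged)
```

With three packages, the one-by-one approach passes through mixed states whatever order is chosen, which is the "no safe package ordering" problem; the staged approach trades those for a short window in which the affected programs do not start at all.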
Grr, messy.
I have noticed inconsistencies like this in the past; one perfect example is enabling the unicode USE flag, which breaks vim, and revdep-rebuild doesn’t fix the problem. The only way to solve the problem has been emerge -e world.
On my server systems this is why I build everything in a chroot; whenever I do anything with Portage it is done there. When adding or updating packages, I perform the updates followed by 3x emerge -e world, then sync the packages to my other servers.