Not all failures are caused equal

Ryan seems to think me inconsistent for promoting Portage failures when _FORTIFY_SOURCE points out a sure fortification error. The related bug is getting more useful now than it was when it was started, but it also shows some misunderstanding of my position regarding breakages and QA in general.

So first of all, let me repeat (because I have said it before) that I prefer build-time failures to run-time failures. Why? It’s not difficult to understand when you think about it; it covers at least my three different use cases:

In all these three cases, I’m bothered by build failures, but they aren’t much of a problem: if I’m building something in a pinch, it is a nuisance; in the other cases, an upgrade or a new build, I usually have time to look up what is broken before it becomes a real problem. If any package is failing at runtime, though, it’s much worse than a nuisance. If Apache dies, then I have to downgrade and go rebuild another one quickly; if the router’s SSH daemon crashes, I have to go downstairs with the laptop and access the serial console; and if at that moment the picocom tool I use to access the serial console doesn’t start, then I’m simply going to cry.

So with this situation at hand, can you blame me for preferring stricter compilers and build managers (don’t forget that Portage and its cousins are not just package managers à la APT, but also build managers à la FreeBSD Ports) that actually disallow installation of broken code? This is the same reason why we added --no-undefined to Ruby: rather than hiding the problems under the hood and leaving the car to melt down on failure, we notice the leak beforehand, and disallow it from ever leaving the garage!

As Mike pointed out, and as I exemplified on the bug linked above, the code can look perfectly fine from a bird’s-eye view and still rely on a number of assumptions about how the compiler and the C library work; if the compiler is reporting a sure failure, it is going to fail. This is the same reason why, with GCC 4.5, we had a runtime breakage of GNU tar, as it tried to fill two adjacent strings in a structure with a single copy, rather than two. It wasn’t going to create an actual overflow problem, but it was stepping over the final character of the first string; not only did the compiler warn about it but… the C library aborted at runtime!
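To make the tar case more concrete, here is a minimal sketch of the pattern. It is not tar’s actual code: the struct, the field widths, the string contents and the function name are all made up for illustration. The single-copy shortcut is left as a comment, and what gets compiled is the two-copy form that a per-field size check has no reason to object to.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: two fixed-width fields that sit back to back. */
struct header {
    char magic[6];
    char version[2];
};

/* The one-copy shortcut only "works" because the fields are adjacent. */
_Static_assert(offsetof(struct header, version) == sizeof(((struct header *)0)->magic),
               "magic and version are expected to be adjacent");

/*
 * The shortcut itself, left as a comment on purpose:
 *
 *     strcpy(h->magic, "ustar  ");    // 8 bytes aimed at a 6-byte field
 *
 * Every byte would land inside the struct, but a checker that only looks
 * at the destination field sees a write past its end.
 */

/* Same resulting bytes, written as one bounded copy per field. */
static void fill_header(struct header *h)
{
    assert(h != NULL);
    memcpy(h->magic, "ustar ", sizeof h->magic);    /* six characters, no NUL */
    memcpy(h->version, " ", sizeof h->version);     /* a space and the NUL */
}
```

Called on any struct header, fill_header() leaves the same eight bytes the single strcpy() would have written, and each copy stays within the field it names.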

Now you could argue that this means they are not discerning between hacky-but-good solutions to improve performance and security-vulnerable code. But in truth, you have to take a stand at some point, and it makes total sense to be as safe as possible, especially in today’s world where security should be that much of a concern and performance can be easily improved with faster hardware.

This should cover my reasons for being in favour of dying when the code provides a path where the C library is going to abort at runtime (and it’s not properly controlled). What about my criticising the unmasking of glibc and other software before, when they caused failures? Well, that is twofold as well. On one side, glibc 2.12 does not only cause build-time failures; as I explained, there are enough indications that software could abort at runtime when a header is missing from the sources. But whatever the problem is, let’s look at it from the perspective of need: what is the reason for this failure to be introduced? Simply to clean up the code and make it faster to build (fewer headers means fewer files to load and parse), so you force users to include what they need rather than a long chain of dependent headers. It’s a laudable target, but mostly one of optimisation and enhancement.
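As an illustration of the “include what you need” side, the sketch below assumes a source file that uses S_ISDIR and names the header that defines it, instead of counting on some other header to drag it in. The helper name is hypothetical, and the example makes no claim about which headers glibc 2.12 actually stopped chaining. If the defining header were missing, an older or permissive compiler would treat S_ISDIR(...) as a call to an undeclared function, and the failure would then surface at link time, or at run time if the code ends up in a shared object built with undefined symbols allowed.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>   /* stat(), struct stat, S_ISDIR: named explicitly */

/* Hypothetical helper: returns 1 if path is a directory, 0 otherwise. */
static int is_directory(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return 0;               /* missing or unreadable: not a directory */
    return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

With the include spelled out, the file builds the same way whether or not another header happens to pull in sys/stat.h.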

On the other hand, fortification features are used to increase security by mitigating possible vulnerabilities in the software design. This wide difference in the root reason for the breakage is, alone, what makes them different in my personal assessment of the problem. And because of this, I think there is a case for Portage aborting right away in ~arch for packages with known overflows, even though I maintain that glibc 2.12 and similar problems shouldn’t have been let loose on the users. The former does not make it more broken, it makes it safer!

And tomorrow, “Is fixing implicit declarations for me?”
