Is fixing implicit declarations for me?

As promised yesterday, today’s post is about implicit declarations. Let’s first give a bit of background about them, even though I have already partially described them while discussing GLIBC 2.12, since they seem to have increased with that release.

As the name already explains, implicit function declarations happen when you call a function without declaring it beforehand. This behaviour has classically been allowed by the C language, but it is not allowed by most other compiled languages, starting with C’s younger cousins such as C++. Since a declaration provides the specification of the return type and the parameters, how does that work? Simple: any function used without a declaration is implicitly declared as int func(...).

Is this a bad situation? You bet; as Mike pointed out, the main issue there is with the size of the parameters and, more importantly, of the return value. For instance, if the function is supposed to return a pointer, the implicit int return will truncate it on any 64-bit system; and today’s systems are mostly 64-bit! There is also the problem of the compiler being unable to validate the number and order of parameters. Most important, in my opinion, is that an implicit declaration can hide an undefined symbol, when a function (or macro) has been mistyped or removed from the libraries in use.

So if implicit declarations are bad, why are they not always disallowed, then? Well, they are in C++, as I said; they are not in C for various reasons; and while you can make the compiler refuse them with -Werror-implicit-function-declaration (or, on more modern compilers, -Werror=implicit-function-declaration), you cannot do that at the distribution level, because things like autoconf actually rely on them being allowed, ouch.
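
To make the failure mode concrete, here is a minimal sketch of what goes wrong (hypothetical code, not taken from any real package):

/* implicit.c: note the missing #include <stdlib.h> */
int main(void) {
  /* malloc() is never declared, so the compiler implicitly assumes it
     returns int; on an LP64 system the 64-bit pointer can get truncated
     to 32 bits, crashing at the first dereference. */
  char *p = malloc(32);  /* warning: implicit declaration of function 'malloc' */
  p[0] = 'x';
  return 0;
}

Building this with gcc -Werror=implicit-function-declaration makes the warning fatal, which is exactly the kind of strictness the autoconf checks mentioned above get in the way of.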

So, in this situation, should Gentoo developers care about fixing these warnings themselves? Yes and no. On one side, fixing them adds a tighter safety net to the code, and that is something we can always use; on the other side, they tend not to be much of a problem if the package works at the end of the day. While we love to make things better, we should stick with upstream as much as possible; diverging, even just for this kind of stuff, is never good.

But there is something more to consider here; as I wrote twice already, we should care more about fortified functions. Fortified functions are there to make our programs more secure, by making sure we don’t have buffer overflows that could lead to security issues further down the road. I have also noted that since version 4.3.3-r1 of GCC in Gentoo, all you have to do to enable fortified sources is to use -O2 (fortified functions are enabled only if function inlining is enabled, otherwise they are ignored). Actually, it’s a bit more complex than that.

The fortified functions carry a _chk suffix and are separate symbols from the “standard” functions; this means that, to get the fortified behaviour, your object file should not reference strncpy but rather __strncpy_chk. The translation between these two names is done by the header files, when optimisation and inline functions are enabled. This, though, only works if the header file where the function is declared is actually included. Leaving the declaration implicit thus also leaves you uncovered by both the build-time and run-time checks of fortified sources, which, let me tell you, is bad news.
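
As a minimal sketch of the mechanism (my own example, assuming a toolchain where _FORTIFY_SOURCE is enabled by default, as on Gentoo since GCC 4.3.3-r1):

#include <string.h>  /* remove this include and __strncpy_chk is never used */

int main(void) {
  char buf[8];
  /* With -O2 and fortified sources active, <string.h> maps this call
     onto __strncpy_chk(), passing __builtin_object_size(buf, 1) as the
     known destination size; since 16 > 8, GCC warns at build time and
     the C library aborts at run time. Without the include, the plain
     strncpy symbol is referenced and no check ever runs. */
  strncpy(buf, "definitely longer than eight bytes", 16);
  return buf[0];
}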

But this should also act as a warning: fixing implicit declarations is not just a build-time fix! It might even change the behaviour of the program, adding possible aborts at runtime when the code actually has overflows that would otherwise not be fatal. Is that a bad thing? Not really, in my opinion, but others might disagree. In any case, it’s enough of a change to warrant a revision bump, for me. Of course, the easiest way to avoid that is … not having implicit declarations the first time you add a package to the tree, or when you version bump it.

I guess I’ll have to double-check my own packages, and possibly ask Zac again to allow for per-package features, so that I can make sure this kind of stuff never makes it into my final commits.

Not all failures are caused equal

Ryan seems to think me inconsistent for promoting Portage failures when _FORTIFY_SOURCE points out a sure fortification error. The related bug is getting more useful now than it was when it was filed, but it also shows some misunderstanding of my position regarding breakages and QA in general.

So first of all, let me repeat (because I have said it before) that I prefer build-time failures to run-time failures. Why? It’s not difficult to understand when you think about it; it covers at least my three different use cases:

  • I have two remote vservers running Gentoo; I cannot build on them (and neither should you!), but I keep a local chroot environment where I pre-build binary packages for the software I need; I update the environment once daily, or when there are specific security concerns; when I do that, I build the packages and upload them to the servers, which can take quite a bit of time as my upload bandwidth is around 40 KiB/s;
  • also, I connect my whole house (my computers, my mother’s, and a bunch of various devices and appliances, plus the guests’ devices) not via a standard SoHo router, but via a much more flexible Gentoo box, running basically “embedded” on a Celeron board with only SSH and serial console access; I update the “firmware” of the system once in a blue moon, mostly for security concerns, by flashing a CF card connected to the EIDE bus; here, too, I use a chroot environment to build the packages;
  • finally, I have a laptop, also running Gentoo; this one is updated more irregularly, because there are days I don’t even touch it; I also try not to build stuff when I’m on the go, because I might not have enough bandwidth to download the sources (while connected to my local network, I access them directly).

In all three cases, build failures bother me, but they aren’t much of a problem: if I’m building something in a pinch, it’s a nuisance; in the other cases, upgrade or new build, I usually have time to look at what is broken before it becomes a real problem. A package failing at runtime, though, is much worse than a nuisance. If Apache dies, I have to downgrade and quickly rebuild another one; if the router’s SSH daemon crashes, I have to go downstairs with the laptop and access the serial console; and if at that moment the picocom tool I use to access the serial console doesn’t start, then I’m simply going to cry.

So with this situation at hand, can you blame me for preferring stricter compilers and build managers (don’t forget that Portage and its cousins are not just package managers à-la APT, but also build managers à-la FreeBSD Ports) that actually disallow installation of broken code? This is the same reason why we added --no-undefined to Ruby — rather than hiding the problems under the hood and letting the car melt down on failure, we notice the leak beforehand, and stop it from ever leaving the garage!
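
For those who never met the flag, here is a minimal sketch (hypothetical file names) of what --no-undefined changes when linking a shared object:

/* foo.c: calls a function that is defined nowhere */
extern int missing_function(void);
int foo(void) { return missing_function(); }

By default, gcc -shared -fPIC foo.c -o libfoo.so links just fine, and the dangling reference only blows up when the library is loaded; adding -Wl,--no-undefined makes the same link fail immediately, moving the failure from run time to build time.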

As Mike pointed out, and as I exemplified on the bug linked above, even if the code is, at a bird’s-eye view, perfectly fine, if it relies on a number of assumptions about how the compiler and the C library work, and the compiler is reporting a sure failure, it is going to fail. This is the same reason why, with GCC 4.5, we had a runtime breakage of GNU tar: it tried to fill two adjacent strings in a structure with a single copy, rather than two. It wasn’t going to create an actual overflow problem, but it was stepping over the final character; not only did the compiler warn about it but… the C library aborted at runtime!
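
The pattern looks roughly like this (my own reconstruction, not the actual GNU tar code, assuming _FORTIFY_SOURCE=2 as Gentoo sets it):

#include <string.h>

struct header {
  char name[8];  /* the copy below overruns this member… */
  char mode[8];  /* …spilling into this adjacent one */
};

int main(void) {
  struct header h;
  /* One 16-byte copy "into" an 8-byte member: it stays inside the
     struct, so in practice nothing is damaged, but _FORTIFY_SOURCE=2
     checks the size of the member itself, so the compiler warns and
     __memcpy_chk() aborts at runtime. */
  memcpy(h.name, "0123456789abcdef", 16);
  return h.mode[0];
}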

Now, you could argue that this means they do not discern between hacky-but-good solutions meant to improve performance and security-vulnerable code. But in truth, you have to take a stand at some point, and it makes total sense to be as safe as possible, especially in today’s world, where security should be a top concern and performance can easily be improved with faster hardware.

This should cover my reasons for being in favour of dying when the code provides a path where the C library is going to abort at runtime (and it’s not properly controlled). What about my earlier criticism of the unmasking of glibc and other software when they caused failures? Well, that is also two-fold. On one side, glibc 2.12 does not only cause build-time failures; as I explained, there are enough indications that software could abort at runtime when a header is missing from the sources. But whatever the problem is, let’s look at it from a need perspective: what is the reason for this failure to be introduced? Simply to clean up the code and make it faster to build (fewer headers mean fewer files to load and parse), by forcing users to include what they need rather than a long chain of dependent headers. It’s a laudable target, but mostly one of optimisation and enhancement.

On the other hand, fortification features are there to increase security, by mitigating possible vulnerabilities in the software’s design. This wide difference in the root reason for the breakage is alone what makes them different in my personal assessment of the problem. And because of this, I think there is a case for Portage aborting right away in ~arch for packages with known overflows, even though I maintain that glibc 2.12 and similar problems shouldn’t have been let loose on the users. The former does not make the system more broken, it makes it safer!

And tomorrow, “Is fixing implicit declarations for me?”

Some _FORTIFY_SOURCE far-fetched warnings are… funny!

So the new Portage release finally dies when sure-enough overflows are built, and that’s good news to me: we’ll finally get to fix these problems rather than ignoring them until they become a security issue.

But then, Kumba reported a failure in libtirpc, which I was the last one to play with (to fix NFS with Kerberos) — as you can understand, it was something I felt I had to fix myself, since I might as well have been the cause.

Interestingly, I could reproduce the problem on my own system, but not on the tinderbox; still, that was enough to give me an idea of where to start, and I was finally able to reduce the problem to this synthetic test case, out of the libtirpc 0.2.1 sources:

#include <netinet/in.h>
#include <netdb.h>
#include <string.h>

int main(int argc, char *argv[]) {
  struct sockaddr_in addr = {
    .sin_family = AF_INET,
    .sin_port = 0
  };
  struct hostent *hp;

  if ((hp = gethostbyname(argv[1])) == NULL)
    return -1;

  if (hp->h_length > sizeof(addr))
    hp->h_length = sizeof(addr);

  memcpy(&addr.sin_addr.s_addr, hp->h_addr, (size_t)hp->h_length);

  return 0;
}

Interestingly enough, if you compile it with the minimum flags required for _FORTIFY_SOURCE diagnostics to be emitted – gcc -O2 -Wall test.c – you won’t find any warning in the output; there isn’t a sure-fire overflow in that case. So let’s look deeper at the code.

At the highest level, there just isn’t space for any overflow: we’re calling gethostbyname(), which happens to only return IPv4 addresses, which are always 32-bit; so the h_length member of hostent is never going to be anything other than 4 (the size in bytes of the address). So the code is, by any means, safe.
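
For reference, this is the relevant part of the interface, abridged from glibc’s <netdb.h> (comments mine):

struct hostent {
  char  *h_name;       /* official name of the host */
  char **h_aliases;    /* alias list */
  int    h_addrtype;   /* address type, AF_INET here */
  int    h_length;     /* length of each address in bytes, 4 for IPv4 */
  char **h_addr_list;  /* NULL-terminated list of addresses */
};
/* h_addr is a backwards-compatibility macro for h_addr_list[0] */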

If you look at the actual code much harder, you notice that there is an extra failsafe: in any case, you’re going to copy at most sizeof(addr) bytes… and here’s the catch: addr is a sockaddr_in structure and, like all the other sockaddr structures, starts with a single 16-bit value indicating the actual address family. Additionally, what you’re actually copying into is the IP address itself (addr.sin_addr.s_addr), which is neither the first nor the second member of addr, but follows the port. This not only means that s_addr does not sit at the start of the structure, but also that the size being compared against is 8, not 4, so the example code above can be rewritten as:

#include <netinet/in.h>
#include <netdb.h>
#include <string.h>
#include <sys/param.h> /* for MIN() */

int main(int argc, char *argv[]) {
  struct sockaddr_in addr = {
    .sin_family = AF_INET,
    .sin_port = 0
  };
  struct hostent *hp;

  if ((hp = gethostbyname(argv[1])) == NULL)
    return -1;

  memcpy(&addr.sin_addr.s_addr, hp->h_addr, MIN(hp->h_length, 8));

  return 0;
}

It should now be much easier to identify where the problem is: hp->h_length can either be 4 (or less — more to follow), or anywhere between 5 and 8 inclusive. But once again, GCC does not complain about it with -O2, because for the compiler this is not surely an overflow, as the value is actually variable: it can fall either in the 0–8 range, or between 0x80000000 and 0xFFFFFFFF (the negative range). So what causes it to understand that there is a dangerous situation? An old friend of mine, actually: -ftracer.

The description in the man page is not really something you’d look out for, but here it is:

Perform tail duplication to enlarge superblock size. This transformation simplifies the control flow of the function allowing other optimizations to do better job.

What it means is that it allows different branches to duplicate their code, allowing for further optimisation of corner cases. I have reported before that this caused a build failure in MPlayer last year, because a segment of handwritten inline asm was using global labels rather than local labels; while it was present on an already-joined branch, gcc -ftracer split it in two to optimise it better. With a single copy of the asm, the global labels worked just fine, but once it was inlined twice, the labels conflicted. It was trivial to fix, but it took a while to track down.
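
To give an idea of the class of problem, here is a hypothetical reduction (not MPlayer’s actual code):

static inline int increment(int x) {
  /* "my_label" is a global assembler label: if -ftracer duplicates the
     code path containing this asm, the label gets emitted twice and the
     assembler bails out with a "symbol already defined" error. A numeric
     local label ("1:", referenced as "1b"/"1f") can be defined any
     number of times and would be safe here. */
  __asm__ ("my_label: addl $1, %0" : "+r"(x));
  return x;
}

int main(void) { return increment(41) - 42; }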

Now, thanks to the help of -S -O2 -ftracer -fverbose-asm, I could tell what was going on by looking at the generated asm; if I translate that back to C code (you don’t really want to look at x86 asm when trying to track down a logic bug!), the result is something very similar to this (it would be closer if I used goto statements, but I’d rather not):

#include <netinet/in.h>
#include <netdb.h>
#include <string.h>

int main(int argc, char *argv[]) {
  struct sockaddr_in addr = {
    .sin_family = AF_INET,
    .sin_port = 0
  };
  struct hostent *hp;

  if ((hp = gethostbyname(argv[1])) == NULL)
    return -1;

  if (hp->h_length >= sizeof(addr))
    memcpy(&addr.sin_addr.s_addr, hp->h_addr, sizeof(addr));
  else
    memcpy(&addr.sin_addr.s_addr, hp->h_addr, (size_t)hp->h_length);

  return 0;
}

And indeed, if you build that code with gcc -O2 -Wall, it will report the same overflow warning.

My first attempt at fixing this addressed only the obvious problem: the code should have compared h_length against sizeof(addr.sin_addr.s_addr) rather than sizeof(addr), so that it would compare against 4, not 8… and that’s what I first committed. Then I thought a bit more about the code, and noted that there are quite a few problems with code such as this:

  • if the address is reported as IPv6 (although I admit I’m not sure whether gethostbyname() does that at all!), there is no way that the truncated address works at all; for my own IPv6 subnet, it would report 2002:5e5f, which does not even contain my full IPv4 address (it is a 6-to-4 mapping);
  • if the address has a size shorter than 32 bits, it’ll cause an over-read on the other side, as more data is copied from the hostent array than is present — do note that it’s not a security problem or a possible buffer overflow, as the returned array is NULL-terminated, so it either reads garbage from the following entries, or it reads the zero bytes of the NULL termination; the NULL terminator is at least 32 bits wide on any recent architecture, as far as I know;
  • everything can break if the hostent result actually provides a negative length; in such a case you’d be reading a huge number of bytes, causing a segfault at one point or another; I don’t think that case is ever going to happen, though, so it shouldn’t be of concern.

Given these details, I updated the patch during the same sync cycle, so that if the length is not the same between the hostent result and addr.sin_addr.s_addr, the getrpcport() call will simply fail.
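
Conceptually, the final check looks like this (a sketch of the idea, not the literal libtirpc patch):

/* refuse any address whose length does not match an IPv4 address
   exactly, rather than clamping and copying a wrong or partial one */
if (hp->h_length != sizeof(addr.sin_addr.s_addr))
  return 0; /* getrpcport() reports failure */
memcpy(&addr.sin_addr.s_addr, hp->h_addr, sizeof(addr.sin_addr.s_addr));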

Is the bug exploitable? I don’t think so, but I wouldn’t bet my life on it.

I’m not entirely sure whether it’s the client or the server that calls getrpcport(), but it is going to ask the question of a nameserver and, from there on, you’re working on untrusted data. If gethostbyname() reports AF_INET6 hosts as well, an attacker could rewrite the four bytes past the end of the addr object; if the code is built with optimisations turned on, that’s the only object on the stack, and I sincerely have no idea whether it’s the parameters or the return address that sits at that point on the stack; if it’s the parameters, it shouldn’t really matter at all; the return address might be a bit trickier, but on 64-bit systems it wouldn’t be able to do much harm beyond possibly causing either server or client to crash by jumping into a non-executable memory area.

At any rate, I hope you now have a better idea of why I’m interested in this stuff actually being looked at properly, and not simply ignored.

Are you ready for the next phase?

While the LDFLAGS testing is still not complete (and if we are to properly fix any and all of the LDFLAGS-related problems, it’s going to take a few years still), I’ve got an idea for the next phase of bugs to report through the tinderbox, and I have started reporting the first few bugs on the matter; thankfully, Kapcer has already started fixing them, which at least makes me feel not entirely useless.

As you can see from this bug, the kind of problems I’m now looking at are those reported by the _FORTIFY_SOURCE feature of recent GCC/GLIBC; I’ve been stirred up by the recent article on security mitigation strategies, which made Gentoo Hardened look like a very shiny option compared to other, more frequently used distributions — to be honest, I don’t know many people running a Gentoo Hardened desktop outside the general circle of developers, while Ubuntu and Fedora, at least, are clearly aimed at the desktop.

As you can see for yourself in that article, there is one feature that is applied to almost all processes: _FORTIFY_SOURCE; to be precise, it is the one feature that is not limited to Gentoo Hardened, but is applied to standard Gentoo as well. It’s easy to enable because it is implemented entirely in user space – there is no need to run a special kernel – and it doesn’t require heavy changes in behaviour that might cause further bugs (like the old SSP system did). Gentoo has been enabling it by default since GCC 4.3.3-r1 — at least as long as -O2 is in use, otherwise GLIBC will disable it and it won’t protect you; so don’t expect to get a more stable/secure system by disabling optimisations, okay?
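
If you want to check what your own toolchain does, a quick sketch like the following works (my own snippet; on Gentoo the macro is injected by the GCC spec file, so no -D flag is needed):

#include <stdio.h>

int main(void) {
#ifdef _FORTIFY_SOURCE
  /* the macro being defined is only half of the story: glibc still
     ignores it unless optimisation is enabled */
  printf("_FORTIFY_SOURCE=%d\n", _FORTIFY_SOURCE);
#else
  printf("_FORTIFY_SOURCE is not defined\n");
#endif
  return 0;
}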

The fact that it works without having to change the kernel at all is particularly important, at least to me: the server where this blog is hosted uses vserver technology, so I have no control over which kernel is executed; given this, I cannot rely on stronger mitigation techniques like PaX and strong ASLR (and since I lack strong ASLR, I don’t use PIE either).

Anyway, one thing that is not very well known is that _FORTIFY_SOURCE does not only work at run time; it also increases the efficacy of the build-time analysis, by emitting warnings when an overflow can be identified at build time at all. Portage already shines a light on these warnings, by repeating them at the end of the merge when they happen, but I asked Zac to give them more weight, by considering them fatal. In the meantime, I’ve decided I’ll report such bugs on our Bugzilla; hopefully this should help improve upstream quality and reduce the chances that we let security issues sneak in under our noses.

And here I have to thank Sebastian and Zac for making it possible to get the maintainer’s data right away in a build log, so that I could drop the silly hack I added to the tinderbox.

Again, this is not really fixing all the trouble, and it most definitely only catches basic situations where the problem is far from being a security issue; but at a minimum it instils a basic sense of doing things the right way, which is helpful in forming a security culture, or at least I hope so.

_FORTIFY_SOURCE, optimisations, and other details

While writing about the -O0 optimisation level, I found some interesting problems related to the fortified sources support in recent GCC and GLIBC releases, which we enable by default since GCC 4.3.3-r1, and which is used as a partial replacement for stack-smashing protection (which, hopefully, thanks to Magnus, will soon come back to Gentoo!).

As I said in that post, the special fortified functions (those available in the form of __$func_chk in the libc.so file, providing warnings at build time and proper stack traces at runtime) only get enabled if inline functions are enabled, so they are totally ignored at -O0 (merely disabling inlines with -fno-inline won’t stop them from being used, though). This means that if some software does not respect the CFLAGS variable and uses -O0 in whatever context, it is currently not using fortified sources, and as such it can crash without warning about it beforehand.

But lack of optimisation – which is, luckily, quite a rare occurrence – is not the only thing that may stop the fortified versions of the functions from being emitted. Since the fortified versions are declared (and the wrappers defined) in files such as /usr/include/bits/stdio2.h, you need to include those headers properly for them to work. Unfortunately, it’s way too common for projects not to take headers seriously, leaving functions to be implicitly declared. And an implicit declaration does not take care of the fortified sources.
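
The wrappers in question look more or less like this; the following is a paraphrased, simplified sketch, not glibc’s verbatim code:

#include <stddef.h>

extern int __sprintf_chk(char *s, int flag, size_t slen,
                         const char *format, ...);

/* this always-inline wrapper is what maps sprintf() onto its checked
   variant; if the header defining it is never included, the wrapper
   does not exist and the plain, unchecked sprintf symbol is called */
extern __inline __attribute__((__always_inline__, __gnu_inline__)) int
sprintf(char *s, const char *format, ...)
{
  return __builtin___sprintf_chk(s, 1, __builtin_object_size(s, 1),
                                 format, __builtin_va_arg_pack());
}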

On the other hand, while testing these little differences, I found some things that aren’t obvious at all about the way GCC’s optimisations interact with fortified sources. For instance, this is the code I used to test implicit declarations:

#ifndef IMPLICIT
# include <stdio.h>
#endif

int main() {
  char foo[12];
  return sprintf(foo, "ABCDEFGHIJKLMNOPQRSTUVWXYZ");
}

When built with -DIMPLICIT -O1 (thus causing sprintf() to be implicitly declared), the produced executable does not crash, while it does crash with -O0 (see why I say -O0 produces different code?). The reason is that GCC picks up the call to sprintf() as a built-in function, whose semantics it knows. In general, GCC cannot drop function calls that merely might have side effects, but in this case it knows exactly what the side effect is (writing to the foo variable); its parameters are also known, so it can replace the function call with its effects straight at build time. But since the foo variable is never read, the whole copy is moot. The only thing we care about from the sprintf() call is the return value, which represents the length of the content written to the string, and is also constant: the compiler knows it would be 26.

Indeed, the code emitted for that function, compiled that way, is as follows:

main:
        movl    $26, %eax
        ret

Interesting, isn’t it? Now, this has two related side effects: the first is that the fortified function is not used, so no warning is printed at build time; the other is that the code will not crash at runtime, because the buffer overflow is gone altogether! Now, can this be considered a bug in GCC? I don’t think so; it’s the code that is wrong to begin with. But you can see how disabling optimisations has now introduced a crash site. You can see how the code would otherwise behave by adding -fno-builtin to the compiler flags: this removes the semantic knowledge of the call from the optimisers, resulting in a straight call to sprintf().

But this post needs at least a bottom line to be worth anything, so here are a few:

  • Gentoo treasures stability and security; that is why we enable fortified sources by default. If what you desire is pure speed, without caring about safety, you may add -D_FORTIFY_SOURCE=0 to your CFLAGS variable; this will override the fortified sources as enabled by the Gentoo spec file. If you do so, though, please do not file bugs unless you provide a patch with them as well;
  • even with the tinderbox at hand, I have next to no way to find out whether the fortified functions are used or not; as far as I can tell, the fortified version is not emitted when the relevant parameters are not build-time constants (obviously);
  • one way to at least reduce the impact of possibly skipped fortified checks is to get rid of implicit declarations; Portage already takes care of reporting them as QA warnings at the end of the merge; unfortunately, most maintainers won’t appreciate patches that remove implicit declarations, because they are usually boring and need to be redone from release to release if not properly sent upstream; if you care about security and safety, please do take care of those warnings; if you’re not the maintainer, send the patches upstream, asking nicely for them to be added to the next release;
  • even when we get no warnings about implicit declarations, there is the risk that some stupid software decides to declare the system functions manually, rather than relying on the declarations provided by the C library; this usually happens when the software is trying to be portable but wants to feel smarter than average; it goes without saying that it becomes a mess to identify such software and properly fix it up; a sketch of the anti-pattern follows below.
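
To be explicit about that last point, this is the kind of anti-pattern I mean (a made-up snippet):

/* DON'T DO THIS: a hand-rolled declaration "for portability". It
   silences the implicit-declaration warning, but since it does not
   come from <string.h>, the call is never mapped onto __strcpy_chk()
   and the fortified check is silently lost. */
char *strcpy(char *dest, const char *src);

int main(void) {
  char buf[4];
  strcpy(buf, "too long");  /* overflows buf, with no check to catch it */
  return 0;
}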