This Time Self-Hosted
dark mode light mode Search

Respecting CFLAGS and CXXFLAGS, reality testing!

As I have written in my post Flags and flags, I think that one way out of the hardened problem would be to actually respect the CFLAGS and CXXFLAGS the user requests so that they actually apply to the ebuilds. Unfortunately, not all the ebuilds in the tree respect the flags, and finding out which ones do and which ones don’t hasn’t been, up to now, an easy task.

There are many reasons for this, the most common one is to look at the build output and spot that all the compile lines lack your custom flags, but this is difficult to automate, another option is to inject a fake definition option (-DIWASHERE) and grep for it in the build logs, but this is messed up if you consider that a package might ignore CFLAGS just for a subset of its final outputs.

While I was without Enterprise I spent some time thinking about this and I came to find a possible solution, which I’m going to experiment on Yamato, starting tonight (which is Friday 29th for what it’s worth).

The trick is that GCC provides a flag that allows you to include an extra file, unknown to the rest of the code. With a properly structured file, you can easily inject some beacon that you can later pick up.

And with a proper beacon injected in the build files, it shouldn’t be a problem to check using scanelf or similar tools if the flags were respected.

The trick here is all in the choice of the beacon and in looking it up; the first requirement for the proper beacon is to make sure it does not intrude and disrupt the code or the compilation, this means it has to have a name that is not common, and thus does not risk to collide with other pieces of code, and won’t clash between different translation units.

To solve this, the name can be just very long so that it’s impractical that somebody might have used it for a funciton or variable name, let’s say we call that beacon cflags_test_cflags_respected. This is the first step, but it still doesn’t solve the problem of clashing traslation units. If we were to write it like this:

const int cflags_test_cflags_respected = 1234;

two translation units with that in them, linked together, will cause a linker error that will stop the build. This cannot happen or it’ll make our test useless. The solution is to make the symbol a common symbol. In C, common symbols are usually the ones that are declared without an initialisation value, like this:

int cflags_test_cflags_respected;

Unfortunately this syntax doesn’t work on C++, as the notion of common symbol hasn’t crossed that language barrier. Which means that we have to go deeper in the stack of languages to find the way to create the common symbol. It’s not difficult, once you decide to use the assembly language:

asm(".comm cflags_test_cflags_respected,1,1");

will create a common symbol of size 1 byte. It won’t be perfect as it might increase the size of .bss section for a program by one byte, and thus screw up perfect non-.bss programs, but we’re interested in the tests rather than the performance, as of this moment.

There is still one little problem though: the asm construct is not accepted by the C99 language, so we’ll have to use the new one instead: __asm__, that works just in the same way.

But before we go on with this, there is something else to take care of. As I have written in the entry linked at the start of this one, there are packages that mix CFLAGS and CXXFLAGS. As we’re here, it could be easy to just add some more test beacons that track down for us if the package has used CFLAGS to build C++ code or CXXFLAGS to build C code. With this in mind, i came to create two files: flagscheck.h and flagscheck.hpp, respectively to be injected through CFLAGS and CXXFLAGS.

flame@yamato ~ % sudo cat /media/chroots/flagstesting/etc/portage/flagscheck.h
#ifdef __cplusplus
__asm__(".comm cflags_test_cxxflags_in_cflags,1,1");
#else
__asm__(".comm cflags_test_cflags_respected,1,1");
#endif
flame@yamato ~ % sudo cat /media/chroots/flagstesting/etc/portage/flagscheck.hpp
#ifndef __cplusplus
__asm__(".comm cflags_test_cflags_in_cxxflags,1,1");
#else
__asm__(".comm cflags_test_cxxflags_respected,1,1");
#endif

And here we are, now it’s just time to inject these in the variables and check for the output. But I’m still not satisfied with this. There are packages that, mistakenly, save their own CFLAGS and propose them to other programs that are linked against; to avoid these to falsify our tests, I’m going to make the injection unique on a package level.

Thanks to Portage, we can create two functions in the bashrc file, pre_src_unpack and post_src_unpack, in the former, we’re going to copy the two header files in the ${T} directory of the package (the temporary directory), then we can mess with the flags variables and insert the -include command. This way, each package will get its own particular path; when a library passes the CFLAGS assigned to itself to another package, it will fail to build.

pre_src_compile() {
    ln -s /etc/portage/flagscheck.{h,hpp} "${T}"

    CFLAGS="${CFLAGS} -include ${T}/flagscheck.h"
    CXXFLAGS="${CXXFLAGS} -include ${T}/flagscheck.hpp"
}

After the build completed, it’s time to check the results, luckily pax-utils contains scanelf, which makes it piece of cake to check whether one of the four symbols is defined, or if none is (and thus all the flags were ignored), the one line function is as follow:

post_src_compile() {
    scanelf "${WORKDIR}" 
        -E ET_REL -R -s 
        cflags_test_cflags_respected,cflags_test_cflags_in_cxxflags,cflags_test_cxxflags_respected,cflags_test_cxxflags_in_cflags
}

At this point you just have to look for the ET_REL output:

ET_REL cflags_test_cflags_respected /var/tmp/portage/sys-apps/which-2.19/work/which-2.19/tilde/tilde.o 
ET_REL cflags_test_cflags_respected /var/tmp/portage/sys-apps/which-2.19/work/which-2.19/tilde/shell.o 
ET_REL  -  /var/tmp/portage/sys-apps/which-2.19/work/which-2.19/getopt.o 
ET_REL cflags_test_cflags_respected /var/tmp/portage/sys-apps/which-2.19/work/which-2.19/bash.o 
ET_REL  -  /var/tmp/portage/sys-apps/which-2.19/work/which-2.19/getopt1.o 
ET_REL cflags_test_cflags_respected /var/tmp/portage/sys-apps/which-2.19/work/which-2.19/which.o

And it’s time to find out why getopt.o and getopt1.o are not respecting CFLAGS while the rest of the build is.

Comments 5
  1. I think newish gcc versions can actually build the CFLAGS used directly into the binaries. Might be another way to approach it.

  2. Yes I thought about that, but the problem is that you’d have to inject a beacon anyway (because you can’t just assume that because it has, for instance @-march=barcelona -O2 -ftracer -pipe -ftree-vectorize -Wformat=2 -Wno-error -Wno-pointer-sign -g -ggdb -Wstrict-aliasing=2 -Wno-format-zero-length@ (yes these are my CFLAGS 😉 ) in CFLAGS it has picked them up from your CFLAGS variable _for that build_.A very good example of this is, unfortunately, perl: perl build respects your CFLAGS, but the problem is it saves them in its configuration files. On the other hand, all the Perl native extensions built using XS will ignore your CFLAGS settings for your build, and use the ones used to build perl. Which of course, if I use the technique described above, breaks because the header file is cleaned up after a successful build. I reported this in “bug #236200”:http://bugs.gentoo.org/show…, hopefully somebody knowing Perl and XS better than me can help out on that.Now while injecting just a dummy @-D@ switch might actually work, it does not allow a control as fine as the one with the two header files (about flags mixing). On the other hand that approach would be much more discreet, not causing build failures in case of injection (but then again, it’d make it more difficult to track down the source, I think), and not causing build failures when stuff like FFmpeg expects no extraneous symbols in a compiled object file (I think this has been fixed already though, but I’ll have to dig it up and I haven’t done so yet).There is at least one other complication with my approach though: when the compiler is used to build preprocessed assembler files, it will fail because of them asm() statement not being valid assembler. It’s just a matter of protecting the whole headers with an @#ifndef __ASSEMBLER__@ though.

  3. You could try defining a static const like:static const int long_variable_name = 1234;As I understand, the static keyword makes the declaration private to the compilation unit. This means it shouldn’t cause linking problems because of multiple symbol definitions.

  4. The problem with using static is that, as it’s local to the translation unit, it’s only present for @nm@ output as long as you don’t use optimisations. As soon as you turn on @-O1@, DCE will get rid of the symbol altogether.Even though I don’t turn on optimisations in the chroot, some packages do force that on, and will not have the symbol present on their translation units.

  5. But you could use “__atribute__ ((used)) ” at declaration.It says that variable is used, even though optimisator might think otherwise and instruct compiler to keep it.I have just tried on short test prog and it works even on -O3…

Leave a Reply to brankob@avtomatika.comCancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.