Mailbox: about the debug USE flag

I’m back discussing the backtrace guide that I wrote some time ago. Fredrik asks by mail:

[…] after reading the bt-article I’m starting to think that the debug USE, more or less simply enables the developers own debugging stuff (like noisy extra output during run-time, assertions (which, afaik, are removed by -DNDEBUG which isn’t that uncommon in “release mode” cf. “debug mode”), etc).

So, my question is simply: if I want debugging symbols I should stay away from the debug USE and only add “-ggdb” to my CFLAGS/CXXFLAGS and put splitdebug in my FEATURES?

The answer is not as simple as I’d like it to be. Fredrik is definitely right in assessing that the debug USE flag, at least in most cases, is used to enable “debug mode” builds. Assertions, debug verbosity, extra safety checks: all this kind of stuff, which is useful for developers during a debug session and totally useless (or even harmful) for production builds, can be tied to the debug USE flag. But since we’re Gentoo, we cannot do things this cleanly. Unfortunately.

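While the QA team, as far as I know, agrees on the meaning of the debug USE flag, not all developers seem to accept that, so you may well find multiple meanings of it right now: enabling the package’s own debug code, asking the build system to emit debug symbols (for instance with -ggdb), or otherwise toying with the CFLAGS the user chose.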

Or most likely, a combination of the above. Note that the official meaning has been, for a long time, the following:

debug – Enable extra debug codepaths, like asserts and extra output. If you want to get meaningful backtraces see http://www.gentoo.org/proj/en/qa/backtraces.xml

The reason why the QA team would like the USE flag to only handle the first case is that the other two make little to no sense in the context of Gentoo. Even if you ask the build system to emit debug symbols with -ggdb, unless the proper FEATURES are set in Portage they’ll be stripped off before the merge to the live filesystem (and don’t even think of using RESTRICT=strip!). Toying with CFLAGS is also a very bad idea, because you’re second-guessing the user (and we have enough ways to handle per-ebuild CFLAGS to deal with that). Enabling the debug code is instead something that only a USE flag can do cleanly.

Unfortunately it’s not just the opinions of the QA team or of the maintainers that are on the table here: some upstream developers, like the Drizzle developers, seem not to understand that backtraces can be valid even when using optimisation (at least up to -O2), or that you can get useful information out of a backtrace even without enabling assertions and the like. Actually, to find out why something crashed in production, you want the backtrace to be produced without the debug code, otherwise you’re debugging two different programs!

Now, I sincerely think that the description of the debug USE flag is too generic, but you cannot give more details about it without becoming specific to one package or one set of packages. Fortunately, Doug and I solved that some time ago (over two and a half years ago, to be precise) by allowing per-package extended USE flag descriptions in metadata.xml files. And indeed there are quite a few packages that do document it. In one case (app-portage/eix), even admitting they are doing it wrong.

I could write a whole post about how I would expect app-portage packages to be a notch above the rest of the tree, given that they are packages written specifically for Gentoo, and should know and comply with the policies we set out for packages. The fact that at least two fail to do that is a bad sign.

It is important to write some notes about the -g flags here, before people get confused. You can neither always disable them nor always enable them in Gentoo itself. The latter is a common suggestion I read: “since Gentoo already strips the debug information, why not always emit it and leave the user to just tinker with the stripping features?” Well, the answer is actually easy to understand: emitting the debug information is not a trivial task, as the compiler needs to write more data to the assembler, which then needs to encode it as DWARF data (and Arnaldo from the dwarves/perf projects has shown me how messed up that is; trust me, it is a lot). It might sound very easy when you look at very simple source files, but it becomes tremendously complex when you have source files thousands of lines long or, worse, C++ source files with templates. Not only is it a lot of work for the assembler (and linker), but it also increases the size of the temporary object files (the products of compile+assembly) and of the final shared objects and executable files, before stripping. If you build in RAM, this means you’ll need more memory to store the build directory; if you build on disk, it increases the amount of data that gets written to and read from the disk, slowing down the process quite a bit in some cases.

At the same time, we cannot use (or suggest the use of) -g0 to disable debug information generation entirely. While the idea looks nice, and I also thought about it a bit, I wouldn’t be surprised if some packages actually made use of basic debug information; of course in those cases you’d also have RESTRICT=strip (and there are a number of those packages in the tree already). So you might want to try it by yourself, but you’re on your own if you shoot yourself in the foot with that.

One could ask here why the debug USE flag is so package-specific, and why one shouldn’t just use -DNDEBUG to disable all the asserts to begin with. The answer is that, unfortunately, a huge amount of software fails to work when assertions are disabled, because it treats its asserts as the main error handling feature, while asserts should really just be used to verify situations that should never happen. Once you disable those asserts, such software has no graceful handling of errors left, so it might crash badly or (much worse) corrupt the data before crashing. So before deciding whether you can add the debug USE flag, you have to find out whether the package actually supports being built with its assertions disabled.
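To make the distinction concrete, here is a minimal sketch in C of the split between an assert for “cannot happen” conditions and a real check for errors that can happen in production. The function name parse_port and its return convention are hypothetical, made up for this illustration and not taken from any package in the tree.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, not from any real package: parse a TCP port number
 * from untrusted text. Returns 0 on success, -1 on invalid input. */
int parse_port(const char *text, int *port_out)
{
    /* Programmer error: callers must never pass a NULL output pointer.
     * An assert documents that invariant, and it is acceptable for the
     * check to vanish when the code is built with -DNDEBUG. */
    assert(port_out != NULL);

    /* Runtime error: bad input can happen in production, so it gets a
     * real check that is still there when asserts are compiled out. */
    if (text == NULL || *text == '\0')
        return -1;

    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || value < 1 || value > 65535)
        return -1;

    *port_out = (int)value;
    return 0;
}
```

Had the range check been written as an assert instead, a build with -DNDEBUG would silently accept a port like 70000, which is exactly the kind of package that cannot safely have its assertions removed.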

Also, note that not all assertions are disabled by defining NDEBUG; for instance, g_assert() from glib is disabled via another macro (which, if I recall correctly, is G_DISABLE_ASSERT), but even that might not be a good idea to disable unless you know what you’re doing.

I sincerely try my best to have both the upstream packaging and the ebuild abide by this rule: --enable-debug in feng adds some further debug messages, and adds functions that fully free the resources before exiting (useful to remove false positives from valgrind), while --disable-debug removes all of that and all the assertions.
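As a rough illustration of what such a switch can look like inside the code, here is a small sketch in C. It is not feng’s actual source: the EXAMPLE_DEBUG macro, the DEBUG_LOG helper and the session_* functions are all hypothetical names, and I am assuming the build system defines EXAMPLE_DEBUG only when the package is configured with --enable-debug.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* EXAMPLE_DEBUG is a hypothetical macro; the assumption is that the build
 * system defines it only when configured with --enable-debug. */
#ifdef EXAMPLE_DEBUG
#define DEBUG_LOG(msg) fprintf(stderr, "debug: %s\n", (msg))
#else
#define DEBUG_LOG(msg) ((void)0)
#endif

static char *session_buffer = NULL;

/* Allocate a long-lived buffer; returns 0 on success, -1 on failure. */
int session_start(size_t size)
{
    /* Invariant on the caller, not an error path. */
    assert(size > 0);

    session_buffer = malloc(size);
    if (session_buffer == NULL)
        return -1;
    DEBUG_LOG("session buffer allocated");
    return 0;
}

/* Returns 1 while the buffer is still allocated, 0 otherwise. */
int session_is_active(void)
{
    return session_buffer != NULL;
}

/* Called just before exit. Without EXAMPLE_DEBUG this does nothing and
 * the memory is left for the operating system to reclaim; with it, the
 * buffer is freed so that leak checkers only report real leaks. */
void session_shutdown(void)
{
#ifdef EXAMPLE_DEBUG
    free(session_buffer);
    session_buffer = NULL;
    DEBUG_LOG("session buffer released");
#endif
}
```

The point of the sketch is that none of this can be reached by changing CFLAGS alone in a sane way: it is a build-time choice of code paths, which is what a USE flag is for.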

At any rate, the bottom line is that, as usual, Gentoo lacks consistency in its design and decisions. We really should start applying this kind of consistency to new packages as they are added, but I fear most reviewers, including the Sunrise reviewers, will forget about these details. I’m not asking for perfection here, but at least having a clue?
