Cleaning up after yourself

I noted in my post about debug information that in feng I’m using a debug codepath to help me reduce false positives in valgrind. When I wrote that, I looked up an older post that promised an explanation but never delivered one. A year later, I guess it’s time for me to explain, and possibly later to document this in the lscube documentation that I’ve been trying to maintain to describe the whole architecture.

The problem: valgrind is an important piece of equipment in the toolbox of a software developer; but like any other tool, it’s just a tool, in the sense that it’ll blindly report the facts, without caring about the intentions the person writing the code had at the time. Forgetting this leads to situations like Debian DSA-1571 (the OpenSSL debacle), where an “uninitialised value” warning was taken as a bug to fix, while the behaviour was actually pretty much intended. At any rate, the problem here is slightly different, of course.

One of the most important reasons to use valgrind is to find memory leaks: memory areas that are allocated but never properly freed. This kind of error can make software unreliable or outright unusable in production, so testing for memory leaks is among the primary concerns of most seasoned developers. Unfortunately, as I said, valgrind doesn’t understand the intentions of the developers, and in this context it cannot tell apart memory that leaks (or rather, that is “still reachable” when the program terminates) from memory that is in use until the program stops. Indeed, since the kernel will free all the memory allocated to the process when it ends, it’s common practice to simply leave it to the kernel to deallocate those structures that are needed until the end of the program, such as configuration structures.

But the problem with leaving these structures around is that you either have to ignore the “still reachable” values (which might actually hide some real leaks), or you get a number of false positives introduced by this practice. To remove the false positives, it is not too uncommon to free the remaining memory areas before exiting, something like this:

extern myconf *conf;

int main(void) {
  conf = loadmyconf();   /* needed for the whole run */
  process();
  freemyconf(conf);      /* only here to keep valgrind quiet */
  return 0;
}

The problem with having code written this way is that even just the calls to free up the resources cause some overhead, and especially for small fire-and-forget programs those simple calls can become a nuisance. Depending on the kind of data structures to free, unwinding them in an orderly fashion can actually take quite a bit of time. A common alternative is to guard the resource-free calls with a debug conditional, of the kind I have written about in the other post. Such a solution usually ends up being #ifndef NDEBUG, so that the same macro gets rid of both the assertions and the resource-free calls.
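As a minimal sketch of that guard, assuming a made-up myconf structure and hypothetical loadmyconf(), freemyconf() and run() helpers (none of this is feng code), the free call can be compiled only into debug builds like so:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical configuration structure, for illustration only. */
typedef struct {
    char *name;
} myconf;

/* Allocates a configuration holding a copy of name; NULL on failure. */
static myconf *loadmyconf(const char *name)
{
    myconf *c = malloc(sizeof(*c));
    if (c == NULL)
        return NULL;

    size_t len = strlen(name) + 1;
    c->name = malloc(len);
    if (c->name == NULL) {
        free(c);
        return NULL;
    }
    memcpy(c->name, name, len);
    return c;
}

#ifndef NDEBUG
/* Only compiled in debug builds, like the call that uses it. */
static void freemyconf(myconf *c)
{
    free(c->name);
    free(c);
}
#endif

/* Stand-in for the body of main(); returns the exit status. */
static int run(const char *name)
{
    myconf *conf = loadmyconf(name);
    if (conf == NULL)
        return EXIT_FAILURE;

    assert(conf->name != NULL);   /* disappears with -DNDEBUG */
    printf("running with configuration '%s'\n", conf->name);

#ifndef NDEBUG
    /* Debug builds free everything so valgrind reports nothing. */
    freemyconf(conf);
#endif
    /* Release builds skip the free and let the kernel reclaim it. */
    return EXIT_SUCCESS;
}
```

Building with -DNDEBUG drops both the assertion and the cleanup in one go, which is the appeal of reusing that macro.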

This works out quite decently when you have simple, straightforward top-down software, but it doesn’t work so well when you have a complex (or chaotic, if you prefer) architecture like feng’s. In feng, we have a number of structures that are only used by a range of functions, which are themselves constrained within a translation unit. They are, naturally, variables that you’d consider static to the unit (or even static to the function, from time to time, but that’s just a matter of visibility to the compiler; function or unit static does not change a thing). Unfortunately, once you make them static to the unit, you need an externally-visible function to properly free them up. While that is not excessively bad, it’s still going to require quite a bit of work jumping between the units, just to get some cleaner debug information.

My solution in feng is something I find much cleaner, even though I know some people might well disagree with me. To perform the orderly cleanup of the remaining data structures, rather than having uninit or finalize functions called at the end of main() (which would then require me to properly handle errors in sub-procedures so that they end up calling the finalisation from main()!), I rely on the presence of the destructor attribute in the compiler. Actually, I check with autoconf whether the compiler supports this not-too-uncommon feature, and if it does, and the user requested a debug build, I enable the “cleanup destructors”.

Cleanup destructors are simple unit-static functions declared with the destructor attribute; the compiler sets them up to be called as part of the _fini code, when the process is torn down. That covers both an orderly return from main() and a call to exit(), which is just what I was looking for; abort() and _exit() are the exceptions, since they terminate the process without running any of this. Since the function is already within the translation unit, the variables don’t even need to be exported (and that helps the compiler, especially when they are only used within a single function, or at least I sure hope so).
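To make the idea concrete, here is a small sketch of a translation unit that owns some unit-static state and tears it down from a cleanup destructor. The names list and its functions are hypothetical, and the __GNUC__ test is only an assumed stand-in for a proper configure-time check:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: a plain __GNUC__ test replaces the autoconf check here;
 * it is true for gcc and for compilers that mimic it, such as clang. */
#if !defined(NDEBUG) && defined(__GNUC__)
# define CLEANUP_DESTRUCTOR __attribute__((__destructor__))
#endif

/* Unit-static state: nothing outside this file can see it. */
static char **names;
static size_t names_used;

/* Appends a copy of name; returns 0 on success, -1 on failure. */
int names_add(const char *name)
{
    assert(name != NULL);

    char **grown = realloc(names, (names_used + 1) * sizeof(*names));
    if (grown == NULL)
        return -1;
    names = grown;

    size_t len = strlen(name) + 1;
    names[names_used] = malloc(len);
    if (names[names_used] == NULL)
        return -1;
    memcpy(names[names_used], name, len);
    names_used++;
    return 0;
}

/* Number of names stored so far. */
size_t names_count(void)
{
    return names_used;
}

#ifdef CLEANUP_DESTRUCTOR
/* Debug builds only: runs after main() returns or exit() is called,
 * so no other unit needs to know this cleanup exists. */
static void CLEANUP_DESTRUCTOR names_uninit(void)
{
    for (size_t i = 0; i < names_used; i++)
        free(names[i]);
    free(names);
    names = NULL;
    names_used = 0;
}
#endif
```

A program using this unit only ever calls names_add() and names_count(); in a debug build valgrind should then find the list already freed at exit, while a release build leaves it for the kernel.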

In one case the #ifdef conditional actually switches a variable from being stack-based to being static to the unit (which changes the final source code of the project quite a bit), since the reference to the head of the array of listening sockets is only needed when iterating through them to set them up, or when freeing them; if we don’t free them (non-debug build), we don’t even need to save it.
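The switch could look roughly like the sketch below. This is not the feng source: the listener type, listeners_setup() and the __GNUC__ stand-in for the configure check are all assumptions made for the sake of the example, and the setup function is expected to be called only once.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumption: stand-in for the configure-time attribute check. */
#if !defined(NDEBUG) && defined(__GNUC__)
# define CLEANUP_DESTRUCTOR __attribute__((__destructor__))
#endif

/* Hypothetical record for one listening socket. */
typedef struct {
    int port;
} listener;

#ifdef CLEANUP_DESTRUCTOR
/* Debug build: remember the head of the array so it can be freed. */
static listener *listeners;

static void CLEANUP_DESTRUCTOR listeners_uninit(void)
{
    free(listeners);
    listeners = NULL;
}
#endif

/* Sets up one listener per port; returns how many were configured. */
size_t listeners_setup(const int *ports, size_t count)
{
#ifndef CLEANUP_DESTRUCTOR
    /* Release build: a local pointer is enough, since the array is
     * never freed and is left for the kernel to reclaim. */
    listener *listeners;
#endif

    assert(ports != NULL);

    listeners = calloc(count, sizeof(*listeners));
    if (listeners == NULL)
        return 0;

    for (size_t i = 0; i < count; i++) {
        /* A real server would bind the socket and hand it over to
         * its event loop here. */
        listeners[i].port = ports[i];
    }
    return count;
}
```

The body of the function is identical in both builds; only the storage of the pointer changes, which is why the conditional ends up altering the shape of the source more than a guarded free() call would.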

Anyway, where is the code? Here it is:

dnl for configure.ac

CC_ATTRIBUTE_DESTRUCTOR

AH_BOTTOM([#if !defined(NDEBUG) && defined(SUPPORT_ATTRIBUTE_DESTRUCTOR)
           # define CLEANUP_DESTRUCTOR __attribute__((__destructor__))
           #endif
          ])

(the CC_ATTRIBUTE_DESTRUCTOR macro is part of my personal series of additional macros to check compiler features, including attributes and flags).

And one example of code:

#ifdef CLEANUP_DESTRUCTOR
static void CLEANUP_DESTRUCTOR accesslog_uninit()
{
    size_t i;

    if ( feng_srv.config_storage )
        for(i = 0; i < feng_srv.config_context->used; i++)
            if ( !feng_srv.config_storage[i].access_log_syslog &&
                 feng_srv.config_storage[i].access_log_fp != NULL )
                fclose(feng_srv.config_storage[i].access_log_fp);
}
#endif

You can find the rest of the code over at the LScube GitHub repository; have fun!

8 thoughts on “Cleaning up after yourself”

  1. One place where C++ is naturally simpler and cleaner, and without depending on any compiler-specific features… Allocate your object on the stack or with a smart pointer, get out of scope… it’s done ! :-)

  2. Not really; what I’m doing here is implementing a basic singleton; in C++ you still either have the destructor called or you have to surround the destructor with #ifndef NDEBUG. The only thing you spare there is the check for the destructor attribute, simply because it’s already part of the C++ standard. But at what cost does that happen?

  3. GCC destructors are evil shit. People should really think twice or even thrice when using them. One of the most obvious dangers is that they are executed from the thread calling exit() at a time where other threads continue to run normally. That means that while you execute your destructing code some other thread might still access your data structures. In particular that means that you should never ever delete mutexes via gcc destructors, or anything they protect. Also, gcc destructors in libraries are called on unload, not on exit. Often enough this is a big difference with effect on control flow (think loadable modules, or normal libraries pulled in via loadable modules). In other words: gcc destructors don’t mix well with libs and not well with threads. If you use threads or write a library stay far away from them. Even better: stay away from them always. The only place where they are kinda ok to use is for debug/valgrind stuff, but even then you should make them nops if valgrind isn’t used. gcc destructors are evil, evil shit. Advocate against their use, don’t advocate them!

  4. Lennart, you attack the wrong target. Destructors are a very useful feature, which is part of the reason that C++ developers use them so much. Running destructors _at global scope_ has the problems you mention. For this reason, some (maybe many?) C++ “coding standards” documents advocate against placing non-trivial objects in global scope, so as to avoid the problems you describe. Running a destructor in local scope (e.g. std::auto_ptr) to clean up automatic variables can be a good alternative to using goto or having multiple exit paths that must free all resources taken so far. For cases where placing the destructor in global scope is unavoidable, you can mitigate the threading/library problems by requiring that, prior to triggering a call to _fini, application code calls a function that halts threads, ensures resources are not locked, and so on. Enforce this by having the halting function set a global variable indicating it has done this, and have the destructors assert that the variable is set. It is not perfect, but gives a chance that anyone who runs an assertions-enabled build will notice if an exit path failed to perform an orderly shutdown.

  5. In fact I was just talking about C++ destructors, which are native and standard. You, who defend GCC’s alternatives, should know how hard it can be to have a non-standard extension implemented on more than one compiler. So, I thought it was amusing to note that, rather than use C++, on THIS point, you prefer a non-standard GCC extension. About singletons, the only problem of C++ (and C, and probably all other compiled languages, by the way), is to ensure that your singleton creation is safely protected even during the first construction, without using heavy atomic synchronisation (double lock is not considered safe because of compiler optimisations, and THAT is a point where GCC may change things with -Ox, by the way). But you did not try to address or hide THIS one, right ? A singleton should in most cases be initialised and freed with the application, anyway… I, personally, do like C++, I just think it’s a monstrous and dangerous crap for who does not master or even master AND like it. I was amused to note that here, on a thing as simple as object destruction, it can be a better solution, and in a simpler and more natural way, than C. I understand you do not want to pay the price… Re-read my comment with a bit of humour ! :-)

  6. Yes, C++ has native and standard destructors, and that’s a positive note; the singleton destructors have, though, the same exact problems as the destructor functions I noted in my example and Lennart (rightly) despises. Indeed, the implementation of singleton destructors is tied to the .fini section that this method uses as well; they are more or less the same. Alas, C# does not seem to implement proper destructors either, so yes in all this, C++ has a practical advantage; on the other hand I don’t think that I’m “relying” on non-standard GCC extensions here: the destructor attribute is supported by any compiler that even thinks about supporting ELF (because of .fini): gcc, icc, clang, Sun Studio. Also, since it’s only debug paths that I’m using here, even if a few compilers failed to build it properly, I’d be fine. As Lennart said, destructors are mostly hacks that can be used safely _only_ for debug purposes, as any other purpose makes them harder to deal with than it’s worth.

  7. You go through this whole post without mentioning garbage collection. I find it dispiriting that people use languages like C and C++ because they are “efficient” (ie. in some very specific circumstances it’s possible to beat a GC by allocating memory manually). Yet those same people don’t even think about laying out blocks on the disk manually, instead leaving that up to the operating system. Yet using your own hand-written disk block code instead of the OS filesystem would be much more efficient in some narrow circumstances, right? Or using manual code overlays and manual swap placement instead of using the general swap system done by the operating system. Disk is much much slower than RAM so surely if you’re prepared to give up GC, giving up OS niceties like filesystems should be far more important to you. (For the sarcasm-impaired, I am being sarcastic to make a point).

  8. The problem is that GC itself can be quite a PITA, with most of the libraries I know because those languages are too low-level… Plus in this case it was really not a matter of garbage collecting.
