Parsing configuration files

One of the remaining big issues I have with feng, one of the areas we haven’t yet completely rewritten since I joined, is the code to parse its configuration file. I wrote about it not too long ago, but I hadn’t received enough comments to move me away from the status quo yet. But since we’re now trying our best to implement whole new features that will require new configuration file options, it’s time for the legacy code to be killed.

In the previous post I linked, I was considering the use of an ISC-style configuration file derived from the syntax used by dhcpd.conf; since then I decided to switch to the very similar bind-style syntax used by named.conf – it’s almost the same, with the difference that there are no “top-level” options, and that should make it easier to parse – and then I even tried to look into re-using the parsing code directly from bind.
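To give an idea of the direction, here is a purely illustrative sketch of what a bind-style configuration for feng could look like (not the final syntax, just its general shape):

socket {
    port 8554;
};

vhost "myhost.tld" {
    root "/var/lib/feng/myhost";
};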

Interestingly, the bind distribution contains an almost standalone configuration file parser, with a prebuilt lexer. Unfortunately, re-using that code directly was not an option: while it provides a clean, split interface, its backend code is very much entangled with the bind internals, including, but not limited to, their own memory allocators. Trying to reduce these dependencies would require as much work as creating something anew.

So the other night I decided I would find a way to implement the new parser… I spent the whole night, till 10am the morning after, looking for possible alternatives; the obvious choice would have been using the good old classic unix tools: lex & yacc… unfortunately the tutorials I could Google up, and the not-really-mine copy of the eponymous O’Reilly book, didn’t help me at all. After deciding that half the stuff I was reading was obsolete, I decided to tackle the problem from another direction.

John Levine – of Linkers & Loaders fame, a cornerstone for linker geeks like me – was one of the authors of the original lex & yacc book, whose last edition was published in 1992; he has since updated the book by writing flex & bison, describing the modern versions of the tools. Thanks to O’Reilly’s digital distribution, buying a copy of the book was both fast and cheap: with the 50%-off discount over two books I got both that and Being Geek – a book I had been curious about for a while, but didn’t care quite enough about to buy alone – for $20 total. Nice deal.

More importantly, not half an hour after downloading the PDF version of the book I was able to complete the basic parser… which makes it a totally worthwhile investment. Some of the problems I was facing were mostly due to the legacy of the old tools, and with modern documentation at hand, it became trivial to implement what I needed. On the other hand, it turned out to be a verbose and repetitive task, so once again I resorted to my “usual” Rube Goldberg machines by using XML and XSLT… (I really really have to find a different approach sooner or later).

The current code I’m working on has a single, short XML document that describes the various sections and entries in the configuration file, and a number of (not-too-long) stylesheets that produce the input for flex/bison, plus a header file with the C structures holding the options; on top of this there is a separate source file with pure-C callbacks to commit the parsed configuration data into those structures, presetting defaults where needed, and validating the semantics of the configuration.
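To give an idea of the shape of the thing, the description document looks something along these lines (a simplified sketch; the element and attribute names here are made up, not the actual schema):

<!-- hypothetical sketch of the configuration description -->
<configuration>
  <section name="socket" multiple="true">
    <option name="port" type="int" default="554"/>
    <option name="sctp" type="bool" default="false"/>
  </section>
</configuration>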

Right at this moment, feng has the new config parser merged in; together with the stuff I developed for it, I also posted patches for the Autoconf Archive to improve the checks for flex and bison themselves (rather than plain lex/yacc), and a few more fixes there. All in all, it seems to shave many kilobytes off the compiled object code size.

Protecting yourself from R-U-Dead-Yet attacks on Apache

Do you remember the infamous “slowloris” attack over HTTP webservers? Well, it turns out there is a new variant of the same technique that, rather than making the server wait for headers to arrive, makes the server wait for POST data before processing; it’s difficult to explain exactly how that works, so I’ll leave it to the expert explanation from ModSecurity.
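The gist of it, as far as I can summarise, is a request that declares a huge body and then trickles it in a few bytes at a time, tying up one worker per connection; something along these lines (host and form are obviously made up):

POST /form.php HTTP/1.1
Host: victim.example
Content-Type: application/x-www-form-urlencoded
Content-Length: 100000000

a=b… (a handful of bytes every few seconds, for as long as possible)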

Thankfully, since there was a lot of work done to cover up the slowloris attack, there are easy protections to be put in place, the first of which would be the use of mod_reqtimeout… unfortunately, it isn’t currently enabled by the Gentoo configuration of Apache – see bug #347227 – so the first step is to work around this limitation. Until the Gentoo Apache team appears again, you can do so simply by making use of the per-package environment hack, sort of what I’ve described in my nasty tricks post a few months ago.

# to be created as /etc/portage/env/www-servers/apache

export EXTRA_ECONF="${EXTRA_ECONF} --enable-reqtimeout=static"

*Do note that here I’m building it statically; this is because I’d suggest everybody build all the modules as static: the overhead of having them as plugins is usually higher than the cost of a built-in module that you don’t care about.*

Now that you have this set up, you should make sure to set a timeout for the requests; the mod_reqtimeout documentation is quite brief, but shows a number of possible configurations. I’d say that in most cases, what you want is simply the one shown in the ModSecurity examples. Please note that they made a mistake there: it’s not RequestReadyTimeout but RequestReadTimeout.

Additionally, when using ModSecurity you can stop the attack in its tracks after a few requests have timed out, by blacklisting the IP and dropping its connections, allowing slots to be freed for other requests to arrive; this can be easily configured through this snippet, taken directly from the above-linked post:

RequestReadTimeout body=30

SecRule RESPONSE_STATUS "@streq 408" "phase:5,t:none,nolog,pass,setvar:ip.slow_dos_counter=+1,expirevar:ip.slow_dos_counter=60"
SecRule IP:SLOW_DOS_COUNTER "@gt 5" "phase:1,t:none,log,drop,msg:'Client Connection Dropped due to high # of slow DoS alerts'"

This should let you cover yourself quite nicely, at least if you’re using hardened, with grsecurity enforcing per-user limits. But if you’re on hosting where you have no say over the kernel – as I do – there is one further problem: the init script for apache does not respect the system limits at all — see bug #347301.

The problem here is that when Apache is started during the standard system init, there are no limits set for the session it is running from, and since the init script doesn’t use start-stop-daemon to launch the apache process itself, no limits are applied at all. This results in a quite easy DoS over the whole host, as it will easily exhaust the system’s memory.

As I posted on the bug, there is a quick and dirty way to fix the situation by editing the init script itself, changing the way Apache is started up:

# Replace the following:
        ${APACHE2} ${APACHE2_OPTS} -k start

# With this

        start-stop-daemon --start --pidfile "${PIDFILE}" ${APACHE2} -- ${APACHE2_OPTS} -k start

This way at least the system-wide generic limits are applied properly. Do note, though, that due to start-stop-daemon limitations you will not be able to set per-user limits this way.

On a different note, I’d like to spend a few words on why this particular vulnerability is interesting to me: this attack relies on long-winded POST requests that might have a very low bandwidth, because just a few bytes are sent before the timeout is hit… it is not unlike the RTSP-in-HTTP tunnelling that I have designed and documented in feng during the past years.

This also means that application-level firewalls will sooner or later start filtering these long-winded requests, and that will likely put the final nail in the coffin of RTSP-in-HTTP tunnelling. I guess it’s definitely time for feng to move on and implement real HTTP-based pseudo-streaming instead.

Implementing the future! (or, how feng is designing secure RTSP)

While LScube is no longer – for me – a paid job project (since I haven’t been officially paid to work on it since May or so), it’s a project that intrigues me well enough to work on it in my free time, rather than, say, watch a movie. The reason is probably to be found in situations like those of the past few days, when Luca brings up a feature to implement that either no one else has implemented before, or that simply needs to be re-implemented from scratch.

Last time I had to come up with a solution to implement something this intricate was when we decided to implement Apple’s own HTTP tunnelling feature, whose only other documentation I know of, beside the original content from Apple, is the one I wrote for LScube itself.

This time, the challenge Luca showed me is quite a bit more stimulating, because it means breaking new ground that up to now seems to have been almost entirely missing: RTSP recording and secure RTSP.

The recording feature for RTSP has been somewhat specified by the original RFC, but it was left quite up in the air as to what it should have been used for. Darwin Streaming Server has an implementation, but beside the fact that we try not to look at it too closely (to avoid pollination), as far as we can tell we don’t agree with the minimal amount of security it provides. But let’s start from the top.

The way we need RECORD to work is to pass feng (the RTSP server) a live stream, which feng will then multiplex to the connected users. Up to now, this task was the care of a special protocol between flux (the adaptor) and feng, which also went through a few changes over time: originally it used UDP over localhost, but we had problems with packet loss (no I can’t even begin to wonder why) and in performance; nowadays it uses POSIX message queues, which are supposedly portable and quite fast.

This approach has, objectively, a number of drawbacks. First of all, the MQ stuff is definitely not a good idea to use: it’s barely implemented in Linux and FreeBSD, it’s not implemented on Darwin (and for a project whose aim is to replace Darwin Streaming Server this is quite a bad thing), and even though it does work on Linux, its performance isn’t exactly exciting. Also, using POSIX MQs is quite tricky: you cannot simply write an eventloop-based interface with libev because, even though they are descriptors in their implementation, an MQ descriptor does not support the standard poll() interface (nor any of the more advanced ones), so you end up with a thread blocking in reading… and timing out to make sure that flux wasn’t restarted in the mean time, since in that case a new pair of MQs would have been created.
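Just to make the pain concrete, here is a minimal sketch of the kind of loop this forces on you; the queue name and the hand-off are made up for the example, only the mq_* calls are the real POSIX API (link with -lrt on Linux):

#include <errno.h>
#include <fcntl.h>
#include <mqueue.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

static void *mq_reader(void *arg)
{
    const char *queue_name = arg; /* e.g. "/feng-flux", hypothetical */
    char buf[8192]; /* must be at least the queue's mq_msgsize */

    for (;;) {
        mqd_t mq = mq_open(queue_name, O_RDONLY);
        if (mq == (mqd_t)-1) {
            sleep(1); /* flux not up (yet?), try again */
            continue;
        }

        for (;;) {
            struct timespec ts;
            clock_gettime(CLOCK_REALTIME, &ts);
            ts.tv_sec += 1; /* time out so we notice a restarted flux */

            ssize_t n = mq_timedreceive(mq, buf, sizeof(buf), NULL, &ts);
            if (n >= 0) {
                /* hand the packet over to the multiplexer … */
            } else if (errno == ETIMEDOUT) {
                break; /* re-open: the queues may have been recreated */
            } else {
                break; /* hard error, re-open as well */
            }
        }

        mq_close(mq);
    }

    return NULL;
}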

This situation wasn’t made any friendlier by the fact that, even though the whole team is composed of fewer than five developers, flux and feng are written in two different languages, so the two sides of the protocol implemented over MQs are also entirely separate. And months after I dropped netembryo from feng, it is still used within flux, which is the one piece of the whole project that needed it the least, as it uses none of the SCTP, TCP and SSL abstractions, and the interfaces of the two couldn’t be further apart (flux is written in C++ if you didn’t guess, while netembryo is also written in C).

So it’s definitely time to get rid of the middleware and let the producer/adapter (ffmpeg) talk directly with the server/multiplexer (feng). This is where RECORD is necessary: ffmpeg will open a connection to feng via the same RTSP protocol used by the clients, but rather than asking to DESCRIBE a resource, it will ANNOUNCE one; then it will SETUP the server so that it opens the RTP/RTCP ports, and start feeding data to the server. Since this means creating new resources and making them available to third parties (the clients) – and eventually creating files on the server’s disk, if we proceed to the full extent of the plan – it is not something that you want just any client to be able to do: we want to make sure that it is authorised to do so — we definitely don’t want to pull a Diaspora anytime soon.
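In terms of the protocol itself, the exchange would look roughly like this (a sketch of the method sequence only; URLs, ports and session identifier are made up):

ANNOUNCE rtsp://feng.example/live.sdp RTSP/1.0
CSeq: 1
Content-Type: application/sdp
(SDP description of the stream follows)

SETUP rtsp://feng.example/live.sdp/trackID=0 RTSP/1.0
CSeq: 2
Transport: RTP/AVP;unicast;client_port=5000-5001;mode=record

RECORD rtsp://feng.example/live.sdp RTSP/1.0
CSeq: 3
Session: 12345678

(from here on, RTP packets flow from the producer to feng)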

Sequence Diagram of feng auth/record support

And in a situation like this, paying for a decent UML modeler shows itself to be a pretty good choice. A number of situations became apparent to me only when I actually sat down to think about the interaction, to draw the pretty diagrams that we should show to our funders.

Right now, though, we don’t even support authentication, so I went to look for it; RTSP was first designed to be an HTTP-compatible protocol, so the authentication is not defined in RFC 2326, but is rather left to be defined by HTTP itself; this means that you’ll find it in RFC 2616 instead — yes, it is counterintuitive, as that RFC number is higher; the reason is simple though: RFC 2616 obsoletes the previous HTTP description, RFC 2068, which is the one that RTSP actually refers to; we have no reason to refer to the old one knowing that a new one is out, so…

You might know already that HTTP defines at least two common authentication schemes: Basic (totally insecure) and Digest (probably just as insecure, but obfuscated enough). The former simply provides username and password encoded in base64 — do note: encoded, not encrypted; it is still passed in what can be called clear text, and is thus a totally bogus choice to take. On the other hand, Digest support provides enough features and variables to require a very complex analysis of the documentation before I can actually come up with a decent implementation. Since we wanted something more akin to a Proof-of-Concept, but also something safe enough to use in production, we had to find a middle ground.
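To make the weakness of Basic concrete: the whole “authentication” boils down to a single header like the following, whose value decodes, with one base64 call, to “Aladdin:open sesame” (this is the example used by the HTTP specification itself):

Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==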

At this point I had two possible roads in front of me. On one side, implementing SASL looked like a decent solution to have a safe authentication system while keeping the feng-proper complexity at bay; unfortunately, I still haven’t found any complete documentation on how SASL is supposed to be implemented as part of HTTP; the Wikipedia page I’m linking to also doesn’t list HTTP among the SASL-supported protocols — but I know it is supposed to work, both because Apache has a mod_authn_sasl and because, if I recall correctly, Microsoft provides ActiveDirectory-authenticated proxies, and again if my memory serves me correctly, ActiveDirectory uses SASL. The other road looked more complex at first, but it promised much with (relatively) little effort: implementing encrypted, secure RTSP, and requiring its use for the RECORD functionality.

While it might sound a bit of an overkill to require an encrypted secure connection to send data that has to be further streamed down to users, keep in mind that we’re talking only of encrypting the control channel; in most cases you’d be using RTP/RTCP over UDP to send the actual data, and the control stream can even be closed after the RECORD request is sent. We’re not yet talking about implementing secure RTP (which I think was already drafted up or even implemented), but of course if you were to use a secure RTSP session and interleaved binary data, you’d be getting encrypted content as well.

At any rate, the problem then became how to implement the secure RTSP connection; one option would have been using the oldest trick in the book and using two different ports, one for clear-text RTSP and another for the secure version, as is done for HTTP and HTTPS. But I also remembered that this approach has been frowned upon and deprecated for quite a bit. Protocols such as SMTP and IMAP now use TLS through a STARTTLS command that replaces a temporarily clear-text connection with an encrypted TLS one; I remembered reading about a similar approach for HTTP, and indeed RFC 2817 describes just that. Since RTSP and HTTP are so similar, there should be no difficulty in implementing the same idea over RTSP.
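As a sketch of how the RFC 2817 mechanism could map onto RTSP (the method and header values here are my own guess at it, not a specified behaviour):

C: OPTIONS * RTSP/1.0
C: CSeq: 1
C: Upgrade: TLS/1.0
C: Connection: Upgrade

S: RTSP/1.0 101 Switching Protocols
S: Upgrade: TLS/1.0
S: Connection: Upgrade

(TLS handshake follows; the session then continues, encrypted, over the same connection)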

Sequence Diagram of feng TLS/auth/RECORD support

Now in this situation, implementing secure ANNOUNCE support is definitely nicer: you just refuse to accept methods (in write-mode) for the resource unless you’re on a secure TLS connection; then you also ensure that the user authenticated before allowing access. Interestingly, here the Basic authentication, while still bad, is acceptable for a usable proof of concept, and in the future it can easily be extended to authenticate based on client-side certificates, without requiring 401 responses.

Once again, I have to say that having translated Software Engineering wasn’t so bad a thing; even though I usually consider UML just a decent way to produce pretty diagrams to show to the people with the gold purse, doing some modelling of the interaction between the producer and feng actually cleared up my mind enough to avoid a number of pitfalls, like doing authentication contextually to the upgrade to TLS, or forgetting about the need to provide two responses when doing the upgrade.

At any rate, for now I’ll leave you with the other two diagrams I have drawn but wouldn’t fit the post by themselves, they’ll end up on the official documentation — and if you wonder what “nq8” stands for, it’s the default extension generated by pngnq: Visual Paradigm produces 8-bit RGB files when exporting to PNG (and the SVG export produces files broken with RSVG, I have to report that to the developers), but I can cut in half the size of the files by quantising them to a palette.

I’m considering the idea of reverse-engineering part of the design of feng into UML models, and seeing whether that gives me more insight on how to optimise it further; if that works I should probably consider spending more time on UML in the future… too bad that reading O’Reilly PDF books on the reader is so bad, otherwise I would be reading through all of “Learning UML 2.0” rather than skimming through it when I need it.

Runtime vs buildtime

No, this time I’m not talking about failures, but about parsing.

For feng I’m now working on replacing the main configuration parser with something neater and possibly simpler. Unfortunately, as it happens, most of the parsers out there are based on flex and yacc’s concept of splitting lexing and parsing into two different stages. This is all fine and dandy when you have extensible syntax and all the frills, but I find it a bit cumbersome, as the work gets split both at runtime and at build time.

This is why for the smaller stuff we’ve been using Ragel: we can embed lexing and parsing in a single action, which is very fast (and thus very good for parsing the data that arrives from the client, which we have to answer right away); unfortunately, while generally flexible, Luca and I found that Ragel has its limits — one of these is with recursive definitions, which luckily we don’t need in the context of feng, but which I tried to use for Ruby-Elf to demangle C++ names.

On the other hand, I found before that Ragel tends to be quite flexible when used as an intermediate language; in feng, to properly parse request lines (and in particular the method name) I use a (bloated, admittedly) Goldberg Machine made of XML, XSLT and Ragel, that translates a simple list of protocols and correlated methods into C code that parses the request lines. This happens because Ragel has no backtracking, and I’d have to parse the same line twice to let it backtrack. An alternative would be validating the methods against the protocol, but that’s also a difficult thing to do… so for now this will do.

But already when I wanted to replace the custom “sd” format, which was implemented as a loop of fgets() and g_ascii_strcasecmp() calls (a definite waste, because it lower-cased the text on both sides every time!), Ragel didn’t seem like that much of a good choice. At the end I went with simply using the already-present glib parser for INI-like files, but that didn’t reduce the code as much as I wished, because I still had to look up the configuration lines by string in the loaded file structure.
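For reference, this is roughly what the lookup looks like with GLib’s key-file API; the group and key names here are invented for the example:

#include <glib.h>

/* sketch: load one "sd" description and pull a couple of values out */
static gboolean load_sd(const char *path, GError **error)
{
    GKeyFile *kf = g_key_file_new();
    gboolean res = FALSE;

    if (g_key_file_load_from_file(kf, path, G_KEY_FILE_NONE, error)) {
        gchar *encoder = g_key_file_get_string(kf, "stream", "encoder", NULL);
        gint payload = g_key_file_get_integer(kf, "stream", "payload_type", NULL);

        /* … copy the values into the real configuration structures … */

        g_free(encoder);
        (void)payload;
        res = TRUE;
    }

    g_key_file_free(kf);
    return res;
}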

So on one side the GLib parser is neat enough because it’s extensible: anything that it doesn’t know about, but that has valid syntax, is kept in memory, so that extended files can be used by older versions of feng without problems; on the other hand, this also means that if the file is huge, nonsense is also loaded into memory. While it does strip comments, it can’t strip unknown elements, because it doesn’t know which elements are valid.

On the other side, argument parsing is usually done the other way around: rather than letting the parser understand everything and then asking it whether the parameters you’re looking for are there, you write into the code a table describing the accepted parameters, and the parser compares each option against that table; depending on the format/library/concept used, it either sets variables around, or calls functions registered as callbacks to tell you that the option was encountered. Unrecognised options can either be ignored or cause an error.
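This is, for instance, how the table-driven approach looks with GLib’s own command-line option parser (just an illustration of the style, nothing feng-specific):

#include <glib.h>

static gint port = 554;
static gboolean use_syslog = FALSE;

/* the table: the parser walks it for us and fills in the variables */
static GOptionEntry entries[] = {
    { "port", 'p', 0, G_OPTION_ARG_INT, &port, "Port to listen on", "PORT" },
    { "syslog", 's', 0, G_OPTION_ARG_NONE, &use_syslog, "Log to syslog", NULL },
    { NULL }
};

int main(int argc, char **argv)
{
    GError *error = NULL;
    GOptionContext *ctx = g_option_context_new("- example");

    g_option_context_add_main_entries(ctx, entries, NULL);
    if (!g_option_context_parse(ctx, &argc, &argv, &error)) {
        g_printerr("option parsing failed: %s\n", error->message);
        return 1;
    }

    g_option_context_free(ctx);
    return 0;
}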

Special format parsers, such as most JSON parsers, seem to do something similar: they let you register callbacks that are called when specific tokens are detected… sincerely, the result is often clumsy to me, and the compiler has no way to understand what you’re doing. Others, such as json-c, prefer to load everything into memory, and then you’re on your own to walk through it; if it’s structured data… good luck.

The lemon parser generator that lighttpd (and thus feng, for good or worse) used seems to have led to a combined approach: on one side the configuration file is parsed and loaded into memory, and on the other side two long tables are used to scan through the parsed results; this N-pass method, though, is complex, and still does a lot more work than it should, given that I cannot even report an error to the user on a broken constraint with the line number of where the error was in the first place.

So I’m now considering writing a Ragel parser generator: you describe in some verbose way the format of the configuration file, define actions for what to do when a described configuration variable is found, add support for reporting an error to the user if there is a mistake in a configuration variable… this kind of thing. What I’m not sure of is the format.

Sincerely, out of experience, the configuration file format that makes the most sense for feng is the one used by ISC for DHCP and Bind: it provides for top-level global parameters, then named and unnamed sections. Luca’s original idea was to use the same syntax as lighttpd, to be able to reuse the files and share parts of them, but we never came to have as many features as that, and it is actually showing itself to be troublesome: the fact that you use comparisons to define vhosts and sockets makes it very difficult to deal with.

Even more so, experience shows me that we need to keep the listening interfaces (and their parameters) separate from the virtual hosts; that gets easier to do since at least RTSP mandates providing the full URL on requests, rather than just the local path. What I’m aiming for at this point would be something along these lines:

log level debug; # or 5
log syslog;

user feng;
group feng;

socket {
    port 8554;
}

socket {
    port 554;
    family all; # or ipv4 or ipv6
    sctp on;
    sctp streams 16;
}

vhost myhost.tld {
    root /var/lib/feng/myhost;
    log access syslog;
}

vhost yourhost.tld {
    root /var/lib/feng/yourhost;
    log access /var/log/feng/yourhost.tld;
}

Autotools Mythbuster: autoscan? autof—up

Beside the lack of up-to-date and sane documentation about autotools (for which I started my guide, which, you should remember, is still only extended in my free time), there is a second huge user experience problem: the skeleton build system produced by the autoscan script.

Now, I do understand why they created it; the problem is that, as it is, it mostly creates fubar’d configure.ac skeletons that confuse newcomers and cause a lot of grief to packagers and to users of source-based distributions (and those few who still think of building software manually without getting it from their distribution).

The problem with autoscan is that it embodies, again, the “GNU spirit”, or rather the GNU spirit of the original days, back when GNU tried to support any operating system, to “give freedom” to users forced to use those OSes rather than Linux itself. Given that nowadays the FSF seems to interest itself mostly in discouraging anybody from using non-free operating systems (or even non-GNU-based operating systems) – sometimes failing, and actually discouraging companies from using Free Software altogether – it seems like they had a change of ideas somewhere in the middle. But that’s something for another post.

Anyway, having to make your software work on any operating system out there is something you are quite unlikely to need. First of all, a number of projects nowadays, for good or bad, target Linux only; sometimes even just GNU/Linux (that is, they don’t support running on other C libraries), because they require specific features from the kernel, specific drivers and other requirements like that. Secondly, you can easily require your users to have a sane environment to begin with; unless you really have to run on a 15-year-old operating system, you can assume at least some basic standard support. I have already written about pointless autotools checks but I guess I didn’t make it clear enough yet.

But it’s not just the idea of just dropping support for anything that does not support a given standard, whatever that might be (C99, POSIX.1-2008, whatever), it’s more that the configure.ac generated by autoscan is not going to make it magically work on a number of operating systems it didn’t support before. What it does, for the most part, is adding a number of AC_CHECK_HEADERS and AC_CHECK_FUNCS calls, which will verify the presence of various functions and headers that your software is using… but it won’t change the software to provide alternatives; heck there might not be alternatives.

So if your software keeps on using strings.h (which is POSIX) and you check for it in configure phase, you’re just making the configure phase longer without any solution, because you’re not making use of the results from the configure phase. Again, this often translates to things like the following:

#ifdef HAVE_STRINGS_H
#include <strings.h>
#endif

Okay, so what is the problem with this idea? Well, to begin with, I have seen it so many times without an idea of why it is there! A number of people expect that since autoscan added the check, and thus they have the definition, they have to use it. But if you use a function that is defined in that header, and the header is not there, what are you going to do? Not including it is not going to make your software any more portable, if anything you’re going to get an implicit declaration of the function, and probably fail later at runtime. So, if it’s not an optional header, or function, just running the check and using the definition is not enough.

A common alternative is to fail the configure step if the header or function is not found; while that makes a bit more sense, I still dislike the option. Sure, you might be able to tell the user the reason why the function is needed, and whether they have to install something else or upgrade their system, but in truth that made much more sense when there was next to no common ground between operating systems, and users were the common people running the ./configure script. Nowadays, that’s a task that is often limited to packagers, who know their systems much better. The alternative to failing in configure is failing during build, and that’s generally not too bad, especially since you’ll be failing the build anyway for any condition you didn’t know about beforehand.
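For completeness, the fail-early variant looks like this; if you do go down this road, at least spell out in the error message why the requirement is there:

dnl in configure.ac: hard-fail when the header is missing
AC_CHECK_HEADER([strings.h],
  [],
  [AC_MSG_ERROR([strings.h not found; this package requires a POSIX-compatible C library])])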

I have another reason to provide, as for why you shouldn’t be running all those tests for things you don’t support a fallback for: autoconf provides means to pass external libraries and include directives to the compiler; since having each package provide its own replacement for common functions is going to cause a tremendous amount of code duplication (which in turn may cause a lot of work for packagers if one of them is broken; dtoa(), anybody remembers that?), I’m always surprised that there aren’t many more libraries that provide compatibility replacements for the functions missing in the system C library (gnulib does not count, as it solves the problem with code replication, if not duplication). Rather than failing, or trying to understand whether you can build or not depending on the OS used, just assume their presence if you can’t go without, and leave it to the developers running that system to come up with a fix, which might involve additional tests, or might not.

My suggestion here is thus to start by considering the operating systems you’re targeting directly, and trying to find what actually changes between them; in most cases, for instance, you might still have pieces of very old systems around, like the include of malloc.h, which is only useful if you want to call functions such as memalign(), but has not been needed for malloc() since, well, ever (stdlib.h is enough for that), and which will cause errors on both FreeBSD and OS X if included. So once you find that a header is not present on some of your desired operating systems, look up what replaces it, then make sure to check for it properly; that means using something like this:

dnl in configure.ac
AC_CHECK_HEADERS([stdint.h inttypes.h], [break;])

/* in your C code */
#if HAVE_STDINT_H
#  include <stdint.h>
#elif HAVE_INTTYPES_H
#  include <inttypes.h>
#endif

This way you won’t be running checks for a number of alternative headers on all systems: most modern C99-compatible system libraries will have stdint.h available, even though a few older systems will need inttypes.h to be discovered instead. This might sound cheap, since it’s just two headers, but especially when you’re looking for the correct place of a library header, you might end up with an alternative among three or four headers; add a bunch of alternatives here and there, and you’re going to have problems. The same trick can be used for functions; the description is also in my guide, and I’ll soon expand it to cover functions as well.

It shouldn’t “sound wrong” to have a configure.ac with next to no AC_CHECK_* calls at all! Autoconf will run most of the tests for you, and you have the ability to add more, but there is no need to strain at using them when they are unneeded. Take as an example the feng configure.ac — it has a few checks, of course, but they are limited to a few optional features that we work around in the code itself if missing. And some I would probably just remove (like making ipv6 support optional… I’d sincerely just make it work if it’s found on the system, as you still need to enable it in the configuration file to use it anyway).

And please, please, just don’t use autoscan, from now on!

Cleaning up after yourself

I noted in my post about debug information that in feng I’m using a debug codepath to help me reduce false positives in valgrind. When I wrote that, I looked up an older post that promised to explain the trick, but never did. A year afterwards, I guess it’s time for me to explain, and later possibly document this in the lscube documentation that I’ve been trying to maintain to describe the whole architecture.

The problem: valgrind is an important piece of equipment in the toolbox of a software developer; but like any other tool, it’s just a tool, in the sense that it’ll blindly report the facts, without caring about the intentions the person writing the code had at the time. Forgetting this leads to situations like Debian SA 1571 (the OpenSSL debacle), where an “uninitialised value” warning was taken as something wrong, where it was actually pretty much intended. At any rate, the problem here is slightly different, of course.

One of the most important reasons to be using valgrind is to find memory leaks: memory areas that are allocated but never freed properly. This kind of error can make software either unreliable or unusable in production, so testing for memory leaks is among the primary concerns of most seasoned developers. Unfortunately, as I said, valgrind doesn’t understand the intentions of the developer, and in this context it cannot discern between memory that leaks (or rather, that is “still reachable” when the program terminates) and memory that is being used until the program stops. Indeed, since the kernel will free all the memory allocated to the process when it ends, it’s common practice to simply leave it to the kernel to deallocate those structures that are important until the end of the program, such as configuration structures.

But the problem with leaving these structures around is that you either have to ignore the “still reachable” values (among which there might be some real leaks), or you receive a number of false positives introduced by this practice. To remove the false positives, it is not too uncommon to free the remaining memory areas before exiting, something like this:

extern myconf *conf;

int main() {
  conf = loadmyconf();
  process();
  freemyconf(conf);
}

The problem with having code written this way is that even just the calls to free the resources cause some overhead, and especially for small fire-and-forget programs, those simple calls can become a nuisance. Depending on the kind of data structures to free, they can actually take quite a bit of time to unwind in an orderly fashion. A common alternative solution is to guard the resource-free calls with a debug conditional, of the kind I have written about in the other post. Such a solution usually ends up being #ifndef NDEBUG, so that the same macro can get rid of both the assertions and the resource-free calls.
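Applied to the snippet above, that would look something like this (same hypothetical myconf functions as before):

extern myconf *conf;

int main() {
  conf = loadmyconf();
  process();
#ifndef NDEBUG
  /* only unwind in debug builds; in release builds the kernel
     reclaims everything at exit anyway */
  freemyconf(conf);
#endif
}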

This works out quite decently when you have simple, straight top-down software, but it doesn’t work so well when you have a complex (or chaotic, as you prefer) architecture like feng does. In feng, we have a number of structures that are only used by a range of functions, which are themselves constrained within a translation unit. They are, naturally, variables that you’d make static to the unit (or even static to the function, from time to time, but that’s just a matter of visibility to the compiler; function or unit static does not change a thing). Unfortunately, once they are static to the unit, you need an externally-visible function to properly free them up. While that is not excessively bad, it’s still going to require quite a bit of work to jump between the units, just to get some cleaner debug information.

My solution in feng is something I find much cleaner, even though I know some people might well disagree with me. To perform the orderly cleanup of the remaining data structures, rather than having uninit or finalize functions called at the end of main() (which would then require me to properly handle errors in sub-procedures so that they end up calling the finalisation from main()!), I rely on the presence of the destructor attribute in the compiler. Actually, I check with autoconf whether the compiler supports this not-too-uncommon feature, and if it does, and the user requested a debug build, I enable the “cleanup destructors”.

Cleanup destructors are simple unit-static functions that are declared with the destructor attribute; the compiler will set them up to be called as part of the _fini code, when the process is cleared up, and that covers both orderly return from main() and calls to exit(), which is just what I was looking for. Since the function is already within the translation unit, the variables don’t even need to be exported (and that helps the compiler, especially in the case when they are only used within a single function, or at least I sure hope so).

In one case the #ifdef conditional actually switches a variable from being stack-based to being static to the unit (which changes the final code of the project quite a bit), since the reference to the head of the array of listening sockets is only needed when iterating through them to set them up, or when freeing them; if we don’t free them (non-debug build) we don’t even need to save it.

Anyway, where is the code? Here it is:

dnl for configure.ac

CC_ATTRIBUTE_DESTRUCTOR

AH_BOTTOM([#if !defined(NDEBUG) && defined(SUPPORT_ATTRIBUTE_DESTRUCTOR)
           # define CLEANUP_DESTRUCTOR __attribute__((__destructor__))
           #endif
          ])

(the CC_ATTRIBUTE_DESTRUCTOR macro is part of my personal series of additional macros to check compiler features, including attributes and flags).

And one example of code:

#ifdef CLEANUP_DESTRUCTOR
static void CLEANUP_DESTRUCTOR accesslog_uninit()
{
    size_t i;

    if ( feng_srv.config_storage )
        for(i = 0; i < feng_srv.config_context->used; i++)
            if ( !feng_srv.config_storage[i].access_log_syslog &&
                 feng_srv.config_storage[i].access_log_fp != NULL )
                fclose(feng_srv.config_storage[i].access_log_fp);
}
#endif

You can find the rest of the code over to the LScube GitHub repository — have fun!

Wrong abstractions; bad abstractions

In the past weeks I’ve been working again on LScube, and in particular working toward improving the networking layer of feng. Up until now, we relied heavily on our own library (netembryo) to do almost all of the networking work. While this design looked nice at the start, we reached the point where:

  • the original design had feng with multiple threads to handle “background tasks”, and a manually-tailored eventloop, which required us to reinvent the wheel so much that it was definitely not funny; to solve this we’ve since moved to libev, but that requires us to access the file descriptor directly to set up the watcher;
  • we need to make sure that the UDP socket is ready to send data to before sending RTP and RTCP data, and to do so, we have to go down again on the file descriptor;
  • while the three protocols currently supported by feng for RTP transports (UDP, TCP and SCTP) were all abstracted by netembryo, we had to branch out depending on the protocol used way deep inside feng, as the workings of the three are very different (we had similarities between UDP and SCTP, and between SCTP and TCP interleaved, but they were not similar enough in any way!); this resulted in double-branching, which the compiler – even with LTO – will have a hard time optimising (since it depends on an attribute of the socket that we knew already);
  • the size of objects in memory was quite a bit bigger than it should have been, because we had to keep around all the needed information for all the protocols, and at the same time we had to allocate on the stack a huge number of identical objects (like the SCTP stream information providing the right channel to send the RTP and RTCP packets to);
  • we’ve been resolving the client’s IP address repeatedly, every time we wanted to connect the UDP sockets for RTP and RTCP (as well as resolving our own IP address);
  • we couldn’t get proper details about the errors in network operations, nor could we fine-tune those operations, because we were abstracting all of it away.

While I tried fixing this by giving netembryo a better interface, I ended up finding that it was much, much simpler to deal with the networking code within feng without the abstraction; indeed, in no case did we need to go the full path down to what netembryo used to do beforehand; we always ended up stopping short of that, skipping more than a couple of steps. For instance, the only place where we actually go through with address resolution is the main socket binding code, where we open the port we’ll be listening on. And even there, we don’t need the “flexibility” (faux-flexibility actually) that netembryo gave us before: we don’t need to re-bind an already open socket; we also don’t want to accept one out of a series of possible addresses, we want all of them or none (this is what helps us support IPv6, by the way).
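A minimal sketch of that “all of them or none” binding logic, for illustration (error handling trimmed down; everything except the libc calls is made up):

#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static int feng_bind_all(const char *host, const char *port)
{
    struct addrinfo hints, *res, *it;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE;      /* we are going to listen */

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    for (it = res; it != NULL; it = it->ai_next) {
        int fd = socket(it->ai_family, it->ai_socktype, it->ai_protocol);

        if (fd < 0 ||
            bind(fd, it->ai_addr, it->ai_addrlen) < 0 ||
            listen(fd, SOMAXCONN) < 0) {
            /* one failure means we refuse to start at all */
            if (fd >= 0)
                close(fd);
            freeaddrinfo(res);
            return -1;
        }

        /* register fd with the event loop … */
    }

    freeaddrinfo(res);
    return 0;
}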

The end result is not only a much, much smaller memory footprint (the Sock structure we used before was at least 440 bytes, while we now stay well below 100 bytes per socket), but also fewer dependencies (we integrate all the code within the feng project itself), less source code (which is always good, because it means fewer bugs), tighter handling of SCTP and interleaved transmission, and more optimised code after compilation. Talk about win-win situations.

Unfortunately not everything we’ve seen so far is positive: the fact that we have no independent implementation (yet) of SCTP makes it quite difficult to make sure that our code is not bugged in some funny way; and even during this (almost entire) rewrite of the network code, I was able to find a number of bugs, and a few strange situations that I’ll have to deal with right now, spending more than a couple of hours to make sure that it’s not going to break further on. It also shows that we need to integrate valgrind within our current white-box testing approach, to make sure that our functions don’t leak memory (I found a few leaks by running valgrind manually).

To be honest, most of the things I’ve been doing now are nothing especially difficult for people used to working with Unix networking APIs, but I most definitely am not an expert in those. I’m actually quite interested in the book, if it wasn’t that I cannot get it for the Reader easily. So if somebody feels like looking at the code and telling me what I can improve further, I’d be very happy.

One thing we most likely will have to pick up, though, is the SCTP userland code which right now has at least a few bugs regarding the build system, a couple of which we’re working around in the ebuild. So I won’t have time to be bored for a while still…

Gource visualising feng’s history — A story of Radeon

Or see this on YouTube — and yes, it is quite ironic that we’ve uploaded the visualised history of a streaming software stack to YouTube.

The video you can see here is the GIT history of feng, the RTSP streaming server Luca and I are working on for the LScube project, previously funded by the Turin Polytechnic, visualised through Gource.

While this video has some insights about feng itself, which I’ll discuss on the mailing list of the project soon enough, I’m using it to bring home another point, an even more important one I think. You probably remember my problems with the ATI HD4350 video card… well, one of the reasons why I didn’t post this video before, even though Gource has been in tree (thanks to Enrico) for a while already, is that it didn’t work too well on my system.

It might not be too obvious, but the way Gource works is by using SDL (and thus OpenGL) to render the visualisation to screen and to (PPM) images – the video is then produced by FFmpeg, which takes the sequence of PPM images and encodes it in H.264 with x264; no, I’m not going to do this with Theora – so you rely on your OpenGL support to produce good results. When 0.24 was being worked on (last January) the r700 Radeon driver, with KMS, had some trouble, and you’d see a full orange or purple frame from time to time, resulting in a not-too-appealing video. Yesterday I bit the bullet, and after dmesg showed me a request from the kernel to update my userland, I rebuilt the Radeon driver from GIT, and Mesa from the 7.8 branch…

Perfect!

No crashes, no artefacts on glxgears, and no artefacts on Gource either! As you can see from the video above. This is with kernel 2.6.33 vanilla, Mesa 7.8 GIT and Radeon GIT, all with KMS enabled (and the framebuffers work as well!). Kudos to Dave, and all the developers working on Radeon, this is what I call good Free Software!

Just do yourself a favour, and don’t buy video cards with fans… leaving aside nVidia’s screwup with the drivers, all of them failed on me at some point; passive cards instead seem to last much longer, probably because of the lack of moving parts.

Configuration file ordeal

Following another lscube problem discussion, here comes another post in the series “why do I prefer one solution over another”, which I started thinking about after I ripped the plugins interface out of the current feng master branch (to be developed separately and brought back to master once the interface is properly defined, chosen and so on).

While ripping out the plugins I had to deal with the configuration file parser that we have in feng right now, which was heavily borrowed from lighttpd. Luca’s idea was that, since the scopes of lighttpd and feng are quite similar, we could share the same configuration file syntax, and even code. Unfortunately, this ended up having a couple of problems: the first is that, as usual for Luca’s ideas, they look so far forward that we only implement a small percentage of it right now: vhosts, conditionals and all that stuff are things we have barely even started scratching; the other is that the code used for the parser has changed quite a bit in lighttpd in the mean time; I think the original one was hand-tailored, then they moved to lemon-based parsing (the parser used by SQLite3), and now they are toying with Ragel to rewrite the mess that it is now.

But why, in particular, is the current system a mess? Well, the first problem is not really in the syntax itself, but rather in the current (lemon-based, I think) implementation: it uses a big table, which is relatively fine, but then it also references its entries by their direct array index, which means any change in the list of entries requires changing all the following ones; not the nicest piece of code I’ve had to deal with.

The second problem is also intrinsic to the way the code was written, partly by hand: there are some objects implemented for the configuration file parsing, like high-level pointer arrays and strings, which reinvent what glib already provides. These objects increase the size and complexity of our code, with the only justification being that lighttpd does not use glib and thus has a need for them. Decoupling the configuration file code from these objects is non-trivial, and would diverge the code to the point we wouldn’t be able to share it with lighttpd any longer.

Now, there is a certain complexity tied to the way we want the configuration file to behave: we want to have conditionals and blocks, and includes, similar to lighttpd and Apache configuration files, like most software that deals with servers and locations. The simplest configuration file formats don’t work very well for this task (for instance, I don’t like the way Cherokee configuration files are written; they are designed to work with a web administration package, rather than to be hand-edited by a human). This kind of problem usually gets solved in two main ways: either by inventing your own syntax, or by using something like XML for the configuration file.

While XML makes it relatively easy to parse a configuration file (relatively, because it’s just moving the complexity around: parsing XML is not usually faster than other configuration file syntaxes, it’s actually slower and much more complex, but generic XML parsers exist already that make the parsing easier for the final software), it definitely makes it obnoxious to read and edit by hand, which is why I don’t think XML is a good format for configuration files.

My favourite format for configuration file is the “good old” INI-like file format; I call it INI-like but I admit I’m not sure if Windows’s INI files were the ones using it first; on the other hand, the same format is nowadays used by so much stuff, including freedesktop’s .desktop files, and there are multiple well-tested parsers for it, including some (very limited, to be honest) support in glib itself.

Now, it’s obviously a no-brainer that the INI format is nowhere near as powerful as the language used by Apache configuration files, or by lighttpd, since it bears no conditionals and only has structure in the sense of sections, variables and values. But it’s exactly in this kind of simplicity that I think it might work well for feng nonetheless, given it works for git as well.

While the lighttpd syntax contains all kind of conditionals that allow setting particular limits based on virtual host, location, and generic headers, it’s quite likely that for feng we can reduce this to a tighter subset of conditions.

What we already have some small predisposition for is per-virtualhost settings, which are handled with an array (right now a single entry) of specific configuration structures, each of which is initialised first with the default values of the configuration and then overridden with the loaded configuration, plus support for some global, non-replaceable settings. How does that fare with an INI-like configuration file? Well, to me it’s quite easy to think of it in these terms:

[global]
user = feng
group = feng
logdir = /var/log/feng
errorlog = error.log
sctp = on
listen = 0.0.0.0:554 localhost:8554

[default]
accesslog = access.log

[localhost]
accesslog =
accesslog_syslog = on

[127.0.0.1]
alias = localhost

[my.vhost.tld]
accesslog = my_access.log

[his.vhost.tld]
accesslog = /var/log/feng/another_access.log

This obviously is just a quick draft of how it could be achieved, and does not mean that it should be achieved this way at all; it’s just to say that maybe we should be looking into some alternative: even though keeping syntax-compatibility with lighttpd is a nice feature, it shouldn’t justify the extra complexity we’re currently seeing!

Plugins aren’t always a good choice

I’ve been saying this for quite a while; probably the most on-topic post was written a few months ago, but there are some indications about it in posts about xine and others again.

I used to be an enthusiast about plugin interfaces; with time, though, I started having more and more doubts about their actual usefulness — it’s a trait I very much like in myself, being fine with reconsidering my own positions over time, deciding that I was wrong; it happened before with other things, like KDE (and C++ in general).

It’s not like I’m totally against the use of plugins altogether. I only think that they are expensive in more ways than one, and that their usefulness is often overstated, or tied to other kinds of artificial limitations. For instance, dividing a software’s features over multiple plugins usually makes it easier for binary distributions to package them: they only have to ship a package with the main body of the software, and many for the plugins (one per plugin might actually be too much, so sometimes they get grouped). This works out pretty well for both the distribution and, usually, the user: the plugins that are not installed will not bring in extra dependencies, won’t take time to load, and won’t use memory for either code or data. It basically allows binary distributions to have flexibility comparable to Gentoo’s USE flags (and similar options in almost any other source-based distribution).

But as I said, this comes with costs that might or might not be worth it in general. For instance, Luca wanted to implement plugins for feng similarly to what Apache and lighttpd have. I can understand his point: let’s not load code for the stuff we don’t have to deal with, which is more or less the same reason why Apache and lighttpd have modules; in the case of feng, if you don’t care about the access log, why should you be loading the access log support at all? I can give you a couple of reasons:

  • because the complexity of managing a plugin to deal with the access log (or any other similar task) is higher than just having a piece of static code that handles that;
  • because the overhead of having a plugin loaded just to do that is higher than that of having the static code built in but not enabled in the configuration.

The first problem is a result of the way a plugin interface is built: the main body of the software cannot know about its plugins in too specific a way. If the interface is a very generic plugin interface, you add some “hook locations”, and then it’s the plugin’s task to find out how to do its magic, not the software’s. There are some exceptions to this rule: if you have a plugin interface for handling protocols, like the KIO interface (and I think gvfs has the same), you get the protocol from the URL and call the correct plugin, but even then you’re leaving it to the plugin to do its magic. You can provide a way for the plugin to tell the main body what it needs and what it can do (like which functions it implements), but even that requires the plugins to be quite autonomous. And that means also being able to take care of allocating and freeing the resources as needed.
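Here is a bare-bones sketch of what such a generic interface ends up looking like; the descriptor layout and symbol name are made up for the example, only dlopen()/dlsym() are the real API (link with -ldl):

#include <dlfcn.h>
#include <stdio.h>

/* hypothetical descriptor that every plugin is expected to export */
typedef struct {
    const char *name;
    int (*init)(void);             /* the plugin sets up its own resources */
    void (*cleanup)(void);         /* … and has to tear them down itself */
    void (*on_request)(void *ctx); /* one of the "hook locations" */
} feng_plugin;

static feng_plugin *load_plugin(const char *path)
{
    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return NULL;
    }

    /* the main body knows nothing about the plugin beyond this symbol */
    feng_plugin *desc = dlsym(handle, "plugin_descriptor");
    if (desc == NULL || desc->init() != 0) {
        dlclose(handle);
        return NULL;
    }

    return desc;
}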

The second problem is not only tied to the cost of invoking the dynamic linker at runtime to load the plugin and its eventual dependencies (which is a non-trivial amount of work, one has to say), but also to the need for code that finds the modules to load, loads them, initialises them, and keeps a list of modules to call at any given interface point; and then two more points: the PIC problem and the problem of less-than-page-sized segments. This last problem is often ignored, but it’s my main reason to dislike plugins when they are not warranted for other reasons. Given a page size of 4KiB (which is the norm on Linux as far as I know), if the code is smaller than that size, it’ll still require a full page (it won’t pack with the rest of the software’s code areas); but at least code is disk-backed (if it’s PIC, of course); it’s worse for what concerns variable data, or variable relocated data, since those are not disk-backed, and it’s not rare that you’d be using a whole page for something like 100 bytes of actual variables.

In the case of the access log module that Luca wrote for feng, the statistics are as such:

flame@yamato feng % size modules/.libs/mod_accesslog.so
   text    data     bss     dec     hex filename
   4792     704      16    5512    1588 modules/.libs/mod_accesslog.so

Which results in two pages (8KiB) for the bss and data segments, neither disk-backed, and two disk-backed pages for the executable code (text): 16KiB of addressable memory for a mapping that does not reach 6KiB. That’s over 10KiB of overhead, well above 50%. And that’s the memory overhead alone. The whole overhead, as you might guess at this point, is usually within 12KiB (since you have three segments, and each can have at most one byte less than the page size as overhead — it’s actually more complex than this, but let’s assume it is so).

It really doesn’t sound like a huge overhead by itself, but you always have to judge it against the size of the plugin itself. In the case of feng’s access log, you have a very young plugin that lacks a lot of functionality, so one might say that with time it’ll be worth it… which is why I’d like to show you the size statistics for the Apache modules on the very server my blog is hosted on. Before doing so, though, I have to point out one huge difference: feng is built with most optimisations turned off, while Apache is built optimised for size; they are both AMD64 though, so the comparison is quite easy.

flame@vanguard ~ $ size /usr/lib64/apache2/modules/*.so | sort -n -k 4
   text    data     bss     dec     hex filename
   2529     792      16    3337     d09 /usr/lib64/apache2/modules/mod_authn_default.so
   2960     808      16    3784     ec8 /usr/lib64/apache2/modules/mod_authz_user.so
   3499     856      16    4371    1113 /usr/lib64/apache2/modules/mod_authn_file.so
   3617     912      16    4545    11c1 /usr/lib64/apache2/modules/mod_env.so
   3773     808      24    4605    11fd /usr/lib64/apache2/modules/mod_logio.so
   4035     888      16    4939    134b /usr/lib64/apache2/modules/mod_dir.so
   4161     752      80    4993    1381 /usr/lib64/apache2/modules/mod_unique_id.so
   4136     888      16    5040    13b0 /usr/lib64/apache2/modules/mod_actions.so
   5129     952      24    6105    17d9 /usr/lib64/apache2/modules/mod_authz_host.so
   6589    1056      16    7661    1ded /usr/lib64/apache2/modules/mod_file_cache.so
   6826    1024      16    7866    1eba /usr/lib64/apache2/modules/mod_expires.so
   7367    1040      16    8423    20e7 /usr/lib64/apache2/modules/mod_setenvif.so
   7519    1064      16    8599    2197 /usr/lib64/apache2/modules/mod_speling.so
   8583    1240      16    9839    266f /usr/lib64/apache2/modules/mod_alias.so
  11006    1168      16   12190    2f9e /usr/lib64/apache2/modules/mod_filter.so
  12269    1184      32   13485    34ad /usr/lib64/apache2/modules/mod_headers.so
  12521    1672      24   14217    3789 /usr/lib64/apache2/modules/mod_mime.so
  15935    1312      16   17263    436f /usr/lib64/apache2/modules/mod_deflate.so
  18150    1392     224   19766    4d36 /usr/lib64/apache2/modules/mod_log_config.so
  18358    2040      16   20414    4fbe /usr/lib64/apache2/modules/mod_mime_magic.so
  18996    1544      48   20588    506c /usr/lib64/apache2/modules/mod_cgi.so
  20406    1592      32   22030    560e /usr/lib64/apache2/modules/mod_mem_cache.so
  22593    1504     152   24249    5eb9 /usr/lib64/apache2/modules/mod_auth_digest.so
  26494    1376      16   27886    6cee /usr/lib64/apache2/modules/mod_negotiation.so
  27576    1800      64   29440    7300 /usr/lib64/apache2/modules/mod_cache.so
  54299    2096      80   56475    dc9b /usr/lib64/apache2/modules/mod_rewrite.so
 268867   13152      80  282099   44df3 /usr/lib64/apache2/modules/mod_security2.so
 288868   11520     280  300668   4967c /usr/lib64/apache2/modules/mod_passenger.so

The list is ordered by the size of the whole plugin (summed up, not counting padding); the last three positions are definitely unsurprising, although the sheer size of the two that are not part of Apache itself surprises me (and I start to wonder whether they statically link something in that I missed). The fact that the rewrite module is likely the most complex plugin in Apache’s own distribution never escaped me.

As you can see, almost all plugins have vast overhead, especially for what concerns the bss segment (all of them have at least 16 bytes used, and that warrants a whole page for them: 4080 bytes wasted each); the data segment is also interesting: only the two external ones have more than a page worth of variables (which is also suspicious to me). When all the plugins are loaded (as they most likely are right now on my server) there are at least 100KiB of overhead, just for the sheer fact that these are plugins and thus get their own set of pages. It might not sound like a lot of overhead, since Apache is requesting so much memory already, especially with Passenger running, but it definitely doesn’t sound like a good thing for embedded systems.

Now, I have no doubt that a lot of people like the fact that Apache has all of those as plugins, as they can then use the same Apache build across different configurations without risking having in memory more code and data than is actually needed; but is that right? While it’s obvious that it would be impossible to drop the plugin interface from Apache (since it’s used by third-party developers, more on that later), I would be glad if it were possible to build in the modules that come with Apache (given I can already choose which ones to build or not in Gentoo). Of course I also use Apache with two configurations, and for instance the other one does not use the authentication system for anything, and this one is not using CGI; but is the overhead caused by the rest of the modules worth the hassle, given that Apache already has a way to not initialise the unused built-ins?

I wrote “third party developers” above, but I have to say now that it wasn’t really a proper definition: it’s not just what third parties would do; it might very well be the original developers who want to make use of plugins to develop separate projects for some (complex) features, and have different release handling altogether. For uses like that, the cost of plugins is often justifiable; and I am definitely not against having a plugin interface in feng. My main beef is when plugins are created for functions that are part of the basic featureset of a software.

Another unfortunately not uncommon problem with plugins is that the interface might be skewed by bad design, like the case was (and is) for xine: when trying to open a file, it has to pass through all the plugins, so it loads all of them into memory, together with the libraries they depend on, to ask each of them to test the current file; since plugins cannot really be properly unloaded (and it’s not just a xine limitation) the memory will still be used, the libraries will still be mapped into memory (and relocated, causing copy on write, and thus, more memory) and at least half the point of using plugins has gone away (the ability to only load the code that is actually going to be used). Of course you’re left with the chance that an ABI break does not kill the whole program, but just the plugin, but that’s a very little advantage, given the cost involved in plugins handling. And the way xine was designed, it was definitely impossible to have third-party plugins developed properly.

And to finish off, I said before that plugins cannot be cleanly unloaded: the problem is not only that it’s difficult to have proper cleanup functions for plugins themselves (since often the allocated resources are stored within state variables), but also because some libraries (used as dependency) have no cleanup altogether, and they rely (erroneously) on the fact that they won’t be unloaded. And even when they know they could be unloaded, the PulseAudio libraries, for instance, have to remain loaded because there is no proper way to clean up Thread-Local Storage variables (and a re-load would be quite a problem). Which drives away another point of using plugins.

I leave the rest to you.