Runtime vs buildtime

No, this time I’m not talking about failures, but about parsing.

For feng I’m now working on replacing the main configuration parser with something neater and possibly simpler. Unfortunately, as it happens, most of the parsers out there are based on the flex and yacc concept of splitting lexing and parsing into two different functions. This is all fine and dandy when you have an extensible syntax and all the frills, but I find it a bit cumbersome, as it splits the work between runtime and build time.

This is why for the smaller stuff we’ve been using Ragel: we can embed lexing and parsing into a single step, which is very fast (and thus very good for parsing the data that arrives from the client, which we have to answer right away); unfortunately, while generally flexible, Luca and I found that Ragel has its limits — one of these is recursive definitions, which luckily we don’t need in the context of feng, but which I tried to use for Ruby-Elf to demangle C++ names.

On the other hand, I found before that Ragel tends to be quite flexible when used as an intermediate language; in feng, to properly parse request lines (and in particular the method name), I use an (admittedly bloated) Rube Goldberg machine made of XML, XSLT and Ragel, which translates a simple list of protocols and related methods into C code that parses the request lines. This happens because Ragel has no backtracking, and I’d have to parse the same line twice to let it backtrack. An alternative would be validating the methods against the protocol, but that’s also a difficult thing to do… so for now this will do.

But already when I wanted to replace the custom “sd” format, which was implemented as a loop of fgets() and g_ascii_strcasecmp() calls (a definite waste, because it lower-cases the text on both sides every single time!), Ragel didn’t seem like that much of a good choice. In the end I went with simply using the already-present GLib parser for INI-like files, but that didn’t reduce the code as much as I wished, because I still had to look up the configuration entries by string in the loaded file structure.
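For reference, a minimal sketch of what that GLib approach looks like; the “stream” group and “title” key here are made up for illustration, this is not the actual feng code:

#include <glib.h>

static int load_sd_file(const char *path)
{
    GError *error = NULL;
    GKeyFile *kf = g_key_file_new();
    gchar *title;

    if (!g_key_file_load_from_file(kf, path, G_KEY_FILE_NONE, &error)) {
        g_printerr("cannot parse %s: %s\n", path, error->message);
        g_error_free(error);
        g_key_file_free(kf);
        return -1;
    }

    /* every lookup still goes through string keys on the loaded structure */
    title = g_key_file_get_string(kf, "stream", "title", NULL);

    /* ... use title ... */

    g_free(title);
    g_key_file_free(kf);
    return 0;
}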

So on one side the GLib parser is neat enough because it’s extensible: anything that is not known, but has valid syntax, is kept in memory, so that extended files can be used by older versions of feng without problems; on the other hand, this also means that if the file is huge, nonsense is also loaded into memory. While it does strip comments, it doesn’t strip unknown elements, because it doesn’t know which elements are valid.

On the other side, argument parsing is usually done the other way around: rather than letting the parser understand everything and then asking it whether the parameters you’re looking for are there, you write into the code a table describing the accepted parameters, and the parser compares each option against that table; depending on the format/library/concept used, it either sets variables around or calls functions registered as callbacks to tell you that the option was encountered. Unrecognised options can either be ignored or will cause an error.
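As a concrete illustration of that table-driven style (not something feng itself does for its configuration), GLib’s own GOption API works along these lines; the options themselves are invented:

#include <glib.h>

static gint port = 8554;
static gboolean verbose = FALSE;

/* the table describing the accepted options; the parser compares each
 * command-line argument against it and sets the variables directly */
static GOptionEntry entries[] = {
    { "port",    'p', 0, G_OPTION_ARG_INT,  &port,    "Port to listen on", "PORT" },
    { "verbose", 'v', 0, G_OPTION_ARG_NONE, &verbose, "Enable verbose output", NULL },
    { NULL }
};

int main(int argc, char **argv)
{
    GError *error = NULL;
    GOptionContext *ctx = g_option_context_new("- example");

    g_option_context_add_main_entries(ctx, entries, NULL);

    /* unrecognised options cause an error here, unless told otherwise */
    if (!g_option_context_parse(ctx, &argc, &argv, &error)) {
        g_printerr("option parsing failed: %s\n", error->message);
        return 1;
    }

    g_option_context_free(ctx);
    return 0;
}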

Special format parsers, such as most JSON parsers, seem to do something similar: they let you register callbacks that are called when specific tokens are detected… the result, frankly, is often clumsy to me, and the compiler has no way to understand what you’re doing. Others, such as json-c, prefer to load everything into memory, and then you’re on your own to walk through it; if it’s structured data… good luck.

The lemon parser generator that lighttpd (and thus feng, for good or worse) used seems to take a combined approach: on one side the configuration file is parsed and loaded into memory, and on the other side two long tables are used to scan through the parsed results. This N-pass method, though, is complex, and still does a lot more work than it should, because I cannot report an error to the user about a broken constraint with the line number of where the error was in the first place.

So I’m now considering writing a Ragel parser generator: you describe in some verbose way the format of the configuration file, define actions for what to do when a described configuration variable is found, add support for reporting an error to the user if there is a mistake in a configuration variable… this kind of thing. What I’m not sure of is the format.

Honestly, from experience, the configuration file format that makes the most sense for feng is the one used by ISC for DHCP and BIND: it provides top-level global parameters, then named and unnamed sections. Luca’s original idea was to use the same syntax as lighttpd, to be able to reuse the files and share parts of them, but we never came to have as many features as that, and it is actually proving troublesome; the fact that you use comparisons to define vhosts and sockets makes it very difficult to deal with.

Even more so, experience shows me that we need to keep the listening interfaces (and their parameters) separate from the virtual hosts; this gets easier since RTSP at least mandates providing the full URL in requests rather than just the local path. What I’m aiming at, at this point, would be something along these lines:

log level debug; # or 5
log syslog;

user feng;
group feng;

socket {
    port 8554;
}

socket {
    port 554;
    family all; # or ipv4 or ipv6
    sctp on;
    sctp streams 16;
}

vhost myhost.tld {
    root /var/lib/feng/myhost;
    log access syslog;
}

vhost yourhost.tld {
    root /var/lib/feng/yourhost;
    log access /var/log/feng/yourhost.tld;
}
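To give a rough idea of what “describing the variables and defining actions” could mean on the C side, here is a purely hypothetical sketch; none of these names exist in feng, and the Ragel-generated parser would be the one walking the file and calling the handlers:

#include <stdio.h>
#include <stdlib.h>

/* each entry names a configuration variable, its type, and a handler
 * called when the variable is parsed, with the line number available
 * so that errors can be reported where they actually happen */
typedef enum { CFG_STRING, CFG_INT, CFG_BOOL } cfg_type;

typedef struct {
    const char *name;
    cfg_type type;
    int (*found)(void *section, const char *value, int lineno);
} cfg_entry;

static int socket_set_port(void *section, const char *value, int lineno)
{
    int port = atoi(value);
    (void)section;

    if (port <= 0 || port > 65535) {
        fprintf(stderr, "line %d: invalid port '%s'\n", lineno, value);
        return -1;
    }
    /* ... store the port in the socket section ... */
    return 0;
}

static const cfg_entry socket_entries[] = {
    { "port", CFG_INT,    socket_set_port },
    { NULL,   CFG_STRING, NULL }
};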

Parsers, tokenizers and state machines

You might remember that I have been having fun with Ragel in the past, since Luca proposed using it for LScube projects (which, as far as I’m concerned, reduced to using it on feng). The whole idea sounded overly complicated in the first place, but then we were able to get most, if not yet all, kinks nailed.

Right now, feng uses Ragel to parse the combined RTSP/HTTP request line (with a homebrew backtracking parser, which became the most difficult feature to implement), to split the headers into a manageable hash table (converting them from strings into enumerated constants at the same time, another kinky task), to convert both the Range and Transport headers for RTSP into lists of structures that can be handled in the code directly, and to split and validate URLs provided by the clients.

While the Range, Transport and URL parsing is done directly with Ragel, and without an excessive amount of blunt force, both the parsing of the request line and the tokenizing of header names are “helped” by a huge, complex and absolutely ludicrous (if I may say so) Rube Goldberg machine using XML and XSLT to produce Ragel files, which then produce C code, which is finally compiled. You might note here that we have at least three sources and four formats: XML → Ragel → C → relocatable object.

The one thing that we aren’t doing with Ragel right now is the configuration file parser. Originally, Luca intended the configuration file for feng to be compatible with the lighttpd one, in particular allowing bits and pieces to be shared between the two. As it happens, feng really needs much less configuration, right now, than lighttpd ever did, and the only conditional we support is based on the $SOCKET variable, to produce something akin to vhost support. The lighttpd compatibility here takes a huge toll on the size of the code needed to actually parse the file, as well as wasting time and space parsing it into a tree of tokens and then running through them to set the parameters. And this is without getting into the problems caused by actually wanting to use the “lemon” tool.

Unfortunately, neither replacing the parser (for the same format) with Ragel, nor inventing a new format (keeping the same features) to parse with Ragel, looks appealing. The problem is that Ragel provides a very low-level interface for the parsers, and does not suit the kind of parsing a configuration file needs very well. There is a higher-level parser generator based upon Ragel, by the same author, called Kelbt, but it’s barely documented, and seems to only produce C++ code (and you probably already know how much I dislike that).

I definitely don’t intend to write my own Ragel-based parser generator like I did for RTSP (and HTTP); I also don’t want to lose support for “variables” (since many paths are created by appending sub-directories to root paths), so I don’t think switching to INI itself would be a good idea. I considered using the C preprocessor to convert a user-editable key-value pair file into a much simpler one (without conditionals, vhosts and variables) and then parse that, but it’s still a nasty approach, as it requires a further pass after editing and depends on the presence of a C preprocessor anyway.

I also don’t intend to use XML for the configuration file! While I have used XML in the above-mentioned Goldberg machine, it smells funny to me, as it does to anyone else. Even though I find the people who judge anything using XML as “crap” delusional, I don’t think XML is designed for user-editable configuration files (and I seriously abhor non-user-editable configuration files, such as those provided by Cherokee, to name one).

Mauro, some time ago, suggested that I look into Antlr3; unfortunately, at the time I had to stop because of the lack of documentation on how to do stuff with it. I already had enough trouble getting proper information out of the Ragel documentation (which reads like the thesis it is). Checking Wikipedia earlier for a couple of documentation links, it seems like the only source of good documentation is a book; my time to read another technical book right now is a bit limited, and spending €20 just to get the chance to decide whether I want Antlr or not doesn’t look appealing either.

Anyway, setting aside the Antlr option for a moment, Luca prodded me about one feature I overlooked in Ragel, maybe because it was introduced after I started working on implementing all this: scanners. The idea of a scanner is that it reduces the work needed to properly assign the priorities to the transitions so that the longest-matching pattern executes the final code. This would be nice, wouldn’t it? Maybe it would even allow us to drop the Goldberg machine, since the original reason to have it was, indeed, the need to get those priorities right without having to change all the code repeatedly until it worked properly.

Unfortunately, to use Ragel scanners, I first had to change the Goldberg machine, splitting it further into another file, since scanners can only be called into, rather than included in, the state machine (and having the machine defined in the same file seems to trigger the need to declare all the possible variables at the same time, which is definitely not something I want to do for every state machine). This was bad enough (making the sources more disorganised, and the build system more complex), but after I was able to build the final binary, I also found an increase of over 3KB of executable code, without optimisations turned on (and when using state tables, rather than goto statements, an increase in code size is a very bad sign).

Now it is quite possible that the increase in code size gets us a speed-up in parsing (I sincerely doubt it, though), and maybe using alternative state machine types (such as the above-mentioned goto-based one) will provide better results still. Unfortunately, assessing this gets tricky. I had already tried writing some sort of benchmark of possible tokenizers, mostly because I fear that gperf, its name notwithstanding, is far from perfect and produces suboptimal code in too many cases (and gperf is used by many, many projects, even at a basic system level). The problem with that is that I suck at both benchmarking and profiling, so that’s where it all started.

Anyway, comments are open: if you have any ideas on how I can produce the benchmark, or suggestions about the configuration file parsing and so on, please let me know. Oh, and there are actually two further custom file formats (domain-specific languages to all effects) that are currently parsed in the most suboptimal of ways (strcmp()), and we would need at least one more. If we could get one high-level, but still efficient (very efficient, if possible) parser generator, we would definitely gain a lot in terms of managing the sources.

HTTP-like protocols have one huge defect

So you might or might not remember that my main paid job in the past months (and right now as well) has been working on feng, the RTSP server component of the lscube stack.

The RTSP protocol is based on HTTP, and indeed uses the same message format as defined by RFC 822 (the same used for email messages), and a request line “compatible” with HTTP.

Now, it’s interesting to know that this similarity between the two has been used, among other things, by Apple to implement the so-called HTTP tunnelling (see the QuickTime Streaming Server manual, Chapter 1 Concepts, section Tunneling RTSP and RTP Over HTTP, for the full description of that procedure). This feature allows clients behind standard HTTP proxies to access the stream, creating a virtual full-duplex communication between the two. Pretty neat stuff, even though Apple recently superseded it with the pure HTTP streaming that is implemented in QuickTime X.

For LScube we want to implement at the very least this feature, both server and client side, so that we can get on par with the QuickTime features (implementing the new HTTP-based streaming is part of the long-haul TODO, but that’s beside the point now). To do that, our parser has to be able to accept HTTP requests and deal with them appropriately. For this reason, I’ve been working on replacing the RTSP-specific parser with a more generic parser that accepts both HTTP and RTSP. Unfortunately, this turned out not to be a very easy task.

The main problem is that we wanted to make the fewest possible passes over the request line to get the data out; when we only supported RTSP/1.0 this was trivial: we knew exactly which methods were supported, which ones appeared valid but weren’t supported (like RECORD) and which ones were simply invalid to begin with, so we set the value for the method as we passed over it and then moved on to check the protocol. If the protocol was not valid, we didn’t care about the method anyway; at worst we had to pass through a series of states for no good reason, but that wasn’t especially bad.

With the introduction of a simultaneous HTTP parser, the situation became much more complex: the methods are parsed right away, but the two protocols have different methods: the GET method that is supported for HTTP is a valid but unsupported method for RTSP, and vice versa when it comes to the PLAY method. The actions that handled the result of parsing the method for the two protocols would end up executing simultaneously if we used a simple union of the state machines, and that, quite obviously, couldn’t be the right thing to do.

Now, it’s really simple to understand that what we needed was a way to discern which protocol we’re trying to parse first, and then proceed to parse the rest of the line as needed. But this is exactly what I think is the main issue with the HTTP protocol and all the protocols, like RTSP or WebDAV, that derive from, or extend, it: the protocol specification is at the end of the request line. Since you usually parse a line in the Latin order of characters (from left to right), you read the method before you know which protocol the client is speaking. This is easily solved by backtracking parsers (I guess LALR parsers is the correct definition, but parsers aren’t usually my field of work, so I might be mistaken), since they first pass through the text to identify which syntax to apply, and then apply that syntax; Ragel is not such a parser, while kelbt (by the same author) is.
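For illustration only, the naive two-pass alternative hinted at above would look roughly like this; it is not the parser feng uses, just a sketch of reading the protocol token at the end of the request line before interpreting the method at the start:

#include <string.h>

enum req_proto { PROTO_UNKNOWN, PROTO_RTSP, PROTO_HTTP };

/* look at the token after the last space of "METHOD URL PROTO/x.y"
 * to decide which protocol we are dealing with */
static enum req_proto request_line_protocol(const char *line)
{
    const char *last_space = strrchr(line, ' ');

    if (last_space == NULL)
        return PROTO_UNKNOWN;
    if (strncmp(last_space + 1, "RTSP/", 5) == 0)
        return PROTO_RTSP;
    if (strncmp(last_space + 1, "HTTP/", 5) == 0)
        return PROTO_HTTP;
    return PROTO_UNKNOWN;
}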

Time constraints and the fact that kelbt is even more sparsely documented than Ragel mean that I won’t be trying to use kelbt just yet, and for now I settled on finding an overcomplex and nearly unmaintainable workaround to have something working (since the parsing is going to be a black-box function, the implementation can easily change in the future when I learn some decent way to do it).

This whole thing would have been definitely simpler if the protocol specification were at the start of the line! At that point we could simply have decided how to parse the rest of the line depending on the protocol.

At this point I’m definitely not surprised that Adobe didn’t use RTSP and instead invented their own Real-Time Messaging Protocol, which is not based on HTTP but is rather a binary protocol (which should also make it much easier to parse, to an extent).

Multicall binaries

It might not be well known, but there is a feature used by many packages, also in Portage, that is called a “multicall binary”. A multicall binary is an executable that runs different code depending on the name used to invoke it. This usually means the package installs a single executable and then a series of symbolic links with the various alternative names (sometimes called applets), all pointing to it. Two very common examples of this trick are busybox and portage-utils.

The reason why you might want to use multicall binaries is that sometimes you end up having lots of shared code between two programs, and shared libraries are not an option but you still want to avoid wasting space, maybe because the size of the shared code trumps the size of the programs themselves.

It is of course a matter of compromises: the size of the single program will increase, which means it might take more memory when running for a long period of time; on the other hand, if you have multiple long-running instances of that family of programs, they will, overall, use less memory, because their shared code will already be loaded in memory. And since they are not using shared objects, the PIC overhead is avoided, saving even more space.

How do you implement multicall binaries? Well, it’s not really hard: you just have to implement different entry points for the applets (in place of one main() function each), and then, in your main(), check the value of argv[0] to call the right one. For completeness, you often also accept a generic applet name, with argv[1] used to explicitly choose an applet (portage-utils does that when you call the q command).

The only thing you have to pay attention to here is that sometimes it is far from trivial to properly transform the name string into a value the C code can compute on, and thus you might see a heavy slow-down in that parsing in particular. My suggestion there is to use something like Ragel, which makes the translation much faster than a chain of if … else if … statements would; the sketch below shows the basic dispatch, with the strcmp() chain being exactly the part you’d want to replace.
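A minimal sketch of such a dispatcher; the applet names and entry points are invented for illustration:

#include <libgen.h>
#include <stdio.h>
#include <string.h>

static int applet_foo(int argc, char **argv) { (void)argc; (void)argv; puts("foo"); return 0; }
static int applet_bar(int argc, char **argv) { (void)argc; (void)argv; puts("bar"); return 0; }

int main(int argc, char **argv)
{
    /* the applet is chosen from the name we were invoked with */
    const char *name = basename(argv[0]);

    /* generic name: fall back to argv[1] to pick the applet explicitly */
    if (strcmp(name, "multitool") == 0 && argc > 1) {
        name = argv[1];
        argv++; argc--;
    }

    if (strcmp(name, "foo") == 0)
        return applet_foo(argc, argv);
    if (strcmp(name, "bar") == 0)
        return applet_bar(argc, argv);

    fprintf(stderr, "%s: unknown applet\n", name);
    return 1;
}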

Tokenizing in Ragel

For feng I’m currently working toward getting the parser(s) rewritten using Ragel, the state machine generator. Using Ragel for the parsers usually means having a much faster conversion from the RFC822-based RTSP protocol to something that the program can access and understand directly.

One of the interesting things that I’ve been working on, though, is proper tokenisation of the protocol, for instance converting a header name into an enumerated type. The reason why that’s a problem is that it’s not really straightforward to produce, with Ragel, a state machine that can parse a string into an enumerated code and still discern between invalid and unsupported values; and this in particular is important for us, because we have to either ignore the header or give a 400 response (Bad Request) depending on that.

The first obvious way to define a token is this one:

MyToken = "FOO" % { token  = MyTokenFoo; } |
    "BAR" % { token = MyTokenBar; } |
    [A-Z]{3};

This works, but does not distinguish extremely well between invalid and unsupported; yes, of course you can check in the code whether the parser reached the first final state, but it can be made, in my opinion, more solid by setting the validity at the end of the last transition as well; unfortunately, code like the following:

MyToken = "FOO" % { token  = MyTokenFoo; } |
    "BAR" % { token = MyTokenBar; } |
    [A-Z]{3} % { token = MyTokenUnsupported; };

will always hit the unsupported case, because the general rule is a superset of the rest of the rules. To solve this problem, you should use priorities, which are listed in the Ragel documentation, but not really explained. In this case, we must give priority to entering a given state, so that the known tokens beat the general token; one way to do that is this:

MyToken = "FOO" > 2 % { token  = MyTokenFoo; } |
    "BAR" > 2 % { token = MyTokenBar; } |
    [A-Z]{3} > 1 % { token = MyTokenUnsupported; };

Higher priority means that the transition will be preferred over the others, and the first one wins. Unfortunately, this can get messy when you have multiple state machines, since you risk conflicts between priorities. To solve this, you can use named priorities:

MyToken = "FOO" > (mytoken, 2) % { token  = MyTokenFoo; } |
    "BAR" > (mytoken, 2) % { token = MyTokenBar; } |
    [A-Z]{3} > (mytoken, 1) % { token = MyTokenUnsupported; };

At that point, priorities only compete with one another inside the same namespace. And this completes the parser; just make sure your variable is set to Invalid by default, and don’t forget that you still have to check that the parser reached the correct final state.
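As a reminder of what that final check looks like in the host code, here is a sketch as it might appear in the .rl file, assuming the machine is called mytoken (so that Ragel’s write data generates the mytoken_first_final constant):

enum MyTokenType parse_token(const char *str, size_t len)
{
    const char *p = str, *pe = str + len;
    int cs;
    enum MyTokenType token = MyTokenInvalid;  /* default before parsing */

    %% write init;
    %% write exec;

    /* only trust the value set by the actions if the machine actually
     * reached a final state; otherwise the token stays Invalid */
    if (cs < mytoken_first_final)
        token = MyTokenInvalid;

    return token;
}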

Profiling Proficiency

This article was originally published on the Axant Technical Blog.

Trying to assess the impact of the Ragel state machine generator on software, I’ve been trying to come up with a quick way to seriously benchmark some simple test cases, making sure that the results are as objective as possible. Unfortunately, I’m not a profiling expert; to keep with the alliteration in the post’s title, I’m not proficient with profilers.

Profiling can be done either internally to the software (manually or with GCC’s help), externally with a profiling tool, or in the kernel with a sampling profiler. Respectively, you’d find yourself using the rdtsc instruction (on x86/amd64), the gprof command, valgrind, and one of either sysprof or oprofile. It turns out that almost none of these options gives you anything truly useful: you have to decide whether to get theoretical data or practical timing, and in both cases you don’t end up with a very useful result.

Of that list of profiling software, the one that looked most promising was actually oprofile; it is especially interesting since the AMD Family 10h (Barcelona) CPUs supposedly have a way to report executed instructions accurately, which should combine the precise timing reported by oprofile with the invariant execution profile that valgrind provides. Unfortunately oprofile’s documentation is quite lacking, and I could find nothing to get that “IBS” feature working.

Since oprofile is a sampling profiler, it has a few important points to be noted: it requires kernel support (which in my case required a kernel rebuild, because I had profiling support disabled), it requires root to set up and start profiling through a daemon process, and by default it profiles everything running in the system, which on a busy system running a tinderbox might actually be too much. Support for oprofile in Gentoo also doesn’t seem to be perfect; for instance, there is no standardised way to start/stop/restart the daemon, which might not sound that bad to most people, but is actually useful, because sometimes you forget to have the daemon running, and you try to start profiling time and time again and it doesn’t seem to work as you expect. Also, for some reason it stores its data in /var/lib instead of /var/cache; this wouldn’t be a problem, except that, if you don’t pay enough attention, you can easily see your /var/lib filesystem filling up (remember: it runs as root, so it bypasses the root-reserved space).

More worrisome, you can’t get proper profiling on Gentoo yet, at least for AMD64 systems. The problem is that all sampling profilers (so the same holds true for sysprof) require frame pointer information; the well-known -fomit-frame-pointer flag, which frees up a precious register on x86 and used to break debugging support, can become a problem, as Mart pointed out to me. The tricky issue is that, since a few GCC and GDB versions ago, the frame pointer is no longer needed to get complete backtraces of processes being debugged; this means that, for instance on AMD64, the flag is now automatically enabled by -O2 and higher. On some architectures this is still not a problem for sample-based profiling, but on AMD64 it is. Now, the test cases I had to profile are quite simple and minimal and only call into the C library (and that barely calls the kernel to read the input files), so I only needed the C library built with frame pointers to break down the functions; unfortunately this wasn’t as easy as I hoped.

The first problem is that the Gentoo ebuild for glibc does have a glibc-omitfp USE flag, but it does not seem to explicitly disable frame pointer omission, just explicitly enable it. Changing that, thus, didn’t help; adding -fno-omit-frame-pointer also does not work properly, because flags are (somewhat obviously) filtered; adding that flag to the whitelist in flag-o-matic does not help either. I have yet to see it working as it’s supposed to, so for now oprofile is just accumulating dust.

My second choice was to use gprof together with GCC’s profiling support, but that only seems to provide a breakdown in percentage of execution time, which is also not what I was aiming for. The final option was to fall back to valgrind, in particular the callgrind tool. Unfortunately, the output of that tool is not human-readable, and you usually end up using software like KCacheGrind to make sense of it; but since this system is KDE-free, I couldn’t rely on that, and KCacheGrind does not provide an easy way to extract data from it to compile benchmark tables and tests anyway.

On the other hand, the callgrind_annotate script does provide the needed information, although still in a not exactly machine-parsable fashion; indeed, to find the actual information I had to look at the main() call trace and then select the function I’m interested in profiling, which provided me with the instruction count of its execution. Unfortunately this does not really help me as much as I was hoping, since it tells me nothing about how much time the CPU will take to execute those instructions (which is related to the amount of cache and other stuff like that), but it’s at least something to start with.

I’ll be providing the complete script, benchmark and hopefully results once I can actually have a way to correlate them to real-life situations.

Code reuse and RFC 822 message parsing

If you’re just a user with no knowledge of network protocols, you might not think there is any difference between an email, a file downloaded through the web, or a video streamed from a central site. If you have some basic knowledge, you might expect the three to instead have little in common, since they come in three different protocols: IMAP (for most modern email systems, that is), HTTP and (for the sake of what I’m going to say) RTSP. In truth, the three of them have quite a bit in common, represented by RFC 822. A single point of contact between these, and many other, technologies.

The RTSP protocol (commonly used by both Real Networks and Apple, besides being a quite nice open protocol) uses a request/response system based on the HTTP protocol, so the similarity between the two is obvious. And both requests and responses of HTTP and RTSP are almost completely valid messages for the RFC 822 specification, the same one used for email messages.

This is something that is indeed very nice, because it means that the same code used to parse email messages can be used to parse requests and responses for those two protocols. Unfortunately, it’s easier said than done. Since I’ve been working on feng, I’ve been trying to reduce the amount of specific code that we ship, trying to re-use as much generic code as possible, which is what brought us to use Ragel for parsing, and glib for most of the utility functions.

For these reasons, I also considered using the gmime library to handle the in-memory representation of the messages, as well as possibly the whole parsing further on. Unfortunately, when trying to implement it I noticed that in quite a few places I would end up doing more work than needed, duplicating parts of the strings and freeing them right away, with the gmime library doing the final duplication to save them in the hash table (because both my original parser and gmime end up with a GHashTable object).

For desktop applications this overhead is not really important, but it really is for a server project like feng, since not only does it add an overhead that can be considerable for the target of hundreds of requests a second that the project aims towards, but it also adds one more failure point where the code can abort on out-of-memory. Unfortunately, Jeffrey Stedfast, the gmime maintainer, is more concerned with the cleanness of the API, and its use on the desktop, than with its micro-optimisation; I understand his point, and I thus think it might be a better choice for me to write my own parser to do what I need.

Since the parser can be a component of its own that can be reused, I’m also going to make sure that it can sustain a high load of messages to parse. Unfortunately, I have no idea how to properly benchmark the code; I’d sincerely like to compare, after at least a draft is done, the performance of gmime’s parser against mine, both in terms of memory usage and speed. For the former I would have used the old massif tool from valgrind, but I can’t get myself to work with the new one. And I have no idea how to benchmark the speed of the code. If somebody knows how I could do that, I’d be glad to hear it.

Basically, my idea is to make sure that the parser works in two modes: a debug/catch-all mode where the full headers are parsed and copied over, and another one where the headers are parsed, but are saved only when they are accepted by a provided function. I haven’t yet put my idea to the test, but I guess that the hard work would be done more by the storage than by the actual parser, especially considering that the parser is implemented with the Ragel state machine generator, which is quite fast by itself. And even if it doesn’t help the speed of the parser itself, it would certainly reduce the amount of memory used, especially while parsing possibly crafted messages.
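A purely hypothetical sketch of that two-mode interface, with a naive line-by-line loop standing in for the Ragel machine that would do the real parsing (none of these names exist anywhere yet):

#include <glib.h>
#include <string.h>

typedef gboolean (*header_accept_cb)(const char *name);

GHashTable *parse_headers(const char *buffer, header_accept_cb accept)
{
    GHashTable *headers =
        g_hash_table_new_full(g_str_hash, g_str_equal, g_free, g_free);
    gchar **lines = g_strsplit(buffer, "\r\n", -1);
    gchar **line;

    for (line = lines; *line != NULL; line++) {
        char *colon = strchr(*line, ':');
        gchar *name;

        if (colon == NULL)
            continue;

        name = g_strndup(*line, colon - *line);

        /* a NULL filter behaves as the debug/catch-all mode */
        if (accept == NULL || accept(name))
            g_hash_table_replace(headers, name,
                                 g_strdup(g_strstrip(colon + 1)));
        else
            g_free(name);
    }

    g_strfreev(lines);
    return headers;
}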

Hopefully, given enough time and effort, it might produce a library that can be used as a basis for parsing and generating requests and responses for both RTSP and HTTP, as well as parsing e-mail messages, and other RFC 822 applications (I think, but I’m not sure, that the MSN messenger protocol uses something like that too; I do know that git uses it too though).

Who knows, maybe I’ll resume gitarella next, and write it using ruby-liberis, if that’s going to prove faster than the current alternatives. I sincerely hope so.

Good developers don’t necessarily create good build systems

It’s very common for users to expect that good developers will be good at every part of the development process, so if they wrote good software that works, they expect the build system to be just as good, the tests to be good, and the documentation to be good. Increasingly, the last part of this is starting to be understood not to be the case; there is a huge amount of very good software that just lacks documentation (cough FFmpeg cough). In truth, none of these assumptions always holds.

Being a good developer is a complex matter; one might be a very good coder, and thus write code that just works or that is very optimised, but write it in a way that is so obfuscated that nobody can deal with it but the original coder. One might be a very good interface designer, and write software that is perfectly user friendly, so that the users just love to work with it, but have a backend that just doesn’t do what it should. One might have very clear documentation, but the software is just too badly written to be useful.

Build systems are just the same: writing a proper build system is a complex matter. By proper, I mean a build system that allows you to do just about anything the user might expect to be able to do with a build system. This does not only mean building the software, but also installing it, choosing a different installation prefix, cross-compiling (if it makes sense), and choosing build options so that optional features can be compiled in or left out.

I have written many posts about how people try to replace autotools and, in my opinion, fail; I have already expressed my discontent with CMake, since it replaces autotools with something just as complex, which only seems to be easier because KDE people are being instructed on how to use it (just as they were instructed to wrongly use autotools before). I already said that, in my opinion, the only reason why CMake makes sense for KDE is supporting the Microsoft C++ compiler under Windows. But at least CMake, after a not-so-long time, is shaping up to be an acceptable build system; I have to give credit to Kitware for that. SCons and similar failed in a cosmic way.

But what I think is worse is when developers are not even trying to replace autotools, and just don’t seem to understand them at all. I have said recently that I’m going to look into ragel for both LScube and xine. Ragel is a very nice piece of software: its code is very complex and the results are very good. I’m sure the author is a very good developer, but he’s not a very good build system maintainer.

The build system used by ragel, both release and trunk, uses autoconf for some basic discovery and to allow the user to choose the prefix, but then it uses custom-tailored makefiles, as well as a custom-made config.h file. So what is the problem with them? Well, there are many, some of which are being worked around in the ebuild. For instance, the install phase does not support a DESTDIR variable, which is fundamental for almost any distribution to install the package in a temporary directory before packaging it (or, in the case of Gentoo, merging it onto the live filesystem); it also uses the CFLAGS variable for compiling C++ sources, which is not nice at all either.

You can guess that all these problems are solved very nicely by using automake instead of custom-tailored makefiles. But it’s not just this; when you check out ragel from the Subversion repository, it needs to be bootstrapped: part of the sources are built starting from ragel itself, but to enable or disable the regeneration of these, it’s not just a matter of checking whether they are missing; you have to edit the configure.ac file to enable or disable checking for the tools needed (ragel, kelbt and gperf). Again, not really a good idea.

I’ve sent a few patches to ragel-users to update the build system; a few update the currently-used build system to at least support some basic features used by packagers, but then I got quite tired and decided to go with destroying the old build system and replacing it with proper autotools. Why that? Because I found out that the old build system didn’t support out-of-tree builds either. And of course packaging was handled manually, while autotools has a nice make dist and especially a make distcheck target to take care of the source packaging step automatically.

I’m waiting to see if the maintainer(s) get in touch with me, and whether they intend to merge in the new build system. I sincerely hope so, and if they are going to open up the repository of kelbt (hopefully with Git), I’d be glad to rework the build system of that project too. Hopefully, with the new build system, building ragel on OpenSolaris will be a piece of cake (I’ll write more about what I’m doing with OpenSolaris in the coming days).

Porting to Glib

In the past few days I’ve been working to port part of lscube to Glib. The idea is that instead of inventing lists, ways to iterate over them, ways to find data, and so on and so forth, using the Glib versions should be much easier and also faster, as they tend to be optimised.

Interestingly enough, glib does seem to provide almost every feature out there, although I still think the logging facilities lack something.

It is very interesting to finally be able to ditch custom-made data structures for things that have already been tested before. Unfortunately, rewriting big parts of the code is, as usual, vulnerable to human mistakes. But it’s fun.

I’m starting to wonder how much duplication of effort we have in a system; I’m pretty sure there is at least some. xine-lib, for instance, does not use Glib, and it thus re-implements some structures and abstractions that could very well be used from Glib. I’m tempted, once I feel better and can pay more attention to xine-lib again, to start a new branch and see how it would work out to move xine-lib to Glib. Considering that even Qt4 now brings that in, I guess most of the frontends would already be using Glib, so it makes sense.

Actually, some of the plugins of xine itself make use of Glib, so it would probably be a good idea to at least try to reduce the duplication by using Glib there too. Things like Glib’s GString would make reply generation for HTTP, RTP/RTSP and MMS quite a bit easier than the current methods, and probably much safer. I would even expect it to be faster, but I won’t swear on that just yet.
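Just as an example of what I mean, generating a trivial RTSP response with GString would look along these lines (the header values are made up, and this is certainly not xine’s or feng’s actual code):

#include <glib.h>

gchar *build_ok_reply(int cseq)
{
    GString *reply = g_string_new("RTSP/1.0 200 OK\r\n");

    g_string_append_printf(reply, "CSeq: %d\r\n", cseq);
    g_string_append(reply, "Server: example\r\n");
    g_string_append(reply, "\r\n");

    /* free the GString wrapper but keep (and return) the character data */
    return g_string_free(reply, FALSE);
}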

So this is one more TODO item for xine-lib, at this point I guess for the 1.3 series; it would also be nice to start using Ragel for demuxers and protocol parsers.

I guess I should be playing by now rather than writing blogs about technical stuff or rewriting code to use Glib or ragel, sigh.