Before people end up thinking I’ve been hiding, or was eaten by the grues while thoroughly labelling my cables – by the way, a huge thanks goes to kondor6c! By running this cleanup operation I finally understood why Yamato and Raven had problems when power was cut: by mistake, the AP bridge connecting me to the rest of the network was not behind the UPS – I would just like to quickly point out that the reason I seem less active than usual is that I’m unfortunately forced to spend a lot of time on my day job right now, namely writing a final report on LScube, which will serve as a basis for the future documentation and, hopefully, the new website.
Speaking of documentation, I’ve been receiving a number of requests to make Autotools Mythbuster available as a single download — I plan on doing so as soon as I can find enough free time to work on it in a relaxed environment.
You might remember, though, that I originally intended to work on the guide at something nearing full time, under compensation; this was not a matter of simple greed, but of the fact that, to keep it relevant and up to date, the guide requires more time than I could afford to give it without something paying my bills. Given that, I plan on adding an offline Flattr barcode on the cover when I make it available as a PDF.
If you wish to see this happen sooner rather than later, you can always re-sort my priorities by showing your appreciation for the guide — it really doesn’t matter how much your flattr is worth; every bit helps, and knowing that you do want to see the guide extended and available for download is important to me.
Next on the list of things I have to let you know is that in my Portage overlay you can now find a patched, hacked version of pcsc-lite that makes the libusb device discovery feign being libhal. The reason is a bit far-fetched but, if you remember my pcsc-lite changes, Gentoo is dropping HAL support wherever possible; unfortunately pcsc-lite considers HAL the current discovery method and libusb the “legacy” one, with the future moving to libudev as soon as Ludovic has a plan for it. In the interim, libusb should work fine for most situations… unfortunately (or fortunately, depending on the point of view) I hit one of those situations where libhal was the only way to get it working properly.
In particular, after a security device firmware update (which has to happen strictly under Windows, dang!), my laptop – about which I have to revise my opinion; the only things not working properly right now are the touchpad, for which work is underway, and the fingerprint reader, which seems to be a lost cause – now exposes the RFID contactless smartcard reader as a CCID device. For those not speaking cryptogeek, this means that the interface is the standard one, and pcsc-lite can use it through the usual app-crypt/ccid driver it already uses to access the “usual” chipcards. Unfortunately, since the device is compound and exposes the two CCID interfaces on a single address, it has been working correctly only with libhal, as that was the only way to tell the CCID driver to look for a given interface. My patch works around this, and compound devices are now seen correctly; I’m waiting to hear from Ludovic whether it would be safe to apply it in Gentoo for all other users.
So I’m not gone, just very, very busy. But I’m around, mostly by mail.
While LScube is no longer – for me – a paid project (since I haven’t been officially paid to work on it since May or so), it’s a project that intrigues me enough to work on it in my free time, rather than, say, watch a movie. The reason is probably to be found in situations like those of the past few days, when Luca brings up a feature to implement that either no one else has implemented before, or that simply needs to be re-implemented from scratch.
The last time I had to come up with a solution to implement something this intricate was when we decided to implement Apple’s own HTTP tunnelling feature, whose only other documentation I know of, beside the original content from Apple, is the one I wrote for LScube itself.
This time, the challenge Luca showed me is quite a bit more stimulating, because it means breaking new ground that up to now seems to have been almost entirely unexplored: RTSP recording and secure RTSP.
The recording feature for RTSP was somewhat specified by the original RFC, but it was left quite up in the air as to what it should be used for. Darwin Streaming Server has an implementation but, beside the fact that we try not to look at it too closely (to avoid contamination), as far as we can tell we don’t agree with the minimum amount of security it provides. But let’s start from the top.
The way we need RECORD to work is to pass feng (the RTSP server) a live stream, which feng will then multiplex to the connected users. Up to now, this task was the job of a special protocol between flux (the adaptor) and feng, which also went through a few changes over time: originally it used UDP over localhost, but we had problems with packet loss (no, I can’t even begin to wonder why) and with performance; nowadays it uses POSIX message queues, which are supposedly portable and quite fast.
This approach has, objectively, a number of drawbacks. First of all, the MQ stuff is definitely not a good idea: it’s barely implemented in Linux and FreeBSD, it’s not implemented on Darwin (and for a project whose aim is to replace Darwin Streaming Server this is quite a bad thing) and, even though it does work on Linux, its performance isn’t exactly exciting. Also, using POSIX MQs is quite tricky: you cannot simply write an eventloop-based interface with libev because, even though they are descriptors in their implementation, an MQ descriptor is not guaranteed to support the standard poll() interface (nor any of the more advanced ones), so you end up with a thread blocking on a read… and timing out to make sure that flux wasn’t restarted in the meantime, since in that case a new pair of MQs would have been created.
This situation was not made any friendlier by the fact that, even though the whole team is composed of fewer than five developers, flux and feng are written in two different languages, so the two sides of the protocol implemented over the MQs are also entirely separate. And months after I dropped netembryo from feng, it is still used within flux, which is the one piece of the whole project that needs it the least, as it uses none of the SCTP, TCP and SSL abstractions; the interfaces of the two couldn’t be further apart (flux is written in C++, if you didn’t guess, while netembryo is written in C).
So it’s definitely time to get rid of the middleware and let the producer/adaptor (ffmpeg) talk directly to the server/multiplexer (feng). This is where RECORD becomes necessary: ffmpeg will open a connection to feng via the same RTSP protocol used by the clients, but rather than asking to DESCRIBE a resource, it will ANNOUNCE one; it will then SETUP the server so that it opens RTP/RTCP ports, and start feeding data to the server. Since this means creating new resources and making them available to third parties (the clients) – and eventually creating files on the server’s disk, if we carry the plan to its full extent – it is not something that you want just any client to be able to do: we want to make sure that the client is authorised — we definitely don’t want to pull a Diaspora anytime soon.
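The exchange we have in mind looks roughly like this — a sketch, not a wire capture; headers are abbreviated and the URL is invented:

```
C->S: ANNOUNCE rtsp://example.com/live.sdp RTSP/1.0
      CSeq: 1
      Content-Type: application/sdp
      (SDP description of the new resource follows)
S->C: RTSP/1.0 200 OK

C->S: SETUP rtsp://example.com/live.sdp/streamid=0 RTSP/1.0
      CSeq: 2
      Transport: RTP/AVP;unicast;client_port=5000-5001;mode=record
S->C: RTSP/1.0 200 OK
      Transport: RTP/AVP;unicast;client_port=5000-5001;server_port=6000-6001

C->S: RECORD rtsp://example.com/live.sdp RTSP/1.0
      CSeq: 3
S->C: RTSP/1.0 200 OK
      (ffmpeg now feeds RTP data to the negotiated server ports)
```

The mode=record parameter on the Transport header is what tells the server the client intends to send, rather than receive, the stream.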
And in a situation like this, paying for a decent UML modeller shows itself to be a pretty good choice. A number of issues became apparent to me only when I actually sat down to think through the interaction, in order to draw the pretty diagrams that we can show to our funders.
Right now, though, we don’t even support authentication, so I went to look for it; RTSP was designed from the start to be an HTTP-compatible protocol, so authentication is not defined in RFC 2326, but is rather left to be defined by HTTP itself; this means that you’ll find it in RFC 2616 instead — yes, it is counterintuitive, as the RFC number is higher; the reason is simple though: RFC 2616 obsoletes the previous HTTP description, RFC 2068, which is the one RTSP refers to; we have no reason to refer to the old one knowing that a new one is out, so…
You might know already that HTTP defines at least two common authentication schemes: Basic (totally insecure) and Digest (probably just as insecure, but obfuscated enough). The former simply provides username and password encoded in base64 — do note: encoded, not encrypted; they are still passed in what amounts to clear text, and it is thus a totally bogus choice. On the other hand, Digest provides enough features and variables to require a very careful analysis of the documentation before I can come up with a decent implementation. Since we wanted something more akin to a proof of concept, but also something safe enough to use in production, we had to find a middle ground.
At this point I had two possible roads in front of me. On one side, implementing SASL looked like a decent way to get a safe authentication system while keeping the complexity of feng proper at bay; unfortunately, I still haven’t found any complete documentation on how SASL is supposed to be implemented as part of HTTP; the Wikipedia page I’m linking to doesn’t list HTTP among the SASL-supported protocols either — but I know it is supposed to work, both because Apache has a mod_authn_sasl, and because, if I recall correctly, Microsoft provides ActiveDirectory-authenticated proxies, and again if my memory serves me correctly, ActiveDirectory uses SASL. The other road looked more complex at first, but it promised much with (relatively) little effort: implementing encrypted, secure RTSP, and requiring its use for the RECORD functionality.
While it might sound a bit of an overkill to require an encrypted, secure connection to send data that is then streamed down to users, keep in mind that we’re only talking of encrypting the control channel; in most cases you’d be using RTP/RTCP over UDP to send the actual data, and the control stream can even be closed after the RECORD request is sent. We’re not yet talking about implementing Secure RTP (which I think has already been drafted, or even implemented), but of course if you were to use a secure RTSP session with interleaved binary data, you’d get encrypted content as well.
At any rate, the problem then became how to implement the secure RTSP connection; one option would have been the oldest trick in the book: using two different ports, one for clear-text RTSP and another for the secure version, as is done for HTTP and HTTPS. But I also remembered that this approach has been frowned upon and deprecated for quite a while. Protocols such as SMTP and IMAP now use TLS through a STARTTLS command that upgrades a temporarily clear-text connection to an encrypted TLS one; I remembered reading about a similar approach for HTTP, and indeed RFC 2817 describes just that. Since RTSP and HTTP are so similar, there should be no difficulty in implementing the same idea over RTSP.
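Applied to RTSP, the RFC 2817 mechanics would look more or less like this — hypothetical, since the RFC only defines the mechanism for HTTP:

```
C->S: OPTIONS * RTSP/1.0
      CSeq: 1
      Upgrade: TLS/1.0
      Connection: Upgrade
S->C: RTSP/1.0 101 Switching Protocols
      Upgrade: TLS/1.0
      Connection: Upgrade
      (TLS handshake follows on the same connection; once it
       completes, the client repeats its request, and the server
       answers it, over the now-encrypted channel)
```

Note the two responses involved: the 101 that acknowledges the upgrade, and the actual response to the request re-sent over TLS.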
Now, in this situation, implementing secure ANNOUNCE support is definitely nicer; you just refuse to accept methods (in write mode) for the resource unless you’re on a secure TLS connection; then you also ensure that the user has authenticated before allowing access. Interestingly, here the Basic authentication scheme, while still bad, is acceptable for a usable proof of concept, and in the future it can easily be extended to authenticate based on client-side certificates, without requiring 401 responses.
Once again, I have to say that having translated Software Engineering wasn’t such a bad thing after all; even though I usually consider UML to be just a decent way to produce pretty diagrams to show to the people with the gold purse, doing some modelling of the interaction between the producer and feng actually cleared my mind enough to avoid a number of pitfalls, like doing authentication contextually with the upgrade to TLS, or forgetting about the need to provide two responses when doing the upgrade.
At any rate, for now I’ll leave you with the other two diagrams I have drawn but that wouldn’t fit in the post by themselves; they’ll end up in the official documentation — and if you wonder what “nq8” stands for, it’s the default extension generated by pngnq: Visual Paradigm produces 8-bit RGB files when exporting to PNG (and the SVG export produces files broken with RSVG; I have to report that to the developers), but I can cut the size of the files in half by quantising them to a palette.
I’m considering the idea of reverse-engineering part of the design of feng into UML models, to see if that gives me more insight into how to optimise it further; if that works, I should probably consider spending more time on UML in the future… too bad that reading O’Reilly PDF books on the reader is so unpleasant, otherwise I would be reading through all of “Learning UML 2.0” rather than skimming through it when I need it.
I noted in my post about debug information that in feng I’m using a debug codepath to help me reduce false positives in valgrind. When I wrote that, I linked an older post that promised an explanation but never delivered one. A year afterwards, I guess it’s time for me to explain, and later possibly document this in the LScube documentation that I’ve been trying to maintain to describe the whole architecture.
The problem: valgrind is an important piece of equipment in the toolbox of a software developer; but like any other tool, it blindly reports the facts, without caring about the intentions the person writing the code had at the time. Forgetting this leads to situations like Debian SA 1571 (the OpenSSL debacle), where an “uninitialised value” warning was taken to mean something was wrong, when the behaviour was actually pretty much intended. At any rate, the problem here is slightly different, of course.
One of the most important reasons to use valgrind is to find memory leaks: memory areas that are allocated but never freed properly. This kind of error can make software either unreliable or unusable in production, so testing for memory leaks is among the primary concerns for most seasoned developers. Unfortunately, as I said, valgrind doesn’t understand the intentions of the developer, and in this context it cannot discern between memory that leaks (or rather, that is “still reachable” when the program terminates) and memory that is intentionally used right up until the program stops. Indeed, since the kernel will free all the memory allocated to a process when it ends, it’s common practice to simply leave it to the kernel to deallocate those structures that are important until the end of the program, such as configuration structures.
But the problem with leaving these structures around is that you either have to ignore the “still reachable” values (which might actually hide some real leaks), or you get a number of false positives introduced by this practice. To remove the false positives, it is not too uncommon to free the remaining memory areas before exiting, something like this:
The problem with writing code this way is that even just the calls to free up the resources cause some overhead, and especially for small fire-and-forget programs, those simple calls can become a nuisance. Depending on the kind of data structures to free, orderly unwinding them can actually take quite a bit of time. A common alternative is to guard the resource-freeing calls with a debug conditional, of the kind I described in the other post. Such a solution usually ends up being #ifndef NDEBUG, so that the same macro gets rid of both the assertions and the resource-freeing calls.
This works out quite decently when you have simple, top-down, straight-line software, but it doesn’t work so well when you have a complex (or chaotic, as you prefer) architecture like feng does. In feng, we have a number of structures that are only used by a range of functions, themselves constrained within a translation unit. They are, naturally, variables that you’d declare static to the unit (or even static to the function, from time to time, but that’s just a matter of visibility to the compiler; function or unit static does not change a thing). Unfortunately, to free them while keeping them static to the unit, you need an externally-visible function. While that is not excessively bad, it’s still quite a bit of work jumping between the units, just to get some cleaner debug information.
My solution in feng is something I find much cleaner, even though I know some people might well disagree with me. To perform the orderly cleanup of the remaining data structures, rather than having uninit or finalize functions called at the end of main() (which would then require me to properly handle errors in sub-procedures so that they end up calling the finalisation from main()!), I rely on the compiler supporting the destructor attribute. Actually, I check with autoconf whether the compiler supports this not-too-uncommon feature, and if it does, and the user requested a debug build, I enable the “cleanup destructors”.
Cleanup destructors are simple unit-static functions declared with the destructor attribute; the compiler sets them up to be called as part of the _fini code, when the process is torn down, and that covers both an orderly return from main() and calls to exit(), which is just what I was looking for. Since the function is already within the translation unit, the variables don’t even need to be exported (and that helps the compiler, especially in the case where they are only used within a single function, or at least I sure hope so).
In one case the #ifdef conditional actually switches a variable from being stack-based to being static to the unit (which changes the final code of the project quite a bit), since the reference to the head of the array of listening sockets is only needed when iterating through them to set them up, or when freeing them; if we don’t free them (in a non-debug build), we don’t even need to save it.
You might remember that I have been having fun with Ragel in the past, since Luca proposed using it for the LScube projects (which reduced to using it on feng, as far as I’m concerned). The whole idea sounded overly complicated at first, but then we were able to get most, if still not all, of the kinks nailed down.
Right now, feng uses Ragel to parse the combined RTSP/HTTP request line (with a homebrew backtracking parser, which became the most difficult feature to implement), to split the headers into a manageable hash table (converting them from strings into enumerated constants at the same time, another tricky task), to convert the Range and Transport headers for RTSP into lists of structures that can be handled directly in the code, and to split and validate the URLs provided by the clients.
While the parsing of Ranges, Transports and URLs is done directly with Ragel, and without an excessive amount of blunt force, both the parsing of the request line and the tokenising of header names are “helped” by a huge, complex and absolutely ludicrous (if I may say so) Rube Goldberg machine that uses XML and XSLT to produce Ragel files, which then produce C code, which is finally compiled. You might note that we have at least three sources and four formats: XML → Ragel → C → relocatable object.
The one thing that we aren’t doing with Ragel right now is the configuration file parser. Originally, Luca intended the configuration file for feng to be compatible with the lighttpd one, in particular allowing bits and pieces of the two to be shared. As it happens, feng really needs much less configuration right now than lighttpd ever did, and the only conditional we support is based on the $SOCKET variable, to produce something akin to vhost support. The lighttpd compatibility is taking a huge toll in the amount of code needed to actually parse the file, as well as in the time and space wasted parsing it into a tree of tokens and then running through them to set the parameters. And this is without launching into the problems caused by actually wanting to use the “lemon” tool.
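For reference, the kind of configuration we’re talking about looks vaguely like this — the directives here are invented for illustration, not feng’s actual ones:

```
# lighttpd-style key-value pairs, sketched from memory
server.document-root = "/var/feng/avroot"
server.port          = 554

# the one conditional we support, keyed on the listening socket,
# gives us something akin to vhosts
$SOCKET == ":8554" {
    server.document-root = "/var/feng/vhost2"
}
```

A full recursive-descent (or lemon-generated) parser for this is a lot of machinery for what is, in the end, a handful of assignments.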
Unfortunately, neither replacing the parser (for the same format) with Ragel, nor inventing a new format (keeping the same features) to parse with Ragel, looks interesting. The problem is that Ragel provides a very low-level interface for the parsers, and does not suit well the kind of parsing a configuration file needs. There is a higher-level parser generator based upon Ragel, by the same author, called Kelbt, but it’s barely documented, and seems to only produce C++ code (and you probably know already how much I dislike C++).
I definitely don’t intend to write my own Ragel-based parser generator like I did for RTSP (and HTTP); I also don’t want to lose support for “variables” (since many paths are created by appending sub-directories to root paths), so I don’t think switching to INI itself would be a good idea. I considered using the C preprocessor to convert a user-editable key-value pair file into a much simpler one (without conditionals, vhosts and variables) and then parsing that, but it’s still a nasty approach, as it requires a further pass after editing, and depends on the presence of a C preprocessor anyway.
I also don’t intend to use XML for the configuration file! While I have used XML in the above-mentioned Goldberg machine, it smells funny to me as much as to anyone else. Even though I find the people who judge anything using XML as “crap” delusional, I don’t think XML is designed for user-editable configuration files (and I seriously abhor non-user-editable configuration files, such as those used by Cherokee, to name one).
Mauro, some time ago, suggested that I look into Antlr3; unfortunately, at the time I had to stop because of the lack of documentation on how to do things with it. I had already had enough trouble getting proper information out of the Ragel documentation (which reads like the thesis it is). Looking at Wikipedia earlier for a couple of documentation links, it seems like the only source of good documentation is a book — since my time to read another technical book right now is a bit limited, spending €20 just for the chance to decide whether I want Antlr or not doesn’t look appealing either.
Anyway, setting the Antlr option aside for a moment, Luca prodded me about one feature of Ragel I had overlooked, maybe because it was introduced after I started working on our implementation: scanners. The idea of a scanner is that it reduces the work needed to properly assign priorities to the transitions, so that the longest-matching pattern executes the final code. This would be nice, wouldn’t it? Maybe it would even allow us to drop the Goldberg machine, since the original reason to have it was, indeed, the need to get those priorities right without having to change all the code repeatedly until it worked properly.
Unfortunately, to use Ragel scanners I first had to change the Goldberg machine, splitting it further into another file, since scanners can only be called into, rather than included in, the state machine (and having the machine defined in the same file seems to trigger the need to declare all the possible variables at the same time, which is definitely not something I want to do for every state machine). This was bad enough (making the sources more disorganised and the build system more complex), but after I was able to build the final binary, I also found an increase of over 3KB of executable code, without optimisations turned on (and when using state tables, rather than goto statements, an increase in code size is a very bad sign).
Now, it is very well possible that the increase in code size gets us a speed-up in parsing (I sincerely doubt it, though), and maybe using alternative state machine types (such as the above-mentioned goto-based one) will provide better results still. Unfortunately, assessing this gets tricky. I had already tried writing some sort of benchmark of possible tokenisers, mostly because I fear that gperf, its name notwithstanding, is far from perfect and produces suboptimal code in too many cases (and gperf is used by many, many projects, even at a basic system level). The problem with that is that I suck at both benchmarking and profiling, so the effort stalled there.
Anyway, comments are open: if you have any ideas on how I can produce the benchmark, or suggestions about the configuration file parsing and so on, please let me know. Oh, and there are actually two further custom file formats (domain-specific languages to all effects) that are currently parsed in the most suboptimal of ways (strcmp()), and we will need at least one more. If we could get one high-level, but still efficient (very efficient, if possible) parser generator, we would definitely gain a lot in terms of managing the sources.
In the past weeks I’ve been working again on LScube, in particular toward improving the networking layer of feng. Up until now, we relied heavily on our own library (netembryo) to do almost all of the networking work. While this design looked nice at the start, we reached the point where:
the original design had feng using multiple threads to handle “background tasks”, plus a manually-tailored event loop, which required us to reinvent the wheel so much that it was definitely not funny; to solve this we moved to libev, but that requires us to access the file descriptor directly to set up the watchers;
we need to make sure that the UDP socket is ready to be written to before sending RTP and RTCP data, and to do so we have to go down to the file descriptor again;
while the three protocols currently supported by feng for RTP transport (UDP, TCP and SCTP) were all abstracted by netembryo, we had to branch out depending on the protocol in use way deep inside feng, as the workings of the three are very different (there are similarities between UDP and SCTP, and between SCTP and interleaved TCP, but they were not similar enough in any way!); this resulted in double branching, which the compiler – even with LTO – will have a hard time optimising (since it depends on an attribute of the socket that we knew already);
the size of objects in memory was quite a bit bigger than it should have been, because we had to keep around all the information needed for all the protocols, and at the same time we had to allocate on the stack a huge number of identical objects (like the SCTP stream information that provides the right channel to send the RTP and RTCP packets to);
we were resolving the client’s IP address over and over, every time we wanted to connect the UDP sockets for RTP and RTCP (as well as resolving our own IP address);
we couldn’t get proper details about errors in network operations, nor could we fine-tune those operations, because we were abstracting it all away.
While I tried fixing this by giving netembryo a better interface, I ended up finding that it was much, much simpler to deal with the networking code within feng without the abstraction; indeed, in no case did we need to go the full path down to what netembryo used to do beforehand; we always stopped short of that, skipping more than a couple of steps. For instance, the only place where we actually go through the full address resolution is the main socket binding code, where we open the port we’ll be listening on. And even there, we don’t need the “flexibility” (faux-flexibility, actually) that netembryo gave us before: we don’t need to re-bind an already open socket; and we don’t want to accept one out of a series of possible addresses — we want all of them or none (this is what helps us support IPv6, by the way).
The end result is not only a much, much smaller memory footprint (the Sock structure we used before was at least 440 bytes, while we now stay well below 100 bytes per socket), but also fewer dependencies (we integrate all the code within the feng project itself), less source code (which is always good, because it means fewer bugs), tighter handling of SCTP and interleaved transmission, and more optimised code after compilation. Talk about win-win situations.
Unfortunately, not everything we have seen so far is positive: the fact that we have no independent implementation (yet) of SCTP makes it quite difficult to make sure that our code is not bugged in some funny way; and even during this (almost complete) rewrite of the network code, I was able to find a number of bugs, and a few strange situations that I’ll have to deal with right now, spending more than a couple of hours making sure it’s not going to break further on. It also shows we need to integrate valgrind into our current white-box testing approach, to make sure that our functions don’t leak memory (I found a few leaks by running valgrind manually).
To be honest, most of the things I’ve been doing are nothing especially difficult for people used to working with Unix networking APIs, but I’m most definitely not an expert in those. I’d actually be quite interested in the book, if only I could get it for the Reader easily. So if somebody feels like looking at the code and telling me what I can improve further, I’d be very happy.
One thing we will most likely have to pick up, though, is the SCTP userland code, which right now has at least a few bugs in its build system, a couple of which we’re working around in the ebuild. So I won’t have time to be bored for a while still…
I’m one of the few Free Software activists who actually endorses the use of the User-agent header, I’m afraid. The reason is that, while in general that header is used to implement various types of policies, it is often used as part of lock-in schemes (sometimes paper-thin lock-ins, by the way), and we all agree that lock-ins are never nice. Whether those lock-ins are something to simply attack, or something to comprehend and accept, is a different discussion — I sincerely think that Apple has every right to limit access to their trailers to QuickTime, or at least try to, as they are providing the service, and it’s a platform for them to show their software; on the other hand, BBC and RAI using it to lock in their public service TV is something nasty!
So basically we have two reasons to use User-agent: policies and statistics. In the former category I also count the implementation of workarounds of various species. Statistics are mostly useful to decide what to focus on; policies can be used for good or evil: lock-ins are generally evil, but you can use policies to improve the quality of the service for users.
One of the most common workarounds applied through user agent matching relates to features missing in MSIE; for instance, there is one to serve XHTML files properly through the application/xhtml+xml MIME type, which MSIE doesn’t support:
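A reconstruction of the kind of rules involved, in Apache mod_rewrite syntax; the exact conditions are from memory, not the original snippet:

```apache
RewriteEngine On
# serve XHTML as text/html to clients that can't handle the proper type
RewriteCond %{HTTP_USER_AGENT} MSIE [OR]
# Facebook's link crawler chokes on application/xhtml+xml as well
RewriteCond %{HTTP_USER_AGENT} facebookexternalhit
RewriteRule \.xhtml$ - [T=text/html]
```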
Yes, this has one further check compared to most copies of the same snippet found on the Internet; the reason is that I have experimentally noticed that Facebook does not handle XHTML properly: if you attach a link to a webpage that has images and is served as XHTML, it won’t fetch the title, nor allow you to choose an image to use for the link. This was true at least up to last December, and I assume it is still true now, hence the extra line.
In a different situation, feng uses the User-agent field to identify buggy software and implement specific workarounds (such as ignoring the RTSP/1.0 standard and seeking on subsequent PLAY requests without an intervening PAUSE).
There are many more things you can do with agent-specific policies, including serving lower-quality images to smartphones without implementing mobile-specific vhosts, but I won’t go into deeper detail right now.
As for statistics, they usually provide a way for developers and designers to focus on what’s actually being used by the targets of their software. Again, some activists dislike this because it shows that, for most websites, it’s not worth considering non-Firefox, non-IE browsers — and sometimes not even Firefox; but extreme cases aside, statistics are, in the real working world, very important.
Some people feel smarter than the average programmer and want to skew the statistics by claiming to use “Commodore 64” or “MS-DOS” as their operating system. They pretend to defend their privacy, to camouflage themselves in the big bad Internet. What they are actually doing is trying to hide on a plane by wearing a balaclava, which you might guess is pretty peculiar. In fact, if you try EFF’s Panopticlick you can see that a unique, “novelty” User-agent actually makes you stand out among Internet users. Which means that if you’re trying to hide in a crowd wearing a balaclava, you’re not smarter than anybody; you’re actually stupider than the average.
Oh, and by the way, there is no way that faking being Googlebot will work out well for you; on my webserver, for instance, you’ll get 403 responses for all your requests… unless your reverse resolution properly forward-confirms as coming from the Googlebot server farm…
Or watch it on YouTube; yes, it is quite ironic that we’ve uploaded the visualised history of a streaming software stack to YouTube.
The video you can see here is the Git history of feng, the RTSP streaming server Luca and I are working on for the LScube project (previously funded by the Turin Polytechnic), visualised with Gource.
While this video holds some insights about feng itself, which I’ll discuss on the project’s mailing list soon enough, I’m using it to bring home another, I think even more important, point. You probably remember my problems with the ATI HD4350 video card… well, one of the reasons why I didn’t post this video before, even though Gource has been in tree (thanks to Enrico) for a while already, is that it didn’t work too well on my system.
It might not be too obvious, but the way Gource works is by using SDL (and thus OpenGL) to render the visualisation to screen and to (PPM) images – the video is then produced by FFmpeg, which takes the sequence of PPMs and encodes it in H.264 with x264; no, I’m not going to do this with Theora – so you rely on your OpenGL support to produce good results. When 0.24 was being worked on (last January) the r700 Radeon driver, with KMS, had some trouble, and you’d see a fully orange or purple frame from time to time, resulting in a not-too-appealing video. Yesterday I bit the bullet and, after dmesg showed me a request from the kernel to update my userland, I rebuilt the Radeon driver from Git, and Mesa from the 7.8 branch…
No crashes, no artefacts in glxgears, and no artefacts in Gource either, as you can see from the video above. This is with vanilla kernel 2.6.33, Mesa 7.8 Git and Radeon Git, all with KMS enabled (and the framebuffers work as well!). Kudos to Dave and all the developers working on Radeon; this is what I call good Free Software!
Just do yourself a favour and don’t buy video cards with fans… leaving aside nVidia’s driver screwups, all of them failed on me at some point; passive cards instead seem to last much longer, probably because of the lack of moving parts.
I sincerely don’t remember whether I already discussed this before; if I didn’t, I’ll try to explain here. When developing in C, C++ and other languages that support some kind of object as a type, you usually have two choices for a composite type: transparent or opaque. If the code using the type is able to see the content of the type, it’s a transparent type; if it cannot, it’s an opaque type.
There are, though, different grades of transparent and opaque types depending on the language and the way they get implemented; to simplify the topic for this post, I’ll limit myself to the C language (not the C++ language, be warned) and comment on the practices connected to opaque types.
In C, an opaque type is a structure whose content is unknown; it is usually declared with code such as the following, in a header:
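A minimal sketch follows; the my_object name and its accessors are hypothetical, not from any real library. The typedef is all the header exposes, while the struct definition and the accessor bodies live in the library’s own source file:

```c
#include <stdlib.h>

/* Public header part: the type is declared but never defined,
 * so callers know nothing about its size or contents. */
typedef struct my_object my_object;

my_object *my_object_new(int value);
int        my_object_value(const my_object *obj);
void       my_object_free(my_object *obj);

/* Private part, inside the library: only this file sees the layout. */
struct my_object {
    int value;
};

my_object *my_object_new(int value)
{
    my_object *obj = malloc(sizeof(*obj));
    if (obj != NULL)
        obj->value = value;
    return obj;
}

/* Accessors must be real functions: callers cannot dereference
 * the pointer themselves, so no inline shortcut is possible. */
int my_object_value(const my_object *obj)
{
    return obj->value;
}

void my_object_free(my_object *obj)
{
    free(obj);
}
```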
At that point, code including the header has some limitations compared to transparent types: not knowing the object’s size, you cannot declare objects of that type directly, but can only deal with pointers to it, which also means you cannot dereference them or allocate new objects. For this reason, you need to provide functions to access and handle the type itself, including its allocation and deallocation, and these functions cannot simply be inline functions, since they would need to access the content of the type to work.
All in all, you can see that the use of opaque types tends to be a big hit performance-wise; instead of a direct memory dereference you always need to pass through a function call (note that this looks the same as accessor functions in C++, but those are usually inline functions that get replaced at compile time with the dereference anyway), and you might even have to pass through the PLT (Procedure Linkage Table), which means further work to get at the type.
So why should you ever use opaque types? Well, they are very useful when you need to export the interface of a library: since users don’t know either the size or the internal ordering of an opaque type, the library can change it without changing ABI, and thus without requiring a rebuild of the software using it. Repeat with me: changing the size of a transparent type, or the order of its content, will break ABI.
This also becomes particularly important when you’d like to reorder some structures so that you can remove padding holes (with tools like pahole from the dwarves package; see this as well if you want to understand what I mean). For this reason, you might sometimes prefer having slower, opaque types in the interface, instead of faster but riskier transparent types.
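To illustrate the kind of reordering pahole suggests, here is a hypothetical pair of layouts (not from any real project): on a typical ABI the first wastes padding around the pointer to keep it aligned, while the second packs the small members together.

```c
/* Hypothetical structures: same members, different order, and a
 * different size once alignment padding is counted. */
struct badly_ordered {
    char  flag;    /* 1 byte, then padding up to the pointer's alignment */
    void *ptr;
    char  other;   /* 1 byte, then tail padding */
};

struct well_ordered {
    void *ptr;     /* widest member first */
    char  flag;    /* small members pack together at the end */
    char  other;
};
```

Exactly because a transparent struct’s layout is part of the ABI, such a reordering breaks every user of the type; behind an opaque type it is invisible.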
Another place where opaque types are definitely helpful is when designing a plugin interface, especially for software that was never designed as a library and thus has an in-flux API. Which is one more reason why I don’t think feng is ready for plugins just yet.
Following another LScube problem discussion, here comes another post in the series “why do I prefer one solution over another”, which I started thinking about after I ripped the plugin interface out of the current feng master branch (to be developed separately and brought back to master once the interface is properly defined, chosen and so on).
While ripping out the plugins I had to deal with the configuration file parser we have in feng right now, which was heavily borrowed from lighttpd. Luca’s idea was that, since the scopes of lighttpd and feng are quite similar, we could share the same configuration file syntax, and even the code. Unfortunately, this ended up having a couple of problems. The first is that, as usual with Luca’s ideas, it looks so far forward that we only implement a small percentage of it right now: vhosts, conditionals and all that stuff are things we have barely even started scratching. The other is that the code used for the parser has changed quite a bit in lighttpd in the meantime; I think the original one was hand-tailored, then they moved to lemon-based parsing (the parser generator used by SQLite3) and now they are toying with Ragel to rewrite the mess it is now.
But why in particular is the current system a mess? Well, the first problem is not really in the syntax itself, but rather in the current (lemon-based, I think) implementation: it uses a big table, which is relatively fine, but then it also references its entries by their direct array index, which means any change in the list of entries requires changing all the following ones; not the nicest piece of code I’ve had to deal with.
The second problem is also intrinsic to the way the code was written, partly by hand: there are some objects implemented for the configuration file parsing, like high-level pointer arrays and strings, which reinvent what glib already provides. These objects increase the size and complexity of our code, for the sole reason that lighttpd does not use glib and thus needs them. Decoupling the configuration file code from these objects is non-trivial, and would diverge the code to the point where we couldn’t share it with lighttpd any longer.
Now, there is a certain complexity tied to the way we want the configuration file to behave: we want conditionals, blocks and includes, similar to lighttpd’s and Apache’s configuration files, like most software that deals with servers and locations. The simplest configuration file formats don’t work very well for this task (for instance, I don’t like the way Cherokee configuration files are written; they are designed to be handled by a web administration package rather than hand-edited by a human). This kind of problem usually gets solved in two main ways: either you invent your own syntax, or you use something like XML for the configuration file.
While XML makes it relatively easy to parse a configuration file (relatively, because it just moves the complexity elsewhere: parsing XML is not usually faster than other configuration file syntaxes, it’s actually slower and much more complex, but generic XML parsers already exist that make the parsing easier for the final software), it definitely makes the file obnoxious to read and edit by hand, which is why I don’t think XML is a good format for configuration files.
My favourite format for configuration files is the “good old” INI-like file format; I call it INI-like, though I admit I’m not sure whether Windows’ INI files were the first to use it. In any case, the same format is nowadays used by a lot of software, including freedesktop.org’s .desktop files, and there are multiple well-tested parsers for it, including some (very limited, to be honest) support in glib itself.
Now, it’s obviously a no-brainer that the INI format is nowhere near as powerful as the language used by Apache’s configuration files, or lighttpd’s, since it bears no conditionals and only has structure in the sense of sections, variables and values. But it’s exactly for this kind of simplicity that I think it might work well for feng nonetheless, given that it works for git as well.
While the lighttpd syntax contains all kind of conditionals that allow setting particular limits based on virtual host, location, and generic headers, it’s quite likely that for feng we can reduce this to a tighter subset of conditions.
We already have some small predisposition for per-virtualhost settings, which are handled with an array (a single entry right now) of specific configuration structures, each of which is initialised first with the configuration’s default values and then overridden with the loaded configuration, plus support for some global, non-replaceable settings. How does that fare with an INI-like configuration file? Well, to me it’s quite easy to think of it in these terms:
user = feng
group = feng
logdir = /var/log/feng
errorlog = error.log
sctp = on
listen = 0.0.0.0:554 localhost:8554
accesslog = access.log
accesslog_syslog = on
alias = localhost
accesslog = my_access.log
accesslog = /var/log/feng/another_access.log
This is obviously just a quick draft of how it could be achieved, and does not mean it should be achieved this way at all; it’s just to say that maybe we should be looking into some alternative. Even though keeping syntax-compatibility with lighttpd is a nice feature, it shouldn’t justify the extra complexity we’re currently seeing!
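Just to show how little machinery an INI-like format needs compared to the lemon/Ragel parsers above, here is a toy line parser; it is a sketch of mine, not feng code, and a real implementation would rather reuse glib’s GKeyFile:

```c
#include <stdio.h>

/* Toy INI-style parser: extract a "key = value" pair from one line.
 * Returns 1 on success; section headers ("[...]"), comments and
 * blank lines carry no pair and return 0. Buffers hold 64 bytes. */
static int parse_ini_line(const char *line, char *key, char *value)
{
    if (line[0] == '[' || line[0] == ';' || line[0] == '#' ||
        line[0] == '\n' || line[0] == '\0')
        return 0;
    /* %[^= ] stops the key at '=' or blank; blanks around '=' are skipped */
    return sscanf(line, " %63[^= ] = %63[^\n]", key, value) == 2;
}
```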
I’ve been saying this for quite a while; probably the most on-topic post was written a few months ago, but there are some hints about it in posts about xine and others again.
I used to be an enthusiast about plugin interfaces; with time, though, I started having more and more doubts about their actual usefulness. It’s a trait I very much like in myself: I’m fine with reconsidering my own positions over time and deciding that I was wrong; it happened before with other things, like KDE (and C++ in general).
It’s not like I’m totally against the use of plugins altogether. I only think that they are expensive in more ways than one, and that their usefulness is often overstated, or tied to other kinds of artificial limitations. For instance, dividing a software’s features over multiple plugins usually makes it easier for binary distributions to package them: they only have to ship one package with the main body of the software, and many for the plugins (one per plugin might actually be too much, so sometimes they get grouped). This works out pretty well for both the distribution and, usually, the user: the plugins that are not installed will not bring in extra dependencies, won’t take time to load, and won’t use memory for either code or data. It basically gives binary distributions a flexibility comparable to Gentoo’s USE flags (and similar options in almost any other source-based distribution).
But as I said, this comes with costs that might or might not be worth it in general. For instance, Luca wanted to implement plugins for feng similarly to what Apache and lighttpd have. I can understand his point: let’s not load code for the stuff we don’t have to deal with, which is more or less the same reason why Apache and lighttpd have modules; in the case of feng, if you don’t care about the access log, why should you be loading the access log support at all? I can give you a couple of reasons:
because the complexity of managing a plugin to deal with the access log (or any other similar task) is higher than just having a piece of static code that handles that;
because the overhead of having a plugin loaded just to do that is higher than that of having the static code built in but not enabled in the configuration.
The first problem is a result of the way a plugin interface is built: the main body of the software cannot know about its plugins in too-specific ways. If the interface is a very generic plugin interface, you add some “hook locations” and then it’s the plugin’s task to find out how to do its magic, not the software’s. There are some exceptions to this rule: if you have a plugin interface for handling protocols, like the KIO interface (and I think gvfs has the same), you get the protocol from the URL and call the correct plugin; but even then you’re leaving it to the plugin to do its magic. You can provide a way for the plugin to tell the main body what it needs and what it can do (like which functions it implements), but even that requires the plugins to be quite autonomous. And that also means being able to take care of allocating and freeing resources as needed.
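As a sketch of what such a generic interface ends up looking like (the names here are hypothetical, not feng’s or KIO’s actual API), the core only sees a table of callbacks and asks each plugin whether it can handle a request:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical plugin vtable: the core knows only these entry points;
 * everything else is the plugin's own business, including resources. */
typedef struct demo_plugin {
    const char *name;
    int (*handles)(const char *scheme);  /* KIO-style protocol test */
} demo_plugin;

static int http_handles(const char *s) { return strcmp(s, "http") == 0; }
static int rtsp_handles(const char *s) { return strcmp(s, "rtsp") == 0; }

static const demo_plugin plugins[] = {
    { "demo-http", http_handles },
    { "demo-rtsp", rtsp_handles },
};

/* Dispatch: pick the first plugin that claims the scheme; the core
 * can do nothing smarter, as it knows nothing of the internals. */
static const demo_plugin *find_plugin(const char *scheme)
{
    for (size_t i = 0; i < sizeof(plugins) / sizeof(plugins[0]); i++)
        if (plugins[i].handles(scheme))
            return &plugins[i];
    return NULL;
}
```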
The second problem is tied not only to the cost of invoking the dynamic linker at runtime to load the plugin and its dependencies, if any (which is a non-trivial amount of work, one has to say), but also to the need for code that finds the modules to load, loads them, initialises them and keeps a list of modules to call at any given interface point; and to two more issues: the PIC problem and the problem of less-than-page-sized segments. This last problem is often ignored, but it’s my main reason to dislike plugins when they are not warranted for other reasons. Given a page size of 4KiB (which is the norm on Linux as far as I know), if the code is smaller than that, it will still require a full page (it won’t pack with the rest of the software’s code areas); but at least code is disk-backed (if it’s PIC, of course). It’s worse for variable data, or relocated variable data, since those are not disk-backed, and it’s not rare to use a whole page for something like 100 bytes of actual variables.
In the case of the access log module that Luca wrote for feng, the statistics are as such:
flame@yamato feng % size modules/.libs/mod_accesslog.so
text data bss dec hex filename
4792 704 16 5512 1588 modules/.libs/mod_accesslog.so
Which results in two pages (8KiB) for the bss and data segments, neither disk-backed, and two disk-backed pages for the executable code (text): 16KiB of addressable memory for a mapping that does not even reach 6KiB. That’s a 10KiB overhead, much higher than 50%. And that’s the memory overhead alone. The whole overhead, as you might guess at this point, is usually within 12KiB (since you’ve got three segments, and each can waste at most one byte less than the page size; it’s actually more complex than this, but let’s assume it is true).
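The rounding above can be spelled out in a few lines (4KiB page size assumed, as in the text):

```c
/* Each segment of the plugin is mapped starting on its own page,
 * so its size is rounded up to a whole number of 4KiB pages. */
#define PAGE_SIZE 4096UL

static unsigned long pages_for(unsigned long segment_size)
{
    return (segment_size + PAGE_SIZE - 1) / PAGE_SIZE;
}
```

With the mod_accesslog.so numbers above, text (4792 bytes) needs two pages, while data (704) and bss (16) need one full page each: four pages, 16KiB, for less than 6KiB of payload.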
It really doesn’t sound like huge overhead by itself, but you always have to judge it against the size of the plugin itself. In the case of feng’s access log you’ve got a very young plugin that lacks a lot of functionality, so one might say that in time it’ll be worth it… which is why I’d like to show you the size statistics for the Apache modules on the very server my blog is hosted on. Before doing so, though, I have to point out one huge difference: feng is built with most optimisations turned off, while Apache is built optimised for size; they are both AMD64, though, so the comparison is quite easy.
The list is ordered by the size of the whole plugin (summed up, not counting padding); the last three positions are definitely unsurprising, although the sheer size of the two that are not part of Apache itself surprises me (and I start to wonder whether they statically link something I missed). The impression that the rewrite module is likely the most complex plugin in Apache’s distribution never left me.
As you can see, almost all plugins have vast overhead, especially in the bss segment (all of them use at least 16 bytes, which warrants a whole page each: 4080 bytes wasted per plugin); the data segment is also interesting: only the two external ones have more than a page’s worth of variables (which also looks suspicious to me). When all the plugins are loaded (as they most likely are right now on my server) there are at least 100KiB of overhead, just for the sheer fact that these are plugins and thus have their own address space. It might not sound like a lot, since Apache requests so much memory already, especially with Passenger running, but it definitely doesn’t sound like a good thing for embedded systems.
Now, I have no doubt that a lot of people like the fact that Apache ships all of those as plugins, as they can then use the same Apache build across different configurations without risking having more code and data in memory than is actually needed; but is that right? While it’s obvious that dropping the plugin interface from Apache would be impossible (since it’s used by third-party developers; more on that later), I would be glad if it were possible to build in the modules that come with Apache itself (given I can already choose which ones to build in Gentoo). Of course, I too use Apache with two configurations, and for instance the other one does not use the authentication system for anything, while this one is not using CGI; but is the overhead caused by the rest of the modules worth the hassle, given that Apache already has a way not to initialise unused built-ins?
I wrote “third-party developers” above, but I have to say that it wasn’t really a proper definition: it’s not just third parties; it might very well be the original developers who want to use plugins to develop some (complex) features as separate projects, with different release handling altogether. For uses like that, the cost of plugins is often justifiable, and I am definitely not against having a plugin interface in feng. My main beef is with plugins created for functions that are part of the basic feature set of a software.
Another, unfortunately not uncommon, problem with plugins is that the interface might be skewed by bad design, as was (and is) the case for xine: when trying to open a file, it has to pass through all the plugins, so it loads all of them into memory, together with the libraries they depend on, to ask each of them to test the current file. Since plugins cannot really be properly unloaded (and it’s not just a xine limitation), the memory will still be used, the libraries will still be mapped into memory (and relocated, causing copy-on-write, and thus more memory), and at least half the point of using plugins has gone away (the ability to load only the code that is actually going to be used). Of course you’re left with the chance that an ABI break kills just the plugin rather than the whole program, but that’s a very small advantage, given the cost involved in handling plugins. And the way xine was designed, it was definitely impossible to have third-party plugins developed properly.
And to finish off: I said before that plugins cannot be cleanly unloaded. The problem is not only that it’s difficult to have proper cleanup functions for the plugins themselves (since the allocated resources are often stored within state variables), but also that some libraries (used as dependencies) have no cleanup at all, and rely (erroneously) on the fact that they won’t be unloaded. And even when they know they could be unloaded, the PulseAudio libraries, for instance, have to remain loaded because there is no proper way to clean up Thread-Local Storage variables (and a re-load would be quite a problem). Which takes away another point of using plugins.