In the past weeks I’ve been working again on LScube and in particular working toward improving the networking layer of feng. Up until now, we relied heavily on our own library (netembryo) to do almost all of the networking work. While this design looked nice at the start, we reached the point where:
- the original design had feng with multiple threads to handle “background tasks”, and a manually-tailored eventloop, which required us to reinvent the wheel so much that it was definitely not funny; to solve this we’ve gone to use libev, but that requires us to access the file descriptor directly to set the watcher;
- we need to make sure that the UDP socket is ready to send data to before sending RTP and RTCP data, and to do so, we have to go down again on the file descriptor;
- while the three protocols currently supported by feng for RTP transports (UDP, TCP and SCTP) were all abstracted by netembryo, we had to branch out depending on the used protocol way deep inside feng, as the working of the three of them was very different (we had similitudes between UDP and SCTP, and between SCTP and TCP interleaved, but they were not similar enough in any way!); this resulted in double-branching, which the compiler – even with LTO – will have a hard time understanding (since it depends on an attribute of the socket that we knew already;
- the size of objects in memory was quite bigger than it should have been, because we had to keep them around all the needed information for all the protocols, and at the same time we had to allocate on the stack a huge number of identical objects (like the SCTP stream information to provide the right channel to send the RTP and RTCP packets to);
- we’ve been resolving the client’s IP addresses repeatedly every time we wanted to connect the UDP sockets for RTP and RTCP (as well as resolving our own UP address0;
- we couldn’t get proper details about the errors with network operations, nor we could fine-tune those operations, because we were abstracting all of it away.
While I tried fixing this by giving netembryo a better interface; I ended up finding that it was much, much simpler to deal with the networking code within feng without the abstraction; indeed, in no case we needed to go the full path down to what netembryo used to be beforehand; we always ended up short from that, skipping more than a couple of steps. For instance, the only place where we actually go through with the address resolution is the main socket binding code, where we open the port we’ll be listening on. And even there, we don’t need the “flexibility” (faux-flexibility actually) that netembryo gave us before: we don’t need to re-bind an already open socket; we also don’t want to accept one out of a series of possible addresses, we want all of them or none (this is what helps us supporting IPv6 by the way).
The end result is not only a much, much smaller memory footprint (the Sock
structure we used before was at least 440 bytes, while we can stay well behind the 100 bytes per socket right now), but also less dependencies (we integrate all the code within the feng project itself), less source code (which is always good, because it means less bug), tighter handling of SCTP and interleaved transmission, and more optimised code after compilation. Talk about win-win situations.
Unfortunately not everything is positive from what we saw up to now; the fact that we have no independent implementation (yet) of SCTP makes it quite difficult to make sure that our code is not bugged in some funny way; and even during this (almost entire) rewrite of network code, I was able to find a number of bugs, and a few strange situations that I’ll have to deal with right now, spending more than a couple of hours to make sure that it’s not going to break further on. It also shows we need to integrate valgrind within our current white-box testing approach to make sure that our functions don’t leak memory (I found a few by running valgrind manually).
To be honest, most of the things I’ve been doing now are nothing especially difficult for people used to work with Unix networking APIs, but I most definitely am not an expert of those. I’m actually quite interested in the book if it wasn’t that I cannot get it for the Reader easily. So if somebody feels like looking at the code and tell me what I can improve further, I’d be very happy.
One thing we most likely will have to pick up, though, is the SCTP userland code which right now has at least a few bugs regarding the build system, a couple of which we’re working around in the ebuild. So I won’t have time to be bored for a while still…