Service Announcement: Pawsome Players Streaming Week

You may remember I have been irregularly streaming some of my FLOSS work on Twitch, which over the past few months has focused almost exclusively on unpaper. Now, unrelated to this, Cats Protection, the British non-profit whose primary goal is to rehome cats in need, launched their Pawsome Players initiative, aiming at raising funds through streaming (video games primarily, but not only).

With this in mind, I decided to join the fun, and will be streaming for at least an hour every day for the whole week, working on more FLOSS projects. You can find the full schedule (as well as donate to the campaign) on Tiltify, and if you want to get reminded of a particular night, you can join the events on the blog’s Facebook page.

In addition to wrapping up the Meson conversion of unpaper, I’m planning to do a little bit of work on quite a few other projects:

  • I have a new glucometer I want to reverse engineer, and with that comes an opportunity to show my way of working through this type of task; I’m not as entertaining and deep as Hector, but if you have never looked over the shoulder of a “black box” reverse engineer, I think it might be interesting. The code I’ll be working on is likely usbmon-tools rather than glucometerutils, but there’s a chance that I’ll get far enough ahead to actually implement the code.
  • Speaking of reverse engineering, I have a few adapters I designed (and got printed) for my Saleae Logic Pro 16. I have not released the schematics for those yet, but I now have the work approvals to do so. I should make a repository for them and release them, and I’ll do that on stream!
  • I want to make some design changes to my Birch Books, which I’ll discuss on stream. It’s going to be a bit more PCB “designing” (I use quotes here, because I’m far from a designer, and more of a “cobbler together”) which is likely going to be scary for those who do know what they are doing.

I’m also open to the idea of doing some live code reviews. I did lots of those when working at Google, and while there I had a lot of specific standards to appeal to, a number of suggestions are nearly universal. I have done this before, when I was pointed at some Python project and gave some general advice based on what I saw. I’d be willing to take a random project and see what I can notice, if the author is willing!

Also, bonus points for anyone who guesses where the name of the fundraising page comes from.

So I hope I’ll hear from you on the stream chat (over on Twitch), and that you’ll join me in supporting Cats Protection to help find every kitty a forever home (my wife and I would love to be one of those homes, but it’s not easy when renting in London), and reach the £1985 target.

And you write a streaming server?

One of the things I actually have to curse my current job for is having to deal with Adobe Flash, and in particular with the RTMP protocol. The RTMP implementation we’re using on our server is provided by the Red5 project, and they are the ones I’m going to write about now.

Last July I spent days and days looking up documentation about Red5 itself, as we couldn’t reach our resident expert, but at the time their whole website was unavailable and just timing out. Yesterday they told me that this was caused by some kind of DDoS, but even if that’s the case, something doesn’t feel right. Especially because, when I came back from VDD12 at the beginning of September, the website was actually reachable, but with the default configuration of a CentOS 5 system, which makes me think of a hardware failure more than a DDoS.

Right now the website is available, but the trac that should host the documentation is unreachable; a different website (Update (2016-07-29): that website is gone, sigh!) still has some documentation, but most of it hasn’t been updated for over two years. There is also a company behind the project, which lists their dogs, among others, on their team page. Much as I appreciate companies that have a funny side, this is not funny when the project looks almost entirely dead.

But why am I complaining here? Well, what I gathered from the #red5 channel is that they blame the situation on a DDoS against their website, and on the fact that every time they try to put the wiki back online it goes offline. Uhm, okay…

Now, there are simple ways to handle a DDoS fairly decently that don’t require spending two months changing your setup… and in general it seems very flimsy that this kind of DDoS would keep going after two months, and that you can’t get your documentation up. Besides, all your user and admin documentation (i.e. anything that is not developer-oriented) is only available on said wiki? Really?

So here I am, trying to figure out what to do with this hot potato of an install, with server software that is, simply put, completely unreliable (software is as reliable and trustworthy as the people who write it, that’s why you can often see what look like “ad hominem” against particular authors’ software — it’s not a fallacy because you have to trust the author if you run the software). I’m honestly not amused.

Wrong abstractions; bad abstractions

In the past weeks I’ve been working again on LScube and in particular working toward improving the networking layer of feng. Up until now, we relied heavily on our own library (netembryo) to do almost all of the networking work. While this design looked nice at the start, we reached the point where:

  • the original design had feng with multiple threads to handle “background tasks”, and a manually-tailored eventloop, which required us to reinvent the wheel so much that it was definitely not funny; to solve this we moved to libev, but that requires us to access the file descriptor directly to set the watcher;
  • we need to make sure that the UDP socket is ready for sending before we send RTP and RTCP data, and to do so, we have to go down to the file descriptor again;
  • while the three protocols currently supported by feng for RTP transports (UDP, TCP and SCTP) were all abstracted by netembryo, we had to branch out depending on the protocol in use way deep inside feng, as the three of them work very differently (there were similarities between UDP and SCTP, and between SCTP and interleaved TCP, but they were not similar enough in any way!); this resulted in double-branching, which the compiler, even with LTO, will have a hard time understanding (since it depends on an attribute of the socket that we already knew);
  • the size of objects in memory was quite a bit bigger than it should have been, because we had to keep around all the information needed for all the protocols, and at the same time we had to allocate on the stack a huge number of identical objects (like the SCTP stream information to provide the right channel to send the RTP and RTCP packets to);
  • we’ve been resolving the client’s IP addresses repeatedly, every time we wanted to connect the UDP sockets for RTP and RTCP (as well as resolving our own IP address);
  • we couldn’t get proper details about the errors with network operations, nor could we fine-tune those operations, because we were abstracting all of it away.
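
To make two of the points above a bit more concrete (resolving the peer only once, and checking the descriptor before sending), here is a minimal sketch. It is written in Python rather than feng’s C, and the function names `open_connected_udp` and `send_when_writable` are made up for illustration; none of this is feng’s actual code. It uses `select()` where feng would use a libev watcher.

```python
import select
import socket


def open_connected_udp(host, port):
    """Resolve (host, port) once and return a UDP socket connected to it."""
    last_error = None
    for family, socktype, proto, _canon, addr in socket.getaddrinfo(
            host, port, type=socket.SOCK_DGRAM):
        sock = socket.socket(family, socktype, proto)
        try:
            # connect() on UDP only records the peer address, so later
            # sends need neither an address nor another resolution.
            sock.connect(addr)
        except OSError as exc:
            sock.close()
            last_error = exc
            continue
        sock.setblocking(False)
        return sock
    raise last_error or OSError("no usable address")


def send_when_writable(sock, payload, timeout=1.0):
    """Send payload only once the descriptor reports it is writable."""
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        return False  # not ready within the timeout; caller decides what to do
    sock.send(payload)
    return True
```

The point of the sketch is that both operations work directly on the socket (and its file descriptor), which is exactly the part an abstraction layer tends to hide.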

While I tried fixing this by giving netembryo a better interface, I ended up finding that it was much, much simpler to deal with the networking code within feng without the abstraction; indeed, in no case did we need to go the full path down to what netembryo used to do beforehand; we always ended up short of that, skipping more than a couple of steps. For instance, the only place where we actually go through with the address resolution is the main socket binding code, where we open the port we’ll be listening on. And even there, we don’t need the “flexibility” (faux-flexibility actually) that netembryo gave us before: we don’t need to re-bind an already open socket; we also don’t want to accept one out of a series of possible addresses, we want all of them or none (this is what helps us support IPv6, by the way).
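
The “all of them or none” binding can be sketched roughly like this. Again, this is Python rather than feng’s C, `bind_all_or_none` is a hypothetical name, and setting `IPV6_V6ONLY` is my own assumption about how to keep the IPv4 and IPv6 wildcard sockets from clashing, not something taken from feng.

```python
import socket


def bind_all_or_none(host, port, backlog=16):
    """Listen on every address getaddrinfo() returns for (host, port).

    If any single bind fails, close whatever was opened so far and
    re-raise, so the caller never ends up with a partial set.
    """
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM,
                               flags=socket.AI_PASSIVE)
    listeners = []
    try:
        for family, socktype, proto, _canon, addr in infos:
            sock = socket.socket(family, socktype, proto)
            listeners.append(sock)
            if family == socket.AF_INET6:
                # Assumption: keep the IPv6 socket from also claiming
                # the IPv4 side of the same port.
                sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
            sock.bind(addr)
            sock.listen(backlog)
    except OSError:
        for sock in listeners:
            sock.close()
        raise
    return listeners
```

Passing `None` as the host together with `AI_PASSIVE` asks for the wildcard addresses, which on a dual-stack system usually means one IPv4 and one IPv6 socket.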

The end result is not only a much, much smaller memory footprint (the Sock structure we used before was at least 440 bytes, while we can stay well below 100 bytes per socket right now), but also fewer dependencies (we integrate all the code within the feng project itself), less source code (which is always good, because it means fewer bugs), tighter handling of SCTP and interleaved transmission, and more optimised code after compilation. Talk about win-win situations.
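
As a toy illustration of why a catch-all object grows, the sketch below uses Python’s `ctypes` to compare a structure that carries every protocol’s fields side by side with one that only reserves room for the protocol in use. All the type and field names here are invented; they are not feng’s or netembryo’s structures, and the sizes have nothing to do with the 440 bytes quoted above.

```python
import ctypes


class UdpState(ctypes.Structure):
    _fields_ = [("rtp_fd", ctypes.c_int), ("rtcp_fd", ctypes.c_int)]


class TcpState(ctypes.Structure):
    _fields_ = [("rtp_channel", ctypes.c_uint8),
                ("rtcp_channel", ctypes.c_uint8)]


class SctpState(ctypes.Structure):
    _fields_ = [("rtp_stream", ctypes.c_uint16),
                ("rtcp_stream", ctypes.c_uint16)]


class CatchAllTransport(ctypes.Structure):
    """Every protocol's fields are always present, used or not."""
    _fields_ = [("protocol", ctypes.c_int),
                ("udp", UdpState), ("tcp", TcpState), ("sctp", SctpState)]


class PerProtocolState(ctypes.Union):
    """Only as large as the biggest single protocol's state."""
    _fields_ = [("udp", UdpState), ("tcp", TcpState), ("sctp", SctpState)]


class TaggedTransport(ctypes.Structure):
    """A protocol tag plus the state for that one protocol."""
    _fields_ = [("protocol", ctypes.c_int), ("state", PerProtocolState)]


def footprint():
    """Return (catch-all size, tagged size) in bytes on this platform."""
    return ctypes.sizeof(CatchAllTransport), ctypes.sizeof(TaggedTransport)
```

Calling `footprint()` shows the catch-all layout is the larger of the two; the gap widens with every field a real abstraction has to carry for protocols that a given socket never uses.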

Unfortunately not everything is positive from what we saw up to now; the fact that we have no independent implementation (yet) of SCTP makes it quite difficult to make sure that our code is not bugged in some funny way; and even during this (almost entire) rewrite of network code, I was able to find a number of bugs, and a few strange situations that I’ll have to deal with right now, spending more than a couple of hours to make sure that it’s not going to break further on. It also shows we need to integrate valgrind within our current white-box testing approach to make sure that our functions don’t leak memory (I found a few by running valgrind manually).

To be honest, most of the things I’ve been doing now are nothing especially difficult for people used to working with Unix networking APIs, but I most definitely am not an expert in those. I would actually be quite interested in the book, if it weren’t that I cannot easily get it for the Reader. So if somebody feels like looking at the code and telling me what I can improve further, I’d be very happy.

One thing we most likely will have to pick up, though, is the SCTP userland code which right now has at least a few bugs regarding the build system, a couple of which we’re working around in the ebuild. So I won’t have time to be bored for a while still…