During the recent Gentoo mudslinging about libav and FFmpeg, one of the points of contention was the fact that FFmpeg boasts more “security fixes” than libav over time. Any security-conscious developer knows that assessing the general reliability of a piece of software requires much more than counting CVEs, as they only get assigned when bugs are reported as security issues by somebody.
I ended up learning this first-hand. In August 2005 I was just fixing a few warnings in xine-ui, with nothing in mind but cleaning up the build log. That patch ended up in Gentoo, but no new release was made for xine-ui itself. Come April 2006, a security researcher flagged the same warnings as a security issue: we were already covered, for the most part, but other distros weren’t. The bugs had been fixed upstream but never released, simply because nobody had considered them security issues up to that point. My lesson was that issues that might lead to security problems are always better looked at by a security expert, which is why I originally started working with oCERT to verify issues within xine.
So which kinds of issues are considered security issues? In this case the problem was a format string, an obvious one, as under the right conditions it can theoretically allow writing to arbitrary memory. The same is true of buffer overflows. But what about unbounded reads, which in my experience account for the vast majority of crashes out there? I would say they raise two very different problems, both of which can be categorized as security issues: information disclosure (if the attacker can decide where to read and can get useful information out of said read, such as the current base address of the process’s executable or libraries, which can be used later), and good old crashes, which for security purposes are called DoS: Denial of Service.
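As a rough illustration of the information-disclosure case, here is a minimal Python sketch that simulates process memory as a single byte string. The record layout and the names MEMORY, read_payload_trusting and read_payload_checked are all made up for this example; real unbounded reads happen in C code, where nothing clamps the access the way Python slicing does.

```python
# Simulated process memory (hypothetical layout): one record made of a
# length byte plus payload, immediately followed by unrelated sensitive data.
MEMORY = bytes([4]) + b"data" + b"SECRET-BASE-ADDR"


def read_payload_trusting(memory: bytes, offset: int, claimed_len: int) -> bytes:
    """Return claimed_len bytes of payload without checking the stored length."""
    start = offset + 1  # skip the length byte
    return memory[start:start + claimed_len]


def read_payload_checked(memory: bytes, offset: int, claimed_len: int) -> bytes:
    """Same read, but refuse lengths that are negative or exceed the record."""
    stored_len = memory[offset]
    if claimed_len < 0 or claimed_len > stored_len:
        raise ValueError("requested length is outside the record")
    start = offset + 1
    return memory[start:start + claimed_len]


# The attacker controls claimed_len; asking for 12 bytes walks past the
# 4-byte payload and into the neighbouring data.
leaked = read_payload_trusting(MEMORY, 0, 12)
```

Nothing crashes in the trusting version, which is exactly why this class of bug is easy to overlook: the caller simply receives bytes it was never meant to see.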
Not all DoS are crashes, and not all crashes are DoS, though! In particular, you can DoS an app without crashing it, by deadlocking it or otherwise exhausting a scarce resource. This is the preferred method for DoS on servers; indeed, it is how the Slowloris attack on Apache worked: it occupied all the connection handlers, so the server could no longer answer legitimate clients. A crash would be much easier to identify and recover from, which is why DoS on servers are rarely full-blown crashes. Conversely, crashes cannot realistically be called DoS when they are user-initiated, with no third party intervening. It might sound silly, and remind you of an old joke (“Doctor, doctor, if I do this it hurts!” “Stop doing that, then!”), but it’s the case: if going to the app’s preferences and clicking something causes the app to crash, then there’s a bug which is a crash but is not a DoS.
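The handler-exhaustion idea behind a Slowloris-style attack can be sketched as a toy simulation. This is not how Apache is implemented; HandlerPool, its methods and the pool size of 4 are assumptions invented for the example, which only shows that a server can stop serving legitimate clients without ever crashing.

```python
class HandlerPool:
    """Toy server with a fixed number of connection handlers (hypothetical)."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.busy = set()

    def accept(self, client: str) -> bool:
        """Give the client a handler if one is free; otherwise refuse it."""
        if len(self.busy) >= self.size:
            return False
        self.busy.add(client)
        return True

    def finish(self, client: str) -> None:
        """Release the handler once the client's request completes."""
        self.busy.discard(client)


pool = HandlerPool(size=4)

# The attacker opens as many connections as there are handlers and never
# completes any request, so finish() is never called for them.
for i in range(pool.size):
    pool.accept(f"attacker-{i}")

# The server process is still alive, yet the legitimate client is refused.
legit_served = pool.accept("legitimate-client")
```

Since the process never dies, there is no crash report to point at the problem; the only symptom is that legitimate requests stop being answered.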
This brings us to one of the biggest problems with calling something a DoS: it might be a DoS in one use case, and not in another. Let’s use libav as an example. It’s easy to argue that any crash in the libraries while decoding a stream is a DoS, as it’s common to download a file and try to play it; said file is the element in the equation that comes from a possible attacker, and anything that can happen while decoding it is a security risk. Is it possible to argue that a crash in an encoding path is a DoS? From a client’s perspective, not usually: it’s still very possible for an attacker to trick you into downloading a file and re-encoding it, but it’s a less common situation, and in my experience most encoding-related crashes are triggered only with a given subset of parameters, which makes them more difficult for an attacker to exploit than a decoder-side DoS. If the crash only happens when using avconv, it’s also hard to declare it a DoS, considering that at most it should crash the encoding process, and that’s about it.
Let’s now turn the tables: instead of being the average user downloading movies from The Pirate Bay, you’re a video streaming service such as YouTube, Vimeo or the like, but without the right expertise, which means that a DoS on your application is actually a big deal. In this situation, assuming your users control the streams that get encoded, you’re dealing with an untrusted input source, which means that crashes in both the decoder and the encoder are real-world DoS attacks against you. As you can see, what earlier required explicit user interaction and was hard to consider a full-blown DoS now becomes much more important.
This kind of issue is why languages like Ada were created, and why many people out there insist that higher-level languages like Java, Python and Ruby are more secure than C: exceptions for error handling make it easier to have fail-safe conditions, which should solve the problem of DoS. The fact that there are just as many security issues in software written in high-level languages as in low-level ones shows how false that idea is nowadays. While exceptions do save you from some kinds of crashes, they also create issues through the sheer increase in the area of exposure: the more layers there are, the more code is involved in execution, and that can actually increase the chance of somebody finding an issue in it.
Area of exposure is also important for software like libav: if you enable every possible format under the sun for input and output, you’re enabling a whole lot of code, and you can suffer from a whole lot of vulnerabilities. If you’re targeting a much narrower use case, for instance a device that only has to output H.264 video and Speex audio, you can easily turn everything else off and reduce your exposure many times over. You can probably see now why, even when using libav or FFmpeg as a backend, Chrome does not support all the input formats that they support: it would be too difficult to validate all the code out there, while it’s feasible to validate a subset of it.
This should have established the terms of what to consider a DoS and when. So how do you handle it? The first problem is identifying the crashes: you can either wait for an attack to happen and react to it, or proactively try to identify crash situations, and obviously the latter is what you should do most of the time. Unfortunately, this requires many different techniques, and none yields a 100% positive result; even the combined results are rarely sufficient to argue that a piece of software is 100% safe from crashes and other possible security issues.
One obvious step is making sure the code does not allow things that should not happen, like incredibly high values or negative ones. This requires manual work and analysis of the code, which is usually handled through code reviews, at least as far as libav is concerned (on the topic, there is a nice article by Mozilla’s David Humphrey). But this by itself is not enough, as many times it’s values that are allowed by the specs, but not handled properly, that cause the crashes. How do you deal with them? One suggestion is fuzzing, a technique in which a program is executed with, as input, a file obtained by corrupting a valid one. A few years ago, a round of FFmpeg/VLC bugs was filed after Sam Hocevar released, and started using, his zzuf tool (which should be in Portage, if you want to look at it).
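To make both ideas concrete, here is a minimal Python sketch: a toy parser that validates a length field before using it, and a bit-flipping mutator that corrupts a valid sample and feeds it back to the parser. This is not zzuf itself and not libav code; the chunk format and the names parse_chunk, flip_bits and fuzz are assumptions made for the example. The seed parameter is there so that any failure can be reproduced later.

```python
import random
import struct


def parse_chunk(data: bytes) -> bytes:
    """Toy parser for a made-up format: 4-byte big-endian length, then payload."""
    if len(data) < 4:
        raise ValueError("truncated header")
    (length,) = struct.unpack(">I", data[:4])
    # Reject values the format allows in principle but the input cannot back up.
    if length > len(data) - 4:
        raise ValueError("length field exceeds available data")
    return data[4:4 + length]


def flip_bits(sample: bytes, ratio: float, seed: int) -> bytes:
    """Return a copy of sample where each bit is flipped with probability ratio."""
    rng = random.Random(seed)  # seeded, so the same corruption can be replayed
    out = bytearray(sample)
    for i in range(len(out)):
        for bit in range(8):
            if rng.random() < ratio:
                out[i] ^= 1 << bit
    return bytes(out)


def fuzz(sample: bytes, runs: int, ratio: float = 0.01) -> list:
    """Parse mutated copies of sample; collect (seed, input, error) on surprises."""
    failures = []
    for seed in range(runs):
        mutated = flip_bits(sample, ratio, seed)
        try:
            parse_chunk(mutated)
        except ValueError:
            pass  # a clean rejection of bad input is the desired outcome
        except Exception as exc:  # anything else is a bug worth investigating
            failures.append((seed, mutated, exc))
    return failures


valid = struct.pack(">I", 5) + b"hello"
failures = fuzz(valid, runs=200)
```

With the length check in place the parser only ever rejects corrupted input cleanly, so no failures are collected; removing the check from a C equivalent is the kind of mistake a run like this is meant to surface.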
Unfortunately, fuzzing, just like relying on particular exemplars of attacks in the wild, has one big drawback, one that we could call “zenish”: you can easily forget that you’re looking at a piece of code that is crashing on invalid input, and just go and resolve that one small issue. Do you remember the calibre security shenanigan? It’s the same thing: if you only fix the one bit that is crashing on you without looking at the whole situation, an attacker, or a security researcher, can just look around and spot the next piece that is going to break on you. This is the one issue that Luca, the others in the libav project and I get vocal about when we’re told that we don’t pay attention to security only because it takes us a little longer to come up with a (proper) fix. Well, this, and the fact that we have had no way to verify for ourselves most of the CVEs marked as resolved by FFmpeg, because we weren’t given access to the samples for reproducing the crashes; this changed after the last VDD, at least for those coming from Google. If I’m not mistaken, at least one of them ended up with a different, complete fix rather than the partial band-aid put in by our peers at FFmpeg.
Test suites for valid configurations and valid files are not useful for identifying these problems, as those are valid files and should not cause a DoS anyway. On the other hand, a completely shot-in-the-dark fuzzing technique like zzuf’s may or may not help, depending on how much time you can pour into looking at the failures. Some years ago, I read an interesting book, Fuzzing: Brute Force Vulnerability Discovery by Sutton, Greene and Amini. It was a very interesting read, although last I checked, the software they pointed to was mostly dead in the water. I should probably get back to it and see whether there are new forks of that software that we can use to help get there.
It’s also important to note that it’s not just a matter of causing a crash: you need to save the sample that caused the issue, and you need to make sure that it’s actually crashing. Even an “all okay” result might not actually be a pass, as in some cases a corrupted file could cause a buffer overflow that, in a standard setup, lets the software keep running. Hardened, and other tools, make it nicer to deal with that kind of issue at least…
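A minimal harness for the save-and-verify part might look like the following Python sketch. It assumes a POSIX system, where a negative return code from subprocess means the child was killed by a signal; the function name run_on_sample, the verdict strings and the crash directory are choices made for this example, and the command to test is whatever decoder invocation you care about.

```python
import shutil
import subprocess
from pathlib import Path


def run_on_sample(command: list, sample: Path, crash_dir: Path,
                  timeout: float = 10.0) -> str:
    """Run command on sample; return 'ok', 'error', 'crash' or 'hang'.

    Samples that crash or hang the target are copied into crash_dir so the
    failure can be reproduced and verified later.
    """
    try:
        result = subprocess.run(
            command + [str(sample)],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        verdict = "hang"  # possible deadlock or endless loop
    else:
        if result.returncode < 0:
            verdict = "crash"  # killed by a signal such as SIGSEGV or SIGABRT
        elif result.returncode != 0:
            verdict = "error"  # the target rejected the input and exited cleanly
        else:
            verdict = "ok"  # no visible failure, which is not proof of safety

    if verdict in ("crash", "hang"):
        crash_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(sample, crash_dir / sample.name)
    return verdict
```

An "ok" verdict here only means the process exited normally; silent memory corruption would go unnoticed unless the target is built or run with tooling that turns it into a hard failure.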