LibreSSL, OpenSSL, collisions and problems

Some time ago, on the gentoo-dev mailing list, there was an interesting thread on the state of LibreSSL in Gentoo. In particular, I repeated some of my previous concerns about ABI and API compatibility, especially when trying to keep both libraries on the same system.

While I hope that the problems I pointed out are clear to the LibreSSL developers, I thought reiterating them clearly in a blog post would give them a wider reach, in the hope that they can be addressed. Please feel free to reshare this in response to people hand-waving the idea that LibreSSL can be either a drop-in or a stand-aside replacement for OpenSSL.

Last year, when I first blogged about LibreSSL, I had to write a further clarification as my post was used to imply that you could just replace the OpenSSL binaries with LibreSSL and be done with it. This is not the case and I won’t even go back there. What I’m concerned about this time is whether you can install the two in the same system, and somehow decide which one you want to use on a per-package basis.

Let’s start with the first question: why would you want to do that? Everybody at this point knows that LibreSSL was forked from the OpenSSL code and started removing code that it deemed unnecessary or even dangerous – a very positive thing, given the amount of compatibility kludges around OpenSSL! – and as such it implements a subset of its parent’s interface, so there would seem to be no reason to want both libraries on the same system.

But then again, LibreSSL was never meant to be considered a drop-in replacement, so its developers haven’t cared much for the evolution of OpenSSL, and just proceeded in their own direction; said direction included building a new library, libtls, that implements higher-level abstractions of the TLS protocol. This vaguely matches the way NSS (the Netscape-now-Mozilla TLS library) is designed, and I think it makes sense: it reduces the amount of repetition that needs to be coded in multiple parts of the software stack to implement HTTPS, for instance, reducing the chance of one of them making a stupid mistake.

Unfortunately, this library was originally tied firmly to LibreSSL, and there was no way to use it with OpenSSL — I think this has changed recently, as a “portable” build of libtls should now be available. Ironically, none of this would have been a problem at all if LibreSSL were a superset of OpenSSL, and this is where the core of the issue lies.

This is far from the first time a problem like this has happened in Open Source software communities: different people will want to implement the same concept in different ways. I like to describe this as software biodiversity and I find it generally a good thing. Having more people looking at the same concept from different angles can improve things substantially, especially in regard to finding safe implementations of network protocols.

But there is a problem when you apply parallel evolution to software: if you fork a project and then evolve it on your own agenda, but keep the same library names and a mostly compatible (thus conflicting) API/ABI, you’re going to make people suffer, whether they are developers, consumers, packagers or users.

LibreSSL, libav, Ghostscript, … there are plenty of examples. Since the features of the projects, their API and most definitely their ABIs are not the same, when you’re building a project on top of any of these (or their originators), you’ll end up at some point making a conscious decision on which one you want to rely on. Sometimes you can do that based only on your technical needs, but in most cases you end up with a compromise based on technical needs, licensing concerns and availability in the ecosystem.

These projects didn’t change the names of their libraries, so that they can be used as drop-rebuild replacements for consumers that stick to the greatest common divisor of the interface, but that also means you can’t easily install two of them on the same system. And since most distributions, with the exception of Gentoo, don’t really provide users with a choice of multiple implementations, you end up with either a fractured ecosystem, or one that is very much non-diverse.

So if all distributions decide to standardize on one implementation, that’s what developers will write for. And this is why OpenSSL is likely to stay the standard for a long while still. Of course in this case it’s not as bad as the situation with libav/ffmpeg, as the base featureset is going to be more or less the same, and the APIs that have been dropped so far, such as the entropy-gathering daemon interface, have been considered A Bad Idea™ for a while, so there are not going to be many OpenSSL-only projects in the future.

What becomes an issue here is that software is built against OpenSSL right now, and you can’t really change this easily. I’ve been told before that this is not true, because OpenBSD switched, but there is a huge difference between all of the BSDs and your usual Linux distributions: the former have much more control on what they have to support.

In particular, the whole base system is released in a single scoop, and it generally includes all the binary packages you can possibly install. Very few third party software providers release binary packages for OpenBSD, and not many more do for NetBSD or FreeBSD. So as long as you either use the binaries provided by those projects or those built by you on the same system, switching the provider is fairly easy.

When you have to support third-party binaries, then you have a big problem, because a given binary may be built against one provider, but depend on a library that depends on the other. So unless you have full control of your system, with no binary packages at all, you’re going to have to provide the most likely provider — which right now is OpenSSL, for good or bad.

Gentoo Linux is, once again, in a more favourable position than many others. As long as you have a full source stack, you can easily choose your provider without considering its popularity. I have built similar stacks before, and my servers deploy stacks similarly, although I have not tried using LibreSSL for any of them yet. But on the desktop it might be trickier, especially if you want to do things like playing Steam games.

But here’s the harsh reality: even if you were to install the libraries in different directories, and provide a USE flag to choose between the two, it would not be easy to apply the right constraints between final executables and libraries all the way through the tree.

I’m not sure I have an answer that balances the ability to just make old software use the new library against side-by-side installation. I’m scared that the “solution” people will find for this problem is bundling, and you can probably figure out that doing so for software like OpenSSL or LibreSSL is a terrible idea, given how fast you need to update in response to a security vulnerability.

Project health, and why it’s important — part of the #shellshock afterwords

Tech media has been all the rage this year, trying to hype everything out there as the end of the Internet, or the final nail in the coffin of open source. A bunch of opinion pieces I found also tried to imply that open source software is to blame, forgetting that the only reason the security issues found were considered so nasty is that we know the affected software is widely used.

First there was Heartbleed, whose discoverers decided to spend time setting up a cool name, logo and website for it, rather than ensuring it would be patched before it became widely known. Months later, LastPass still tells me that some of the websites I have passwords on have not changed their certificates. This spawned some interest around OpenSSL at least, including the OpenBSD fork, which I’m still not sure is going to stick around or not.

Just a few weeks ago a dump of passwords caused a major stir, as some online news sources kept insisting that Google had been hacked. Similarly, people have been insisting for the longest time that it was only Apple’s fault that the photos of a bunch of celebrities were stolen and published on a bunch of sites — and they will probably never be expunged from the Internet’s collective conscience.

And then there is the whole hysteria about shellshock which I already dug into. What I promised on that post is looking at the problem from the angle of the project health.

With the term project health I’m referring to a whole set of issues around an open source software project. It’s something that becomes second nature for a distribution packager/developer, but is not obvious to many, especially because it is not easy to quantify. It’s not a function of the number of commits or committers, the number of mailing lists or the traffic in them. It’s an aura.

That OpenSSL’s project health was terrible was no mystery to anybody. The code base in particular was terribly complicated and catered to corner cases that stopped being relevant years ago, and the LibreSSL developers have found plenty of reasons to be worried. But the fact that the codebase was in such a state, and that the developers didn’t care to follow what the distributors do, or to review patches properly, was not a surprise. You just need to be reminded of the Debian SSL debacle, which dates back to 2008.

In the case of bash, the situation is a bit more complex. The shell is a base component of all GNU systems, and it is the FSF’s choice of UNIX shell. The fact that its man page clearly states It’s too big and too slow. should tip people off, but it doesn’t. And it’s not just a matter of extending the POSIX shell syntax with enough sugar that people take it for a programming language and start using it as one — but that’s also a big part of what caused this particular issue.

The health of bash was not considered good by anybody involved with it on a distribution level. It certainly was not considered good by me, as I moved to zsh years and years ago, and I have been working for over five years on getting rid of bashisms in scripts. Indeed, I have been pushing, with Roy and others, for the init scripts in Gentoo to be made completely POSIX shell compatible so that they can run with dash or with busybox — even before I was paid to do so for one of the devices I worked on.

Nowadays, the point is probably moot for many people. I think this is the most obvious positive PR for systemd I can think of: no thinking of shells any more, for the most part. Of course it’s not strictly true, but it does solve most of the problems with bashisms in init scripts. And it should solve the problem of using bash as a programming language, except it doesn’t always, but that’s a topic for a different post.

But why were distributors, and Gentoo developers, so wary about bash, way before this happened? The answer is complicated. While bash is a GNU project, and the GNU project is the poster child for Free Software, its management has always been sketchy. There is a single developer – The Maintainer, as the GNU website calls him: Chet Ramey – and the sole points of contact for him are the mailing lists. The code is released in dumps: a release tarball for each minor version, then, every time a new micro version is to be released, a new patch is posted and distributed. If you’re a Gentoo user, you can notice this when emerging bash: you’ll see all the patches being applied one on top of the other.

There is no public SCM — yes, there is a git “repository”, but it’s essentially just an import of a given release tarball, with each released patch applied on top of it as a commit. Since these patches represent a whole point release, and they may be fixing different bugs, related or not, it’s definitely not as useful as having a repository where the intent of each change shows clearly, so that you can figure out what is being done. Reviewing a proper commit-per-change repository is orders of magnitude easier than reviewing a diff between code dumps.

This is not completely unknown in the GNU sphere: glibc has had a terrible track record as well, and only recently, thanks to lots of combined effort, is sanity being restored. This also includes fixing a bunch of security vulnerabilities found, or driven into the ground, by my friend Tavis.

But this behaviour is essentially why people like me and other distribution developers have been unhappy with bash for years and years: not the particular vulnerability, but the health of the project itself. I have been using zsh for years, even though I had not installed it on all my servers until now (it’s done now), and I have been pushing for Gentoo to move to /bin/sh being provided by dash for a while, as Debian already did, and the result is that the vulnerability is way less scary for them.

So yeah, I don’t think it’s happenstance that these issues are being found in projects that are not healthy. And it’s not because they are open source, but rather because they are “open source” in a way that does not help. Yes, bash is open source, but it’s not developed in the open like many other projects; it’s developed behind closed doors, with one single leader.

So remember this: be open in your open source project, it makes for better health. And try to get more people than you involved, and review publicly the patches that you’re sent!

LibreSSL: drop-in and ABI leakage

There has been some confusion, on my previous post, with Bob Beck of LibreSSL on whether I would advocate for using a LibreSSL shared object as a drop-in replacement for an OpenSSL shared object. Let me state this here, boldly: you should never, ever, for any reason, use shared objects from different major/minor OpenSSL versions or implementations (such as LibreSSL) as drop-in replacements for one another.

The reason is, obviously, that the ABIs of these libraries differ, sometimes subtly enough that they may actually load and run, but then perform abysmally insecure operations, as their data structures will have changed, and now instead of reading your randomly-generated key, you may be reading the master private key. And in general, for other libraries you may even be calling the wrong set of functions, especially for those written in C++, where the vtable contents may be rearranged across versions.
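To make the failure mode concrete, here is a minimal, self-contained C sketch; the structures and field names are hypothetical, not actual OpenSSL types, but the mechanism is the same: a consumer compiled against one layout of a context structure is handed memory laid out by another.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical "version 1" layout of a library's context structure,
 * as seen by a consumer compiled against the old headers. */
struct ctx_v1 {
    int  flags;
    char key[8];    /* the consumer expects the key right after flags */
};

/* "Version 2" of the same structure: a field was inserted, shifting
 * every later member. The struct name and the library soname stayed
 * the same -- exactly the drop-in scenario. */
struct ctx_v2 {
    int  flags;
    int  options;   /* new in v2 */
    char key[8];
};

/* Does the old consumer's view of the key still match the key the
 * new library actually stored? */
static int v1_key_matches(void) {
    struct ctx_v2 real;
    memset(&real, 0, sizeof real);
    real.flags = 1;
    real.options = 0x7f7f7f7f;
    memcpy(real.key, "secret!", 8);

    /* The old consumer reinterprets the same memory as a ctx_v1:
     * it loads and "works", but view->key now starts at the bytes
     * of the v2 'options' field, not at the key. */
    struct ctx_v1 *view = (struct ctx_v1 *)(void *)&real;
    return memcmp(view->key, "secret!", 8) == 0;
}
```

Here the misread is harmless garbage; in a real TLS library the bytes sitting at the wrong offset can just as easily be key material, which is exactly the scenario described above.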

What I was discussing in the previous post was the fact that lots of proprietary software packages, by bundling a version of Curl that depends on the RAND_egd() function, will require either unbundling it, or keeping a copy of OpenSSL around to use for runtime linking. And I think that is a problem that people need to consider now rather than later, for a very simple reason.

Even if LibreSSL (or any other reimplementation, for what matters) takes hold as the default implementation for all Linux (and non-Linux) distributions, you’ll never be able to fully forget OpenSSL: not only because of proprietary software that you maintain, but also because a huge amount of software (and especially hardware) out there will not be able to update easily. And the fact that LibreSSL is throwing away so much of the OpenSSL clutter also means that it’ll be more difficult to backport fixes — while at the same time I think a good chunk of the black hattery will focus on OpenSSL, especially if it feels “abandoned”, while most users will still be using it somehow.

But putting aside the problem of the direct drop-in incompatibilities, there is one more problem that people need to understand, especially Gentoo users, and most other systems that do not completely rebuild their package set when replacing a library like this. The problem is what I would call “ABI leakage”.

Let’s say you have a general libfoo that uses libssl; it uses a subset of the API that works with both OpenSSL and LibreSSL. Now you have a bar program that uses libfoo. If the library is written properly, it’ll treat all the data structures coming from libssl as opaque, providing no way for bar to call into libssl without depending on the SSL API du jour (and thus putting a direct dependency on libssl in the executable). But it’s very well possible that libfoo is not well written and actually treats the libssl API as transparent. For instance, a common mistake is to use one of the SSL data structures inline (rather than as a pointer) in one of its own public structures.

This situation would be barely fine, as long as the data types for libfoo are also completely opaque, as then it’s only the code for libfoo that relies on the structures, and since you’re rebuilding it anyway (as libssl is not ABI-compatible), you solve your problem. But if we keep assuming a worst-case scenario, then you have bar actually dealing with the data structures, for instance by allocating a sized buffer itself, rather than calling into a proper allocation function from libfoo. And there you have a problem.

Because now the ABI of libfoo is not directly defined by its own code, but also by whichever ABI libssl has! It’s a similar problem as the symbol table used as an ABI proxy: while your software will load and run (for a while), you’re really using a different ABI, as libfoo almost certainly does not change its soname when it’s rebuilt against a newer version of libssl. And that can easily cause crashes and worse (see the note above about dropping in LibreSSL as a replacement for OpenSSL).
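A minimal sketch of this leakage, with hypothetical structures standing in for libssl’s and libfoo’s (none of these are real OpenSSL types): embedding the libssl structure inline makes libfoo’s own layout move whenever libssl’s does, while holding an opaque pointer keeps it stable.

```c
#include <stddef.h>

/* Hypothetical libssl context, before and after a rebuild: the new
 * version of the library grew its structure by one field. */
struct ssl_ctx_old { int ref; char state[16]; };
struct ssl_ctx_new { int ref; char state[16]; void *ex_data; };

/* libfoo's *public* structure embeds the libssl one inline rather
 * than holding an opaque pointer -- the mistake described above.
 * Its size, and the offset of anything placed after 'ssl', now
 * change whenever libssl's layout does, even though libfoo's
 * soname stays the same. */
struct foo_session_old { int id; struct ssl_ctx_old ssl; };
struct foo_session_new { int id; struct ssl_ctx_new ssl; };

/* The safe alternative: only a pointer leaks into libfoo's ABI,
 * so its layout is identical no matter which libssl is underneath. */
struct foo_opaque_old { int id; struct ssl_ctx_old *ssl; };
struct foo_opaque_new { int id; struct ssl_ctx_new *ssl; };
```

If bar allocated `sizeof(struct foo_session_old)` bytes itself, then after libfoo is rebuilt against the new libssl it writes the larger structure into that same undersized buffer, while the opaque variant would not have moved at all.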

Now honestly none of this is specific to LibreSSL. The same is true if you were to try using OpenSSL 1.0 shared objects with software built against OpenSSL 0.9 — which is why I cringed any time I heard people suggesting the symlink trick back then, and it seems like people are giving the same suicidal suggestion now with LibreSSL, according to Bob.

So once again: don’t expect binary compatibility across different versions of OpenSSL, LibreSSL, or any other implementation of the same API, unless they explicitly aim for that (and LibreSSL definitely doesn’t!).

LibreSSL and the bundled libs hurdle

It was over five years ago that I ranted about the bundling of libraries and what that means for vulnerabilities found in those libraries. The world has, since, not really listened. RubyGems still insists that “vendoring” gems is good, Go explicitly didn’t implement a concept of shared libraries, and let’s not even talk about Docker or OSv and their absolutism in static linking and bundling, essentially, of the whole operating system.

It should have been obvious how this can be a problem when Heartbleed came out: bundled copies of OpenSSL needed separate updates from the system libraries. I guess lots of enterprise users of such software were saved only by the fact that most of the bundlers ended up using older versions of OpenSSL in which heartbeat was not implemented at all.

Now that we’re talking about replacing the OpenSSL libraries with those coming from a different project, we’re going to be hit by both edges of the proprietary software sword: bundling and ABI compatibility, which will make things really interesting for everybody.

You may have seen my (short, incomplete) list of RAND_egd() users, which I posted yesterday. While the tinderbox from which I took it is out of date and needs cleaning, it is a good starting point to figure out the trends, and, as somebody already picked up, the bundling is actually going strong.

Software that bundled Curl, or even Python, but then relied on the system copy of OpenSSL, will now be looking for RAND_egd() and thus fail. You could unbundle these libraries and use a proper, patched copy of Curl from the system, where the usage of RAND_egd() has been removed; but then again, this is what I’ve been advocating for, well, forever. With caveats, in the case of Curl.

But if the use of RAND_egd() is actually coming from the proprietary bits themselves, you’re stuck and you can’t use the new library: you either need to keep around an old copy of OpenSSL (which may be buggy and expose even more vulnerabilities) or you need a shim library that only provides ABI compatibility on top of the new LibreSSL-provided library — I’m still not sure why this particular trick is not employed more often, when the changes to a library are only at the interface level while it still implements the same functionality.
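Such a shim can be surprisingly small. As a sketch (assuming, as appears to be the common case, that callers check the return value and fall back when EGD is unavailable), the removed entry points can simply be re-exported as stubs that report failure, matching the -1 that OpenSSL’s RAND_egd() returns when no entropy could be obtained from the socket:

```c
/* Sketch of an ABI-compatibility shim for the egd interface that
 * LibreSSL dropped. Compiled as a shared object and loaded (or
 * preloaded) next to the new libssl, old binaries resolve these
 * symbols here. The stubs never contact an EGD socket: they just
 * report failure, so well-behaved callers fall back to other
 * entropy sources. */

int RAND_egd(const char *path) {
    (void)path;
    return -1;               /* "no bytes obtained from EGD" */
}

int RAND_egd_bytes(const char *path, int bytes) {
    (void)path;
    (void)bytes;
    return -1;
}
```

Built with something like `cc -shared -fPIC egd_shim.c -o libegd_shim.so`, this restores only the ABI, not the functionality: a caller that strictly requires EGD to work would still be out of luck, which is why it’s a stopgap rather than a fix.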

Now the good news is that, from the list I produced, the egd functions never seemed to be popular among proprietary developers. This is expected, as egd was largely a way to implement the /dev/random semantics on non-Linux systems, while the proprietary software we deal with, at least in the Linux world, can just assume the existence of the devices themselves. So the only problems have to do with unbundling (or replacing) Curl and possibly the Python SSL module. Doing so is not obvious, though, as I see from the list that there are at least two copies of libcurl.so.3, which is the older ABI for Curl — although admittedly one comes from the scratchbox SDKs, which could just as easily be replaced with something less hacky.

Anyway, my current task is to clean up the tinderbox so that it’s in a working state, after which I plan to do a full build of all the reverse dependencies of OpenSSL. It’s very possible that there are more entries that should be in the list, since it was built with USE=gnutls globally to test for GnuTLS 3.0 when that came out.

LibreSSL is taking a beating, and that’s good

When I read about LibreSSL coming from the OpenBSD developers, my first impression was that it was a stunt. I still have not changed that impression drastically. While I know at least one quite good OpenBSD developer, my impression of the group as a whole is still the same: we have different concepts of security, and their idea of “cruft” is completely out there for me. But this is a topic for some other time.

So seeing the amount of scrutiny from others who are, like me, skeptical of the OpenBSD people left on their own, is good news. It keeps them honest, as they say. But it also means that things that wouldn’t otherwise be understood by people not used to Linux don’t get shoved under the rug.

This is not idle musing: I still remember (but can’t find now) an article in which Theo boasted of never having used Linux, while insisting that his operating system was clearly superior. I was honestly afraid that the fork-not-a-fork project was going to be handled the same way; I’m positively happy to be proven wrong so far.

I actually have been thrilled to see that there is finally movement to replace the straight access to /dev/random and /dev/urandom: Ted’s patch to implement a getrandom() system call that can be made compatible with OpenBSD’s own getentropy() in user space. And I’m even happier to see at least one of the OpenBSD/LibreSSL developers pitching in to help shape the interface.
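The user-space compatibility layer is a thin one. Here is a minimal sketch (Linux-specific, requiring a kernel and glibc with getrandom(); named `compat_getentropy` here only because modern glibc also ships a real `getentropy()` of its own) of OpenBSD’s getentropy() semantics on top of the new system call:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/random.h>

/* OpenBSD's getentropy(buf, len): fills buf with up to 256 bytes of
 * entropy, returning 0 on success or -1 with errno set. On Linux it
 * can be built on getrandom(), which blocks until the kernel pool is
 * seeded -- no /dev/urandom file descriptor needed, so it keeps
 * working inside chroots and when file descriptors run out. */
int compat_getentropy(void *buf, size_t len) {
    if (len > 256) {          /* OpenBSD rejects requests over 256 bytes */
        errno = EIO;
        return -1;
    }
    ssize_t r;
    do {
        r = getrandom(buf, len, 0);
    } while (r < 0 && errno == EINTR);   /* retry if interrupted */
    return (r == (ssize_t)len) ? 0 : -1;
}
```

Requests of 256 bytes or less are served atomically by getrandom(), which is what makes the 0-or-minus-1 contract of getentropy() easy to honour on top of it.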

Dropping egd support puzzled me for a moment, but then I realized that there is no point in using egd to feed randomness to the process directly: you just need to feed entropy to the kernel, and let the process get it normally. I have had, unfortunately, quite a bit of experience with entropy-generating daemons, and I wonder if this might be the right time to suggest getting a new multi-source daemon out.

So am I going to just blindly trust the OpenBSD people because “they have a good track record”? No. And to anybody who suggests that you can take over lines and lines of code from someone else’s crypto-related project, remove a bunch of code that you think is useless, and have an immediate result: my request is to please stop working on software altogether.

Security Holes
Copyright © Randall Munroe.

I’m not saying that they would do it on purpose, or that they aren’t trying their darndest to make LibreSSL a good replacement for OpenSSL. What I’m saying is that I don’t like the way, and the motives with which, the project was started. And I think that a reality check, like the one they already got, was due, and is good news.

On my side, once the library gets a bit more mileage, I’ll be happy to run the tinderbox against it. For now, I’m regaining access to Excelsior after a bad kernel update, and then I’ll just go and search with elfgrep for which binaries use the egd functionality and need to be patched; I’ll post the list on Twitter/G+ once I have it. I know it’s not much, but this is what I can do.