What does #shellshock mean for Gentoo?

Gentoo Penguins with chicks at Jougla Point, Antarctica
Photo credit: Liam Quinn

This is going to be interesting as Planet Gentoo is currently unavailable as I write this. I’ll try to send this out further so that people know about it.

By now we have all been doing our best to update our laptops and servers to the new bash version so that we are safe from the big scare of the quarter, shellshock. I say laptop because the way the vulnerability can be exploited limits the impact considerably if you have a desktop or otherwise connect only to trusted networks.

What remains to be done is to figure out how to avoid this repeats. And that’s a difficult topic, because a 25 years old bug is not easy to avoid, especially because there are probably plenty of siblings of it around, that we have not found yet, just like this last week. But there are things that we can do as a whole environment to reduce the chances of problems like this to either happen or at least avoid that they escalate so quickly.

In this post I want to look into some things that Gentoo and its developers can do to make things better.

The first obvious thing is to figure out why /bin/sh for Gentoo is not dash or any other very limited shell such as BusyBox. The main answer lies in the init scripts that still use bashisms; this is not news, as I’ve pushed for that four years ago, while Roy insisted on it even before that. Interestingly enough, though, this excuse is getting less and less relevant thanks to systemd. It is indeed, among all the reasons, one I find very much good in Lennart’s design: we want declarative init systems, not imperative ones. Unfortunately, even systemd is not as declarative as it was originally supposed to be, so the init script problem is half unsolved — on the other hand, it does make things much easier, as you have to start afresh anyway.

If either all your init scripts are non-bash-requiring or you’re using systemd (like me on the laptops), then it’s mostly safe to switch to use dash as the provider for /bin/sh:

# emerge eselect-sh
# eselect sh set dash

That will change your /bin/sh and make it much less likely that you’d be vulnerable to this particular problem. Unfortunately as I said it’s mostly safe. I even found that some of the init scripts I wrote, that I checked with checkbashisms did not work as intended with dash, fixes are on their way. I also found that the lsb_release command, while not requiring bash itself, uses non-POSIX features, resulting in garbage on the output — this breaks facter-2 but not facter-1, I found out when it broke my Puppet setup.

Interestingly it would be simpler for me to use zsh, as then both the init script and lsb_release would have worked. Unfortunately when I tried doing that, Emacs tramp-mode froze when trying to open files, both with sshx and sudo modes. The same was true for using BusyBox, so I decided to just install dash everywhere and use that.

Unfortunately it does not mean you’ll be perfectly safe or that you can remove bash from your system. Especially in Gentoo, we have too many dependencies on it, the first being Portage of course, but eselect also qualifies. Of the two I’m actually more concerned about eselect: I have been saying this from the start, but designing such a major piece of software – that does not change that often – in bash sounds like insanity. I still think that is the case.

I think this is the main problem: in Gentoo especially, bash has always been considered a programming language. That’s bad. Not only because it only has one reference implementation, but it also seem to convince other people, new to coding, that it’s a good engineering practice. It is not. If you need to build something like eselect, you do it in Python, or Perl, or C, but not bash!

Gentoo is currently stagnating, and that’s hard to deny. I’ve stopped being active since I finally accepted stable employment – I’m almost thirty, it was time to stop playing around, I needed to make a living, even if I don’t really make a life – and QA has obviously taken a step back (I still have a non-working dev-python/imaging on my laptop). So trying to push for getting rid of bash in Gentoo altogether is not a good deal. On the other hand, even though it’s going to be probably too late to be relevant, I’ll push for having a Summer of Code next year to convert eselect to Python or something along those lines.

Myself, I decided that the current bashisms in the init scripts I rely upon on my servers are simple enough that dash will work, so I pushed that through puppet to all my servers. It should be enough, for the moment. I expect more scrutiny to be spent on dash, zsh, ksh and the other shells in the next few months as people migrate around, or decide that a 25 years old bug is enough to think twice about all of them, o I’ll keep my options open.

This is actually why I like software biodiversity: it allows to have options to select different options when one components fail, and that is what worries me the most with systemd right now. I also hope that showing how bad bash has been all this time with its closed development will make it possible to have a better syntax-compatible shell with a proper parser, even better with a proper librarised implementation. But that’s probably hoping too much.

Project health, and why it’s important — part of the #shellshock afterwords

Tech media has been all the rage this year with trying to hype everything out there as the end of the Internet of Things or the nail on the coffin of open source. A bunch of opinion pieces I found also tried to imply that open source software is to blame, forgetting that the only reason why the security issues found had been considered so nasty is because we know they are widely used.

First there was Heartbleed with its discoverers deciding to spend time setting up a cool name and logo and website for, rather than ensuring it would be patched before it became widely known. Months later, LastPass still tells me that some of the websites I have passwords on have not changed their certificate. This spawned some interest around OpenSSL at least, including the OpenBSD fork which I’m still not sure is going to stick around or not.

Just few weeks ago a dump of passwords caused major stir as some online news sources kept insisting that Google had been hacked. Similarly, people have been insisting for the longest time that it was only Apple’s fault if the photos of a bunch of celebrities were stolen and published on a bunch of sites — and will probably never be expunged from the Internet’s collective conscience.

And then there is the whole hysteria about shellshock which I already dug into. What I promised on that post is looking at the problem from the angle of the project health.

With the term project health I’m referring to a whole set of issues around an open source software project. It’s something that becomes second nature for a distribution packager/developer, but is not obvious to many, especially because it is not easy to quantify. It’s not a function of the number of commits or committers, the number of mailing lists or the traffic in them. It’s an aura.

That OpenSSL’s project health was terrible was no mystery to anybody. The code base in particular was terribly complicated and cater for corner cases that stopped being relevant years ago, and the LibreSSL developers have found plenty of reasons to be worried. But the fact that the codebase was in such a state, and that the developers don’t care to follow what the distributors do, or review patches properly, was not a surprise. You just need to be reminded of the Debian SSL debacle which dates back to 2008.

In the case of bash, the situation is a bit more complex. The shell is a base component of all GNU systems, and is FSF’s choice of UNIX shell. The fact that the man page states clearly It’s too big and too slow. should tip people off but it doesn’t. And it’s not just a matter of extending the POSIX shell syntax with enough sugar that people take it for a programming language and start using them — but that’s also a big problem that caused this particular issue.

The health of bash was not considered good by anybody involved with it on a distribution level. It certainly was not considered good for me, as I moved to zsh years and years ago, and I have been working for over five years years on getting rid of bashisms in scripts. Indeed, I have been pushing, with Roy and others, for the init scripts in Gentoo to be made completely POSIX shell compatible so that they can run with dash or with busybox — even before I was paid to do so for one of the devices I worked on.

Nowadays, the point is probably moot for many people. I think this is the most obvious positive PR for systemd I can think of: no thinking of shells any more, for the most part. Of course it’s not strictly true, but it does solve most of the problems with bashisms in init scripts. And it should solve the problem of using bash as a programming language, except it doesn’t always, but that’s a topic for a different post.

But why were distributors, and Gentoo devs, so wary about bash, way before this happened? The answer is complicated. While bash is a GNU project and the GNU project is the poster child for Free Software, its management has always been sketchy. There is a single developer – The Maintainer as the GNU website calls him, Chet Ramey – and the sole point of contact for him are the mailing lists. The code is released in dumps: a release tarball on the minor version, then every time a new micro version is to be released, a new patch is posted and distributed. If you’re a Gentoo user, you can notice this as when emerging bash, you’ll see all the patches being applied one on top of the other.

There is no public SCM — yes there is a GIT “repository”, but it’s essentially just an import of a given release tarball, and then each released patch applied on top of it as a commit. Since these patches represent a whole point release, and they may be fixing different bugs, related or not, it’s definitely not as useful has having a repository with the intent clearly showing, so that you can figure out what is being done. Reviewing a proper commit-per-change repository is orders of magnitude easier than reviewing a diff in code dumps.

This is not completely unknown in the GNU sphere, glibc has had a terrible track record as well, and only recently, thanks to lots of combined efforts sanity is being restored. This also includes fixing a bunch of security vulnerabilities found or driven into the ground by my friend Tavis.

But this behaviour is essentially why people like me and other distribution developers have been unhappy with bash for years and years, not the particular vulnerability but the health of the project itself. I have been using zsh for years, even though I had not installed it on all my servers up to now (it’s done now), and I have been pushing for Gentoo to move to /bin/sh being provided by dash for a while, at the same time Debian did it already, and the result is that the vulnerability for them is way less scary.

So yeah, I don’t think it’s happenstance that these issues are being found in projects that are not healthy. And it’s not because they are open source, but rather because they are “open source” in a way that does not help. Yes, bash is open source, but it’s not developed like many other projects in the open but behind closed doors, with only one single leader.

So remember this: be open in your open source project, it makes for better health. And try to get more people than you involved, and review publicly the patches that you’re sent!

Limiting the #shellshock fear

Today’s news all over the place has to do with the nasty bash vulnerability that has been disclosed and now makes everybody go insane. But around this, there’s more buzz than actual fire. The problem I think is that there are a number of claims around this vulnerability that are true all by themselves, but become hysteria when mashed together. Tip of the hat to SANS that tried to calm down the situation as well.

Yes, the bug is nasty, and yes the bug can lead to remote code execution; but not all the servers in the world are vulnerable. First of all, not all the UNIX systems out there use bash at all: the BSDs don’t have bash installed by default, for instance, and both Debian and Ubuntu have been defaulting to dash for their default shell in years. This is important because the mere presence of bash does not make a system vulnerable. To be exploitable, on the server side, you need at least one of two things: a bash-based CGI script, or /bin/sh being bash. In the former case it becomes obvious: you pass down the CGI variables with the exploit and you have direct remote code execution. In the latter, things are a tad less obvious, and rely on the way system() is implemented in C and other languages: it invokes /bin/sh -c {thestring}.

Using system() is already a red flag for me in lots of server-side software: input sanitization is essential in that situation, as otherwise passing user-provided strings in a system() call makes it trivial to implement remote code execution, think of a software using system("convert %s %s-thumb.png") with an user provided string, and let the user provide ; rm -rf / ; as their input… can you see the problem? But with this particular bash bug, you don’t need user-supplied strings to be passed to system(), the mere call will cause the environment to be copied over and thus the code executed. But this relies on /bin/sh to be bash, which is not the case for BSDs, Debian, Ubuntu and a bunch of other situations. But this also requires for the user to be able to change the environment variable.

This does not mean that there is absolutely no risk for Debian or Ubuntu users (or even FreeBSD, but that’s a different problem): if you control an environment variable, and somehow the web application invokes (even indirectly) a bash script (through system() or otherwise), then you’re also vulnerable. This can be the case if the invoked script has #!/bin/bash explicitly in it. Funnily enough, this is how most clients are vulnerable to this problem: the ISC DHCP client dhclient uses a helper script called dhclient-script to set some special options it receives from the server; at least in the Debian/Ubuntu packages of it, the script uses #!/bin/bash explicitly, making those systems vulnerable even if their default shell is not bash.

But who seriously uses CGI nowadays in production? Turns out that a bunch of people using WordPress do, to run PHP — and I”m sure there are scripts using system(). If this is a nail on the coffin of something, my opinion is that it should be on the coffin of the self-hosting mantra that people still insist on.

On the other hand, the focus of the tech field right now is CGI running in small devices, like routers, TVs, and so on so forth. It is indeed the case that in the majority of those devices implement their web interfaces through CGI, because it’s simple and proven, and does not require complex web servers such as Apache. This is what scared plenty of tech people, but it’s a scare that has not been researched properly either. While it’s true that most of my small devices use CGI, I don’t think any of them uses bash. In the embedded world, the majority of the people out there wouldn’t go near bash with a 10’ pole: it’s slow, it’s big, and it’s clunky. If you’re building an embedded image, you probably have already busybox around, and you may as well use it as your shell. It also allows you to use the in-process version of most commands without requiring a full fork.

It’s easy to see how you go from A to Z here: “bash makes CGI vulnerable”, “nearly all embedded devices use CGI” become “bash makes nearly all embedded devices vulnerable”. But that’s not true, as SANS points out, it’s a minimal part of the devices that is actually vulnerable to this attack. Which does not mean the attack is irrelevant. It’s important, and it should tell us many things.

I’ll be writing again regarding “project health” and talking a bit more about bash as a project. In the mean time, make sure you update, don’t believe the first news of “all fixed” (as Tavis pointed out that the first fix was not thought-out properly) and make sure you don’t self-host the stuff you want to keep out of the cloud in a server you upgrade once an year.