The security snakeoil

I said I’m basically doing the job of a network and system administrator here in LA — although you could say I have been doing that for the past four years as well, with the main difference that in the past I would have only a few Unix systems to administer, and mostly I had to deal with way too many Windows boxes.

Anyway, the interesting part of doing this job is having to deal with the strange concepts of security that customers (and previous administrators) have. I won’t go into much detail, but I’ll talk about a few of the common kinds of snakeoil being sold as security.

First let’s start with the mighty DROP: it seems to be a common choice for most network administrators to set up their systems to DROP all the packets that are not to be delivered to their end destination. Why this is done is up for debate; some insist that it keeps a scanner from learning that there even is a host there, which only makes sense if there is no service at all on the system that needs to be accessible, in which case we’re talking about different issues. An alternative story is that it makes it more difficult to discern between services that are available and those that aren’t; Peter Benie analyzed this better than I can do here.

So the only effective reason to use DROP is to reduce the amount of packets to process and send back during a DDoS, which might be something. Then again, if there is even one service that is open, they’ll just exploit that one for the DDoS, not multiple ports. And using the DROP rule makes it harder to diagnose network problems for other administrators that are just trying to do their job. What is my solution? Rate-limit: you start dropping packets only after the host actually starts trying to flood you; this way the usual diagnostics just work correctly.
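
To make the rate-limit idea a bit more concrete, here is a minimal token-bucket sketch in Python. It only illustrates the decision logic and is not firewall configuration: the `TokenBucket` class, the `decide` helper, the "REPLY" and "DROP" labels, and the rate and burst numbers are all assumptions made up for the example.

```python
class TokenBucket:
    """Allow up to `burst` packets at once, refilled at `rate` packets per second."""

    def __init__(self, rate, burst, now):
        self.rate = rate
        self.burst = burst
        self.tokens = burst  # start with a full bucket
        self.last = now      # timestamp of the previous packet

    def allow(self, now):
        # Refill according to the time elapsed, never above the burst size.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def decide(buckets, source, now, rate=5.0, burst=10.0):
    """Answer normally while a source behaves; drop only once it floods."""
    if source not in buckets:
        buckets[source] = TokenBucket(rate, burst, now)
    return "REPLY" if buckets[source].allow(now) else "DROP"


buckets = {}
# Twelve packets from the same (documentation-range) address in the same instant.
flood = [decide(buckets, "198.51.100.7", 0.0) for _ in range(12)]
print(flood.count("REPLY"), flood.count("DROP"))
```

With these made-up numbers a host can send a burst of ten probes and still get proper replies, so the usual diagnostics keep working; only a source that keeps hammering gets silently dropped, and it is answered again once the bucket refills.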

Then there is the idea that you can solve everything by putting it behind a firewall and a VPN. Somehow there is this silly idea floating around that the moment when you add a firewall in front of your cabinet, and then use a VPN to tunnel from your office, everything is, in one move, capable of being either trusted or not trusted. Bonus points when the firewall’s only task is to avoid “netscans” and otherwise just do port whitelisting for a bunch of Linux servers.

The problem starts appearing the moment you’re expecting to be able to push over 100Mbps … and you end up instead being bottlenecked by a 40Mbps firewall which is not even doing packet inspection! And the snakeoil here is expecting that there is a huge difference between a server with only one whitelisted port on the firewall and the same server with the same port whitelisted by iptables (no, there isn’t a policy on the outgoing connections, so don’t ask). Of course anything behind the VPN is then handled like a completely trusted network, with weak passwords, if any at all, and no second-factor authentication.

Finally there is what I call hyper-security, the same kind of thing that enthuses OpenBSD users: lock everything down as tight as you can. “Hey it’s not difficult, you just spend time once!” I have heard it so many times by now… Nobody is saying that it’s a good idea to ignore security issues and known vulnerabilities, but most of the time you have to come up with a good compromise between security and usability. If your security is so strong that you can’t do anything useful, then it’s bad security.

It’s not just about what you can do with the system: especially for servers it’s pretty trivial to lock it down to exactly what you need and nothing more, so I can understand someone wanting to do so. The problem is that unless you can afford to re-validate all the checks every time you have to add something new, chances are that during maintenance you will slip up and forget about something, either breaking the services you run, or breaching security altogether.

Here’s the main catch: if it’s a system that only has one user, and will die with you, tightening it down is easy and can work, if you’re not an idiot — but if you feel smug because others accept compromises for the sake of not being a human single point of failure, stop it because you’re just the kind of scum that causes so many people to say “pfft” to security. You can actually have two very easy examples for it in Windows Vista and Fedora.

The former started asking you to confirm every single operation, to the point that users ended up clicking “Ok” without reading what it asked:

  • Do you want to install the drivers for the printer you just bought? Ok.
  • Do you want to install the operating system updates? Ok.
  • Do you want to install Firefox? Ok.
  • Do you want to upgrade Firefox? Ok.
  • Do you want to install Microsoft Office? Ok.
  • Do you want to update Microsoft Office? Ok.
  • Do you want to install Adobe Reader? Ok.
  • Do you want to install this little game you just downloaded? Ok.
  • Do you want to install this driver that sniffs your network traffic? Ok.

I’ve seen malware actually requiring a confirmation, easy to actually notice if you look at the window, but usually just discarded by clicking “Ok” without looking.

As for Fedora: you are probably familiar with Red Hat’s decision to enable SELinux by default on their installs, a very long time ago now. And I tend to agree that it’s a good thing; the problem is that most people don’t understand how to work with SELinux, and lacking simple ways to do what they need to do, they just decide to turn SELinux off altogether. This is the same problem as with Nagios plugins and sudo, where the documentation just tells you to give them sudo access to everything.

I could probably write a lot more about these kinds of situations but I don’t think it’s worth my time. I just think that people insisting on hyper-security are detrimental to the overall security of the population, as they scare people away.

PAM surprises

Preparing the testground called tinderbox for GCC 4.5 called for a whole cleanup of the tinderbox, a mass-unmerge of all the packages installed that are not part of the world (because they are not essential to the tinderbox process, or its management, nor are dependencies of those).

Easier said than done: the unmerge of twelve thousand (12000) packages requires the better part of a day. And this is keeping 313 packages installed as world/system trees (it doesn’t help that GCC depends on GTK+, and with that the tree easily gets bloated). What I wasn’t expecting was the fun result after the unmerge completed.

It became apparent midway through the removal that Portage post-remove hooks are not designed to work well with mass-unmerge: each TeXLive package that got unmerged caused the whole fonts cache to be regenerated, and that’s slow; each Python package looked for pyc/pyo files, and each package using mime-types rebuilt the cache (with included pages-long warnings about the chemistry mime types being non-standard).
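
The cost difference is easy to see with a little arithmetic. The following Python sketch is purely illustrative: the package names, the `caches_touched` mapping and both helper functions are hypothetical and have nothing to do with Portage’s real hook interface. It just counts how many cache rebuilds happen when every removal triggers its own rebuild, versus when the rebuilds are collected and run once at the end.

```python
def caches_touched(package):
    """Hypothetical mapping from a package name to the caches it invalidates."""
    if package.startswith("texlive-"):
        return {"fonts"}
    if package.startswith("py-"):
        return {"bytecode"}
    if package.startswith("mime-"):
        return {"mime"}
    return set()


def rebuilds_per_package(packages):
    """Every removed package rebuilds each cache it touches, right away."""
    return sum(len(caches_touched(p)) for p in packages)


def rebuilds_deferred(packages):
    """Collect the dirty caches during the run and rebuild each one once."""
    dirty = set()
    for p in packages:
        dirty |= caches_touched(p)
    return len(dirty)


# Invented package list, sized only to make the point.
packages = (
    [f"texlive-{i}" for i in range(300)]
    + [f"py-{i}" for i in range(200)]
    + [f"mime-{i}" for i in range(100)]
    + [f"other-{i}" for i in range(400)]
)
print(rebuilds_per_package(packages), rebuilds_deferred(packages))
```

For this invented list the per-package approach performs 600 rebuilds where three would do, which is the kind of waste that turns a mass-unmerge into a day-long job.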

The first problem appeared in the final part of the unmerge: find didn’t work (it is used by the Python post-removal). Why’s that? Because it automagically links against the SELinux library. The selinux USE flag there only controls the dependency variables, but it makes no effort to disable linking against selinux in the first place. For this reason, all the SELinux-related packages are now masked in our default profiles.

Much nastier was the second problem I hit; since I had rebuilt my host system as well as cleaning up the tinderbox ground, I restarted GNOME to make sure that the old libraries were flushed out of memory and replaced with the newly built copies… so I logged off the tinderbox SSH session, then came back, but the tinderbox refused to let me in! Even worse, lxc-console didn’t let me log in on the system at all. This is, most definitely, bad.

Luckily (or not so, from one side), the ball fell squarely within my area of expertise, and of what I manage in Gentoo: PAM. Indeed, by resetting the PAM chain to permit any login for a limited time, I was able to log myself back into the container and find what the problem was. Obviously it was another automatic dependency.

You might or might not know that glibc refuses (last I knew) to support the Blowfish cipher for the crypt() function, while a few distributions have been providing patched implementations to support encoding the system password with this algorithm. Gentoo has not been following here as we try to keep closer to upstream.

But we’re definitely not asleep on this; it was in Summer 2008, while going in and out of hospitals, that I added support for SHA512 hashing to our PAM setup. This is definitely better than the default crypt() algorithm or the previously-used MD5 hashing. And in a similar spirit, I have recently implemented extended hashes on pam-pgsql while doing a work task with that package.

Anyway, given that upstream won’t add blowfish support, but distributions wanted it for different reasons, how do you solve the problem cleanly? Well, the solution has been to invent a new library, called libxcrypt; I sincerely have no idea who the actual maintainer of the library is, but it is available in Gentoo, fetching the source archive from Debian’s mirrors. This library provides an alternative crypt() function that indeed supports the Blowfish cipher.

Now, given that Linux-PAM seems to be mostly maintained by SUSE, and that they have been interested in using stronger algorithms for a long time, it is a logical conclusion to expect that Linux-PAM is ready to work with this library, and that’s the case. The problem is that you have no way to tell it to stop using it, so if it’s found it’ll be used. And if you then unmerge the package, without having preserve-libs enabled, you lose all the chances to properly log in on your system, bad!

Luckily, it wasn’t too difficult to patch Linux-PAM to allow an easy opt-out, even though I didn’t make it a configure switch but simply tied it to a cache variable: tell configure that you have no xcrypt.h header and it won’t search for the library (originally it would, causing more fuss than needed, and badly failing at the end of the day). Right now, the two ebuilds of Linux-PAM in tree both disable xcrypt and thus save you from locking yourself out.

What this shows is one organisational problem within Gentoo: different groups (and individuals) aiming at similar targets are not communicating well enough to understand what each other is doing, and this can have nasty effects on all the users out there. Luckily, the xcrypt thing is relatively rare, and preserve-libs makes it mostly (but not entirely) harmless: what happens if you have your password hashed with blowfish but you remove the blowfish hashing capabilities needed for comparing it when you log in? Again, we can feel safer knowing that our default configuration will never use blowfish, and the user mucking with that configuration is left on their own if something as silly as rebuilding PAM without blowfish is done.
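
That lock-out scenario can be checked for ahead of time, because hashes in the modular crypt format announce their scheme in the `$id$` prefix (`$1$` for MD5, `$2a$` and its variants for Blowfish, `$5$` and `$6$` for SHA-256 and SHA-512). The sketch below is an illustration under assumptions, not a tool that exists in Gentoo: the function names are invented, and the hash strings are placeholders rather than real hashes.

```python
# Scheme identifiers used in the "$id$..." modular crypt format.
SCHEMES = {
    "1": "md5",
    "2a": "blowfish",
    "2b": "blowfish",
    "2x": "blowfish",
    "2y": "blowfish",
    "5": "sha256",
    "6": "sha512",
}


def scheme_of(hashed):
    """Return the hashing scheme of a shadow-style entry, or None if there is no password."""
    if not hashed or hashed[0] in "!*":
        return None  # locked or password-less account, nothing to verify
    if not hashed.startswith("$"):
        return "des"  # traditional crypt() output carries no $id$ prefix
    return SCHEMES.get(hashed.split("$")[1], "unknown")


def locked_out(entries, supported):
    """List the users whose stored hash the remaining crypt() could not verify."""
    result = []
    for user, hashed in entries.items():
        scheme = scheme_of(hashed)
        if scheme is not None and scheme not in supported:
            result.append(user)
    return sorted(result)


# Placeholder entries, NOT real hashes.
entries = {
    "alice": "$6$examplesalt$placeholder",
    "bob": "$2a$05$placeholderplaceholder",
    "daemon": "*",
}
# What is left once the Blowfish-capable library is gone.
print(locked_out(entries, {"des", "md5", "sha256", "sha512"}))
```

Running something along these lines before unmerging a crypt provider would at least tell you which accounts are about to become unusable.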

Just so you know, my plan for this is to add an xcrypt USE flag for Linux-PAM (sys-libs/pam) and then a blowfish one for the PAM configuration files (sys-auth/pambase) that depends on that. Yes, the two are different, because one provides an interface (xcrypt) and the other a feature (blowfish-hashing of the password).

Now, back to work!

PulseAudio and quirks

Seems like even my previous post about PulseAudio got one of the PA-bashers to think I’m a nuisance for their “cause”, whatever that is. For this reason I’d like to try to explain some of the quirks regarding PulseAudio, distributions, and so on. Let’s call this a bit of a backstage analysis of what’s going on about Linux and audio, from somebody that has little vested interest in trying to roll the thing for PulseAudio.

The first problem to address relates to the comments that KDE people find PulseAudio a problem; I guess this has to be decomposed in a series of multiple problems: Lennart is a GTK/GNOME guy, so he obviously provided the original tools for GTK/GNOME. For a while I was interested in writing the equivalents for KDE (3) but I never had the time; now that I also moved to GNOME independently, I sincerely have no intention to write KDE tools for PA… but one has to wonder why nobody in KDE went out of his/her way to try doing this before. It’s not like it had to be part of KDE proper, it would have been okay to be an unofficial standalone application.

There is also another problem: most of the KDE guys who do see problems with PulseAudio are most likely using Phonon with the xine-lib backend, configured to use the PulseAudio output plugin. Given I’m the one who wrote most of it originally, I can say that it sucks big time. Unfortunately I have had no time to work on that lately; I hope I might have that time in the future, but the two years I spent between hospitals seriously indebted me to the point I’m doing about 18 hours of work a day on average. For those who do want to use xine-lib with Pulse, I’d like to suggest the long route: set up the ALSA Pulse plugin, and then let xine just use ALSA.

There is of course another problem for KDE: while GNOME historically had no problem with forcing in dependencies that are Linux-specific or that work most of the time just on Linux (think about HAL adoption for instance), and relied on the actual vendors to do the eventual porting, KDE strives to work most of the time on multiple operating systems, including as of KDE 4 also Mac OS X and Windows. Now you might like this or not, but it’s their choice; and the problem is that while there is some kind of PulseAudio support for Windows, the OSX one at least is in pretty bad shape (also on my radar).

As for distribution support, it is true that Lennart usually just cares about Fedora; you have to accept this as part of the deal given Red Hat is (as far as I know at least, Lennart feel free to correct me if I’m wrong) the one vendor paying his bills. Now of course we’d all love to support all the distributions at the same time, but the only way that’s possible is if multiple maintainers do coordinate; I’ve been doing my best to pass all the patches upstream when I’ve added them to Gentoo, and I see Colin Guthrie from Mandriva doing the same. One thing I can “blame” Lennart for (and I told this to him before, too!) is not creating a Git branch with the cherry-picked patches he applies on the Fedora packaging for us to pick up… and the fact that he likes neither making releases nor giving others access to do so.

To be honest, there is little difference between this and what other projects do with distributions like Ubuntu when they are paid by Canonical. I think this is obvious, everybody looks at their little garden first. But this is not something that should concern us I guess. Gentoo has been quite out of the loop for what concerns PulseAudio, and I’m sorry, that was mostly my fault. I’m doing my best to let us update as soon as possible, but it’s not just that simple, as I already explained.

Then let me just say something about Lennart’s refusal to support system mode (which is available and advertised in Gentoo since PulseAudio entered the tree): I can’t blame him for that. First, his design for PulseAudio is based on providing something that works for the desktop use case. Something along the lines of Windows’s or OSX’s audio subsystems, neither of which provide anything akin to system mode. And indeed PulseAudio, by design, can handle the same situations, including multi-user setups with fast user switching. The fact that a system mode exists at all is due to the fact that I for one needed something like it on my setup, hacked it around for Gentoo, and then Lennart made my life easier implementing some extra bits on PulseAudio proper, but it was certainly not his idea.

What people complain about usually is the need for an X session (not strictly true: PulseAudio will start just fine in SSH, and it would probably be possible to even fix it up so that it would tunnel audio just like you can tunnel X!), and the fact that audio does not continue to work when X exits (also not strictly true: if your audio player is running in screen it would be working just fine; it’s the fact that the media player crashes that makes your audio stop). Additionally people complain about the security problem of having to run all the processes under the same user, rather than allowing them to run as different users, like mpd.

Well, some complaints are valid, others are not: it is true that PulseAudio does not work in multi-seat-multi-user environments, at least not with a single audio device; it is unfortunate and I don’t know if it’ll ever work in that situation without a system mode. It is also true that running processes as different users for privilege separation does not work without system mode. But both these options are walking quite far away from the desktop design that PulseAudio is implementing; sure they are valid use cases, just like embedded systems (Palm Pre uses PulseAudio if you didn’t notice that before), but they are not what Lennart is interested in himself; at the same time I don’t think he’d be stopping anyone from improving the system mode support for those, as long as it wouldn’t require the desktop setup to make compromises.

Because the idea is, as usual in any software design, that you have to make compromises; Lennart wants the best experience as far as desktop systems are concerned, and his compromise is that system mode is not part of his plan, and it shouldn’t be hindering him. At the same time, while he does get upset when people ask for support about it, and he wrote why it’s not supported, he hasn’t removed it (yet; if I was him, at this point I could have just removed it out of spite!). So colouring him as the master of evil does not seem the very best idea, especially since that makes me picture him in the part of Warren in the Trio, from Buffy’s season six.

Oh and a final note: it shouldn’t come as a surprise that Lennart and Fedora don’t care about running mpd and other services as different users; there are probably quite a few reasons for this. I cannot speak for Fedora, given I’m not involved in it, but my suppositions are that firstly the ALSA dmix plugin is somewhat scary from a security point of view (for me too) because it uses shared memory between processes from different users to do the mixing, and secondly that Fedora does a lot to use SELinux even on standard desktops. This is much tighter than separating privileges with different users since it forces the processes to behave as instructed. Unfortunately on Gentoo the SELinux support seems to have gone for good, at least to me.