And this shows just how much of a geek I can be, by using as the title for the post one of the Pokémon whose only natural attack is … Harden!
Somebody made me notice that I’m getting more scrupulous about security lately, writing more often about it and tightening my own systems. I guess this is a good thing, as becoming responsible for this kind of stuff is important for each of us: if Richard Clarke scared us with it, we’re now in the midst of an interesting situation with the Stuxnet virus, which gained enough attention that even BBC World News talked about it today.
So what am I actually doing for this? Well, besides insisting on fixing packages when there are even possible security issues, which is a general environment solution, I’ve decided to start hardening my systems, starting from the main home router.
You might remember my router as I wrote about it before, but to refresh your mind, and explain it to those who didn’t read about it before, my router is not an off-the-shelf blackbox, nor is it a reflashed off-the-shelf box that runs OpenWRT or similar firmware. It is, for the most part, a “standard” system. It’s a classic IBM PC-compatible system, with a Celeron D as CPU, 512MB of RAM and, instead of standard HDDs or SSDs, it runs off a good old-fashioned CompactFlash card, with a passive adapter to EIDE.
As “firmware” (or in this case we should call it operating system I guess) it always used a pre-built Gentoo; I’m not using binpkgs, I’m rather building the root out of a chroot. Originally, it used a 32-bit system without fortified sources — as of tonight it runs Gentoo Hardened, 64-bit, with PaX and ASLR; full PIE and full SSP enabled. I guess a few explanations for the changes are worth it.
First of all, why 64-bit? As I described it, there is half a gigabyte of RAM, which fits 32-bit just nicely, no need to get over the 4GiB mark; and a network router is definitely not the kind of appliance you expect to need a powerful CPU. So why 64-bit? Common sense says that 64-bit code requires more memory (bigger pointers) and has an increased code size, which both increases disk usage and causes cache to be used up earlier. Indeed, at first glance it seems like this does not fall into either of the two most common categories for which 64-bit is suggested: databases (for the memory space) and audio/video encoding (for the extra registers and instructions). Well, I’ll add a third category: a security-oriented hardened system of any kind, as long as ASLR is in the mix.
I have written about my doubts on ASLR’s usefulness; well, time passes and one year later I start to see why ASLR can be useful, mostly when you’re dealing with local exploits. For network services, I still maintain that most likely you cannot solve much with ASLR without occasionally restarting them, since fewer and fewer of them actually fork one process from another, while most will nowadays prefer using threads to processes for multiprocessing (especially considering the power of modern multicore, multithread systems). But for ASLR to actually be useful you need two things: relocatable code and enough address space to actually randomize the load addresses; the latter is obviously provided by a 64-bit address space (or is it 48-bit?) versus the 32-bit address space x86 provides. Let’s consider the former for a bit.
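To put a rough number on "enough address space", here is a minimal sketch that counts how many page-aligned positions exist in a 32-bit and in a 48-bit address space. It assumes 4 KiB pages and treats the whole space as available, both of which are simplifying assumptions: a real kernel only randomizes a limited number of bits per mapping, so these figures are upper bounds and not the actual ASLR entropy. The function name `page_slots` is made up for this illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def page_slots(address_bits, page_size=PAGE_SIZE):
    """Upper bound on page-aligned load positions in an address space."""
    return (1 << address_bits) // page_size


if __name__ == "__main__":
    for bits in (32, 48):
        print(f"{bits}-bit address space: {page_slots(bits)} page-aligned slots")
```

Under those assumptions the 32-bit space has 2^20 slots and the 48-bit one has 2^36, so the larger space leaves 65536 times more room to place a mapping.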
In the post I linked before, you can see that to have ASLR you end up either with text relocations on all the executables (which are much more memory hungry than standard executables, and collide with another hardening technique) or with Position-Independent Executables (PIE), which are slightly more memory hungry than normal (because of relocations) but also slower, because at least one extra register is sacrificed to build PIC. Well, when using x86-64, you’re saved from this problem: PIC is part of the architecture, to the point that there isn’t really much to sacrifice when building PIC. So the bottom line is that to use ASLR, 64-bit is a win.
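One quick way to see whether a given binary was built as PIE is to look at the `e_type` field of its ELF header: fixed-address executables are `ET_EXEC` (2), while PIE binaries are `ET_DYN` (3). The sketch below reads only that field; note that shared libraries are also `ET_DYN`, so this is a hint, not proof. The helper names `elf_type` and `looks_like_pie` are invented for the example.

```python
import struct
import sys

ET_EXEC = 2  # executable linked at a fixed address
ET_DYN = 3   # shared object, which is also how PIE binaries are marked


def elf_type(header):
    """Return the e_type field from the first 18 bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # e_ident[5] (EI_DATA) gives the byte order: 1 = little, 2 = big endian.
    if header[5] == 1:
        fmt = "<H"
    elif header[5] == 2:
        fmt = ">H"
    else:
        raise ValueError("unknown ELF data encoding")
    # e_type is a 16-bit value right after the 16-byte e_ident array.
    return struct.unpack_from(fmt, header, 16)[0]


def looks_like_pie(path):
    """True if the file at path is ET_DYN (a PIE or a shared library)."""
    with open(path, "rb") as f:
        return elf_type(f.read(18)) == ET_DYN


if __name__ == "__main__":
    for path in sys.argv[1:]:
        print(path, "ET_DYN" if looks_like_pie(path) else "not ET_DYN")
```

Running it over the files in the chroot's `/bin` and `/sbin` would show at a glance which ones were not rebuilt with the hardened toolchain.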
But please, repeat after me: the security enhancement is ASLR, not PIE.
Okay, so that covers half of it; what about SSP? Well, Stack Smashing Protection is a great way to … have lots to debug, I’m afraid. While nowadays there should be far fewer bugs, and the wide use of fortified sources has already caused a number of issues to be detected even by those not running a hardened compiler, I’m pretty sure sooner or later I’ll hit some bug that nobody hit before, mostly out of my bad karma, or maybe just because I like using experimental things, who knows. At any rate, it also seems to me like the most important protection here: if anything tries to break the stack boundary, kill it before it can become something more serious; if it’s a DoS, well, it’s annoying, but you don’t risk your system being used as a spambot (and we definitely have enough of those!). That holds at least as far as C code is concerned; unfortunately, it does not do any good for bad codebases.
Now the two techniques combined require a huge amount of random data, and that data is fetched from the system entropy pool; given that the router is not running with an HDD (which has non-predictable seek times and thus is a source of entropy), has no audio or video devices to use, and has no keyboard/mouse to gather entropy from, it wouldn’t be far-fetched to think of a possible entropy depletion attack. Thankfully, I’m using an EntropyKey to solve that problem.
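On Linux, the kernel's own estimate of the pool is exposed in `/proc/sys/kernel/random/entropy_avail`, so a headless box can be watched for depletion. The following is only a sketch: the 256-bit threshold is an arbitrary value picked for illustration, the helper names are made up, and newer kernels changed how the pool works, so the number may be less meaningful there.

```python
from pathlib import Path

ENTROPY_PATH = "/proc/sys/kernel/random/entropy_avail"
LOW_WATERMARK = 256  # hypothetical threshold, in bits


def parse_entropy(text):
    """Parse the contents of entropy_avail into an integer bit count."""
    return int(text.strip())


def is_low(bits, threshold=LOW_WATERMARK):
    """True when the estimate is below the chosen threshold."""
    return bits < threshold


if __name__ == "__main__":
    try:
        bits = parse_entropy(Path(ENTROPY_PATH).read_text())
    except OSError:
        print("entropy estimate not available on this system")
    else:
        state = "LOW" if is_low(bits) else "ok"
        print(f"entropy_avail: {bits} bits ({state})")
```

Sampling this from cron before and after plugging in a hardware source is a cheap way to check that the extra entropy is actually reaching the pool.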
Finally, to be on the safe side, I enabled PaX (which, I keep repeating, has a much more meaningful name in the OpenBSD implementation: W^X), which allows for pages of executable code to be marked as read-only, non-writeable, and vice versa, writeable pages are non-executable. This is probably the most important mitigation strategy I can think of. Unfortunately, the Celeron D has no NX bit support (heck, it came way after my first Athlon64 and it lacks such a feature? Shame!) but PaX does not have that much of a hit on a system like this, which mostly idles at 2% of CPU usage (even though I can’t seem to get the scaler to work at all).
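The W^X rule is easy to check from userspace on Linux: every line of `/proc/<pid>/maps` carries a permission field such as `r-xp` or `rw-p`, and a mapping that shows both `w` and `x` breaks the rule. A minimal sketch of such a check follows; the function name `writable_and_executable` is made up for the example, and it only inspects the text it is given.

```python
def writable_and_executable(maps_text):
    """Return (address range, perms, name) for mappings that are both W and X."""
    hits = []
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 2:
            continue
        perms = fields[1]
        if "w" in perms and "x" in perms:
            name = fields[5].strip() if len(fields) > 5 else ""
            hits.append((fields[0], perms, name))
    return hits


if __name__ == "__main__":
    try:
        with open("/proc/self/maps") as f:
            text = f.read()
    except OSError:
        print("/proc/self/maps is not available on this system")
    else:
        for addr, perms, name in writable_and_executable(text):
            print(f"W+X mapping: {addr} {perms} {name}")
```

On a system where the restriction is enforced, the list should come back empty for ordinary processes.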
One thing I had to be wary of is that enabling UDEREF actually caused my router not to start, reporting memory corruption when init started; so if you see a problem like that, try disabling it.
Unfortunately, this only protects me on the LAN side, since the WAN is still handled through a PCI card that is in truth only a glorified Chinese router using a very old 2.4 kernel, which makes me shiver to think about. Luckily there is no “trusted” path from there to the rest of the LAN. On the other hand, if somebody happens to have an ADSL2+ card that can be used with a standard Linux system, with the standard kernel and especially no extra modules, then I’d be plenty grateful.
More details on how I proceeded to configure the router will come in further posts; this one is long enough on its own!
Hi Diego, have you tried the Viking PCI ADSL2+ card?
As far as I can tell, the Viking card is just the same as what I got, just with a much better firmware, but it also costs two or three times what I paid for this one… I wouldn’t consider either the final solution I need… Unfortunately all the ADSL2+ cards that would work like I’d want seem to be 2- or 4-port and thus waaay too expensive for me to buy for home. Maybe I should look in the second-hand market…
By “scaler” do you mean scaling the Celeron D’s clock? If so, I must tell you that those Celerons lack SpeedStep.
Heh, thanks Leon, then I wasn’t doing something wrong. I guess it might not have been the most sane choice at the time then… it was the cheapest though. In a bit I might simply replace this box with a slightly different one, possibly less power-hungry… If only they had 64-bit Atom boards with multiple PCI slots :/
Or to be precise “if only they had a decent price in Italy”; I actually found a couple of viable boards but they had scary prices… and most seem to lack EIDE controllers (which allow for the passive usage of CF cards).
The problem with these low-power boards is that most of the power is sucked up by the on-board graphics card, even if the box is headless. They can enter low-power mode (if they can at all in Linux) only after X has started, because that is when the graphics modules are actually loaded. I think KMS fixes this, but all open-source drivers have bad power management anyway. The boards that have “server-class low power graphics” are expensive.
Well, there are some Atom boards with multiple PCI(-e) ports, e.g. http://www.newegg.com/Produ…
“as I wrote about it before” is a link that leads in a circle back to this post. I was trying to find what board you had. I’ve a NIB P4 541 SL8J2 Malay 3.2 socket 775 sitting on my desk but no board I could ship. Don’t know if it will fit.
also a P4 2.4 SL88F both purchased from someone going out of business
Diego, what I’ve been working on is an embedded (Gentoo) solution, on either x86 or ARM processors. When my hardware is ready and embedded Gentoo is happy, I’d like you to review what I propose for netfilter on an embedded device that sits behind the outside transparent bridge, which is also a (Gentoo) embedded system. We need a team to make this a commodity for the greater Gentoo community circa GNAP: http://www.gentoo.org/proj/… What we really need is leadership, such as what you can provide… Interested? James
James, feel free to contact me by mail if you need anything, it definitely tickles me…
Hello Flameeyes, I hope this is not seen as resurrecting a fairly old post; however, we are working on putting together our own router to handle multiple BGP links. The obvious place to start was with a stripped-down version of Gentoo and its kernel. I tried to gather as much as possible from your posts; however, it is unclear to me how the final product turned out. I would really appreciate some direction as to how to safely strip the kernel and Gentoo to the bones and build it back up using only the things needed. Kind regards, Nick.