Another personal postmortem: Heartbleed

I promised I would explain why it took me so long to get the Heartbleed problem sorted out, so here’s what was going on.

Early last month, the company I worked for in Los Angeles started vacating their cabinet at the Los Angeles hosting facility where Excelsior was also hosted. While they offered to continue hosting the server, the fact that I could not remotely log into it with the KVM was making it quite difficult for me to update the kernel and to keep updating the LXC ebuild.

I decided to bite the bullet, and enquired about hosting at their new facility, Hurricane Electric. While they have very cheap half-rack hosting, I needed to get a full cabinet to host Excelsior in. At $600/mo it is not cheap, but I can (barely) afford it right now (the positive side of being absolutely inept socially), and possibly I’ll be able to share it with someone else pretty soon.

The bright side of using Hurricane is that I can rely on nigh-infinite public addresses (IPv6), which is handy for a server like Excelsior that basically runs a farm of virtual machines. The problem is that you don’t only need a server: you also need a switch and a VPN endpoint if you are going to put a KVM in there. That’s what I did: I bought a Netgear switch and a Netgear VPN router, and installed the whole thing while I was in the Bay Area for a conference (Percona Live Conference, if you’re curious).

Unfortunately, Heartbleed was announced in between the server being shut down at the previous DC and it being fully operational in the new one; in particular, it was still in Los Angeles, but turned down. Why does that matter? Well, the vservers where this blog, the xine Bugzilla and other websites run are not powerful enough to build packages for Gentoo, so what I’ve been doing instead is building packages on Excelsior and uploading them to the vservers. This meant that to update OpenSSL, I needed Excelsior running.

Getting Excelsior running took me a full weekend and two trips to the datacenter: I couldn’t get the router to work, and after some time my colleague, who was kindly driving me there, figured out that the switch somehow did not like having port 0 (or 1, depending on how you count) set on a VLAN other than the default one; connecting the router to any other port made it work as expected. I’m still not sure why that is the case.

After that, I was able to update OpenSSL — but the problem was getting a new set of SSL certificates for all the servers. You probably don’t remember my other postmortem, but when my blog’s certificate expired, I was also in the USA, and I had no access to my StartSSL credentials, as they were only on the other laptop. The good news was that I had the same laptop I used that time with me, and I was able to log in and generate new certificates. While at it, I replaced the per-host SNI ones with a wildcard one.

The problem was with the xine certificate: the Class 2 certificate had been issued under my previous user, which I still had no access to (because I never thought I would need it), which meant I could not request a revocation of the certificate myself. Not only was StartSSL able to revoke it for me anyway, but they also did so free of charge (again, kudos to them!).

What is the takeaway from all of this? Well, for sure I need a backup build host for urgent package rebuilds; I think I may rent a more powerful vserver for that somewhere. Also I need a better way to handle my StartSSL credentials: I had my Zenbook with me only because I planned to do the datacenter work. I think I’ll order one of their USB smartcard tokens and use that instead.

I also ordered another USB SIM-sized card reader, to use with a new OpenPGP card, so expect me to advertise a second GPG key (and if I remember this time, I’ll print some business cards with the fingerprints). This should make it easier for me to access my servers even if I don’t have a trusted computer with me.

Finally, I need to set up perfect forward secrecy (PFS), but to do that I need to update Apache to version 2.4, and last time that was a problem with mod_perl. With a bit of luck I can make sure it works now and I can update. There is also the Bugzilla on xine-project that needs to be updated to version 4; hopefully I can do that tonight or tomorrow.

What happened to my SSL certificates? A personal postmortem

I know that for most people this is not going to be very interesting, but my current job is teaching me that it’s always a good idea to help people learn from your own mistakes; especially so if you let others comment on said mistakes to see what you could have done better. So here it goes.

Let’s start by saying that I’m an idiot. Last month I was clever enough to update the certificate for xine-project, which was about to expire. Unfortunately, I wasn’t so clever as to notice that the rest of my certificates were going to expire at more or less the same time. Nor did I remember that my StartSSL verification was expiring, even though last year I was in the US when that happened, and I had some trouble as my usual Italian phone number was unavailable. I actually got a notification that my certificate was expiring when I was in London last week. I promised myself to act on it as soon as I got home to Dublin, but of course I ended up forgetting about it.

And then this morning came, when I got notified via Twitter that my blog’s certificate had expired. And then the panic. I’m not in Dublin; I’m not in Ireland; I’m not even in Europe. I’m in Washington, DC at LISA ’13, without either my Italian or US phone number, without my client certificate, which was restricted to my Dell laptop which is sitting in my living room in Dublin, and of course, no longer living in Italy!

Thankfully, the StartSSL support people are great, and while they couldn’t right away verify me for Class 2 as I was before, I got at least far enough to be able to get new Class 1 certificates and start the process for Class 2 re-verification. Unfortunately, Class 1 means that I can’t have multiple hostnames in a cert, or wildcard certificates. So I decided to bite the bullet and go with SNI certificates, which basically means that each vhost now has its own certificate. Which is fine, just a bit more convoluted to set up, as I had to create a number of Certificate Signing Requests (CSRs) myself, since letting StartSSL generate the keys for 4096-bit RSA with SHA-256 takes a very long time.

Unfortunately, SNI means that there are a few people who won’t be able to access my blog any more, although most of them were already disallowed from commenting thanks to my ModSecurity Ruleset, as they would be on Windows XP with Internet Explorer (any version is affected; my ruleset would only stop IE6 from commenting). There are probably some issues for people stuck with Android 2 and the default browser as well. I’m sorry for you guys; I think Opera Mobile would work fine there, but feel free to scream at me if that is the case for you.
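To make the SNI trade-off a bit more concrete, here is a minimal sketch in Python of what a server has to do when every vhost has its own certificate: look at the hostname the client sent and pick the matching file. The hostnames, file names and the `select_certificate` helper are all hypothetical, and the wildcard fallback is only there for illustration; this is not how Apache implements it, just the general idea.

```python
from typing import Dict, Optional


def select_certificate(server_name: Optional[str],
                       certs: Dict[str, str],
                       default: str) -> str:
    """Pick the certificate file for the hostname a client sent via SNI."""
    if not server_name:
        # Clients without SNI support send no name at all, so they can
        # only ever be given the default certificate.
        return default
    name = server_name.lower().rstrip(".")
    if name in certs:
        return certs[name]
    # Fall back to a wildcard entry covering the parent domain, if any.
    _, _, parent = name.partition(".")
    wildcard = "*." + parent
    if parent and wildcard in certs:
        return certs[wildcard]
    return default


# Hypothetical vhosts and file names, for illustration only.
CERTS = {
    "blog.example.org": "blog.example.org.pem",
    "bugs.example.org": "bugs.example.org.pem",
    "*.example.net": "wildcard.example.net.pem",
}
```

The `None` branch is the reason old clients break: without a server name they get the default certificate, which will only match one of the vhosts. In Python’s own `ssl` module, logic like this could be hooked in through `SSLContext.sni_callback`.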

Unfortunately, there seems to be trouble with Firefox and with Safari at this point: both these browsers enabled OCSP by default quite a while ago, but newly minted certificates from StartSSL will fail the OCSP check for a few hours. Also there seems to be an issue with Firefox on Android, where SNI is not supported, or maybe it’s just the same OCSP problem which leads to a different error message, I’m not sure. Chrome, Safari on iOS and Opera all work fine.

What still needs to be found out is whether Planet Gentoo and NewsBlur will handle this properly. I’m not sure yet, but I’m sure I’ll find out pretty soon. Some offline RSS readers might also not support SNI; if that is the case, rather than just complaining to me, let upstream know that they are broken. I’m sure somebody is going to have good fun with that.

Before somebody points out I should have alerts about certificate expiration: yes, I know. I used to have these set up on the Icinga instance that was used by my previous employer, but I haven’t set up anything new for that since. I’m starting to do so as we speak, by building Icinga for my Puppetmaster host. I’m also going to write a reminder on my calendar to update the certificates well before they expire, given the OCSP problem noted above.
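For those who don’t want to wait for a full Icinga setup, the check itself is small. What follows is a rough sketch in Python using only the standard library, not the Icinga check I would actually deploy; the function names and the 30-day threshold are my own assumptions.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days between `now` and a certificate's notAfter timestamp.

    `not_after` uses the format returned by getpeercert(),
    e.g. "Nov 20 12:00:00 2013 GMT".
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).days


def status(days_left: int, warn_days: int = 30) -> str:
    """Map the remaining days to a monitoring-style state (assumed names)."""
    if days_left < 0:
        return "EXPIRED"
    if days_left < warn_days:
        return "WARNING"
    return "OK"


def fetch_not_after(host: str, port: int = 443) -> str:
    """Connect to host (sending its name via SNI) and return notAfter.

    Note: with the default context the handshake itself fails if the
    certificate has already expired, which is an alert in its own right.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Run from cron, something like `status(days_until_expiry(fetch_not_after("blog.example.org"), datetime.now(timezone.utc)))` (with a placeholder hostname) would have warned me weeks before Twitter did.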

Questions and comments are definitely welcome, suggestions on how to make things better are too, and if you use Flattr remember to use your email address, as good suggestions will be rewarded!

Hunting for a SSL certificate

So, in the previous chapter of my personal current odyssey I noted that I was looking into SSL certificates; last time I wrote something about it I was looking into using CACert to provide a few certificates around. But CACert has one nasty issue for me: not only is it not supported out of the box by any browser, but I have also failed up to now to find a way to get Chromium (my browser of choice) to accept it, which makes it no better than self-signed certificates for most of my aims.

Now, back at that time, Owen suggested I look into StartSSL, which is supported out of the box by most if not all the browsers out there, and offers free Class 1 certificates. Unfortunately Class 1 certificates don’t allow for multiple hostnames (subjectAltName) or wildcards, which I would have liked to have, as I have a number of vhosts on this server. On the other hand, Class 2 (which does allow them) has an affordable price ($50), so I wouldn’t have minded confirming my personal details to achieve that. The problem is that to get the validation, I need to send a scan of two photo IDs, and I only have one. I guess I’ll finally have to get a passport.

On a positive note for them, StartSSL actually replied to my tweet-rant suggesting I could use my birth certificate as secondary ID for validation. I guess this is easier to procure in the United States, at least judging from the kind of reverence Americans have for them; here I’d sincerely rather not bother going to look for it, especially because, as it is, my birth certificate does not report my full name directly (I legally changed it a few years ago, if you remember), but as an amendment.

There are, though, a few other problems that showed up while using StartSSL. The first is that it doesn’t allow you to use Chrome (or Chromium) to handle registration, because of troubles with client-side certificates. Another is that the verification of domain access is not based on the DNS hosting, but just on mail addresses: you verify the domain foo by receiving an email directed to webmaster@foo (or other email addresses, both standard ones and those taken from the domain’s WhoIs record). While it’s relatively secure, it only works if the domain can receive email, and only seems to work for verifying second-level domains.

Using the kind of verification that Google uses to verify domains would make it much nicer to verify domain ownership, and would work with subdomains as well as domains that lack email entirely. For those who don’t know how the Google domain verification works, they provide you with the name of a CNAME you have to add to your domain and point to “google.com”; since the CNAME they tell you to set up is created with a hash of your account name and the domain itself, they can ensure that you have access to the domain configuration and thus to the domain itself. I guess the problem here is just that it takes much more time for DNS to propagate than it takes an email to arrive, and having a fast way to create a new certificate is definitely a good thing about StartSSL.
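To give an idea of how such a DNS-based check could work, here is a small Python sketch of the scheme as I described it above. It is emphatically not Google’s actual algorithm: the hash construction, the 32-character label, the `verify.example.net` target and the function names are all assumptions of mine.

```python
import hashlib
from typing import Tuple


def verification_label(account: str, domain: str) -> str:
    """Derive a DNS label from the account name and the domain.

    Assumed construction: SHA-256 over "account|domain", truncated so
    that it fits comfortably within the 63-character DNS label limit.
    """
    msg = f"{account.lower()}|{domain.lower().rstrip('.')}".encode("utf-8")
    return hashlib.sha256(msg).hexdigest()[:32]


def expected_record(account: str, domain: str,
                    target: str = "verify.example.net") -> Tuple[str, str]:
    """Return the (name, target) CNAME pair the domain owner must publish."""
    clean = domain.lower().rstrip(".")
    return f"{verification_label(account, clean)}.{clean}", target


def is_verified(observed_target: str, expected_target: str) -> bool:
    """Compare the CNAME target found in DNS with the expected one.

    The DNS lookup itself is left out; pass in whatever your resolver
    returned for the record name.
    """
    def normalise(name: str) -> str:
        return name.lower().rstrip(".")
    return normalise(observed_target) == normalise(expected_target)
```

Since the label depends on both the account and the domain, the same record can’t be reused by a different account, and nothing in the process requires the domain to receive email.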

At any rate, I got a couple of certificates this way, so I finally don’t get Chrome’s warnings because of invalid certificates when I access this computer’s Transmission web interface (which I secure through an Apache reverse proxy). And I also took the time to finally secure xine’s Bugzilla with an SSL connection and certificate.

Thanks Owen, thanks StartSSL!