Network Security Services (NSS) and PKCS#11

Let’s clear up a big mess first. In this post I’m going to talk about dev-libs/nss or, as the title suggests, Network Security Services, which is the framework developed first by Netscape and now by the Mozilla Project to implement a number of security layers, including (especially) SSL. It should not be confused with the many other similar acronyms, especially the Name Service Switch, which is the interface that allows your applications to resolve hosts and users against databases they weren’t designed to use in the first place.

In my previous posts about smartcard-related software components – first and second – I started posting a UML components diagram that was not very detailed but generally readable. With time, and with the need to clarify my own understanding of the whole situation, the diagram has become more complex and more detailed, but arguably less readable.

In the current iteration of the diagram, a number of software projects are exploded into multiple components, as I originally did with the lone OpenCryptoki project (which I should have written about already, but I haven’t had enough time to finish cleaning it up yet). In particular, I split the NSS component in two sub-components: libnss3, which provides the actual API for applications to use, and libnssckbi, which provides access to the underlying NSS database. This is important because it shows how the NSS framework actually communicates with itself through the standard PKCS#11 interface.

Anyway, back to NSS proper. To handle multiple PKCS#11 providers – which is what you want to do if you intend to use a hardware token, or a virtual one for testing – you need to register them with NSS itself. If you’re a Firefox user, you can do that from its settings window, but if you’re a Chromium user you’re mostly out of luck as far as a GUI is concerned: the official way to deal with certificates et similia in Chromium is to use the NSS command-line utilities, available through the utils USE flag of dev-libs/nss.
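
For reference, registering a provider from the command line boils down to a modutil call pointed at the NSS database you care about (more on the various databases below); this is only a sketch, with a made-up module name, and the library path depends on which middleware you actually use (here I’m assuming OpenSC):

~ % modutil -dbdir sql:$HOME/.pki/nssdb -add "OpenSC" -libfile /usr/lib/opensc-pkcs11.so
~ % modutil -dbdir sql:$HOME/.pki/nssdb -list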

First of all, by default Mozilla, Evolution and Chromium, and the command-line utilities use three different paths to find their database: one depending on the Mozilla profile, ~/.pki/nssdb and ~/.netscape respectively. Even more importantly, by default the first and the last will use an “old” version of the database, based on the Berkeley DB interface, while the other two will use a more modern, SQLite-based database. This is troublesome.
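
You can see the split for yourself with certutil, which accepts a type prefix on the database directory: if memory serves, the legacy Berkeley DB format is selected with dbm: (or no prefix at all), while sql: selects the SQLite-based one. The paths here simply match the defaults mentioned above:

~ % certutil -d dbm:$HOME/.netscape -L
~ % certutil -d sql:$HOME/.pki/nssdb -L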

Thankfully, the Mozilla Wiki has an article on setting up a shared database for NSS, which you might want to follow to make sure you use the same set of certificates across Firefox, Chromium, Evolution and the command-line utilities. What it comes down to is just a bunch of symlinks. Read the article yourself for the instructions; I do have to note, though, that you should also do this:

~ % ln -s .pki/nssdb .netscape

This way the NSS utilities will use the correct database as well. Remember that you have to log out and log back in for the utilities and Firefox to pick up the SQL database.
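
If I recall the wiki article correctly, the re-login is needed because the switch to the SQLite format is driven by an environment variable exported from your login profile; something along these lines (the exact file depends on your shell setup):

~ % echo 'export NSS_DEFAULT_DB_TYPE="sql"' >> ~/.profile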

Unfortunately I haven’t been able to get a token to work in this environment; on one side I’m afraid I might have busted the one Eva sent me (sigh! but at least it served the purpose of getting most of this running); on the other, Scute does not allow uploading an arbitrary certificate, only generating a CSR, which I obviously can’t get signed by StartSSL (my current certificate provider). Since I’m getting paranoid about security (even more so since I’ll probably be leaving my servers in an office when I’m not around), I’ll probably be buying an Aladdin token from StartSSL though (which also means I’ll be testing out their middleware). At that point I’ll give you more details about the whole thing.

Too many alternatives are not always so good

I can be quite difficult when it comes to alternative approaches to the same problem; while I find software diversity to be an integral part of the Free Software ideal, and very helpful for finding the best approach to various situations, I’m not keen on maintaining the same code many times over because of it, and I’d rather have projects share the same code to do the same task. This is why I think using FFmpeg for almost all the multimedia projects in the Free Software world is a perfectly good thing.

Yesterday, while trying to debug (with the irreplaceable help of Jürgen) a problem I was having with Gwibber (which turned out to be an out-of-date ca-certificates tree), I noticed one strange thing with PyCURL, related to this, which proves my point, up to a point.

CURL can make use of SSL/TLS encryption with one of three possible libraries: OpenSSL, GnuTLS and Mozilla NSS. The first option is usually avoided by binary distributions because it is incompatible with some licensing terms; the third is required, for instance, by the Thunderbird binary package in Gentoo as it stands. By default Gentoo uses OpenSSL, whether you like it or not.

When CURL is built against OpenSSL (USE="ssl -gnutls -nss"), PyCURL links to libcrypto; given that my system is built with forced --as-needed, this also means it actually uses it. I found that quite strange, so I went to take a look: if you rebuild CURL (and then PyCURL) with GnuTLS (USE="ssl gnutls -nss") you’ll see that it only links to libgnutls, but if you look closer it’s using at least one libgcrypt symbol. Finally, if you build it with Mozilla NSS (USE="ssl -gnutls nss") it will warn that it didn’t detect the SSL library used.
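
If you want to verify this on your own system, the curl version banner names the SSL backend it was built against, and scanelf from pax-utils shows what the PyCURL extension module ends up linking to (the module path below is a guess, adjust it for your Python version and library directory):

~ % curl --version | head -n 1
~ % scanelf -n /usr/lib*/python*/site-packages/pycurl.so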

The problem here is that CURL does not provide a complete abstraction of the SSL implementation it uses, and for proper threading support PyCURL needs to run special code for the crypto-support library (libcrypto for OpenSSL; libgcrypt for GnuTLS). I’m sincerely not sure how big the problem would be if you mixed and matched the CURL and PyCURL implementations, and I have no idea what would happen if you were to use CURL with NSS and PyCURL built against that (which will not provide locking for crypto at all). What I can tell you is that if you change the SSL provider in CURL, you’d better rebuild PyCURL, to be on the safe side. And there is currently no way to have Portage do that automatically for you.
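
In practice, being on the safe side is a single command away once you’ve switched CURL’s USE flags:

~ % emerge --oneshot dev-python/pycurl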

And if you are using CURL with NSS and you see Portage asking you to disable it in favour of GnuTLS or OpenSSL, you’ll know why: PyCURL is likely to be your answer. At least once the bug is addressed.

Ebuilds have to be done right

There is quite some stir right now on the gentoo-dev mailing list following a mass-masking for removal of packages for QA and security reasons; I think that Alec nailed down most of the issues with his comments:

> > This thread is yet another proof that we need to introduce a “Upcoming
> > masking” for unmaintained packages.
>
> <sarcasm>
>
> Shall I file those forms in triplicate and fax them to the main office sir?
>
> </sarcasm>
>
> Since amazingly I actually started the Treecleaners project; the
> intent was actually to fix problems with packages. Part of the
> problem is that there are hundreds of packages in the tree and the
> fixes vary in complexity so it is difficult to create hard-and-fast
> rules on when to keep a package versus when to toss it. One of the
> things I like about masking is that it quickly gets people who
> actually care about the package up to bat to fix it instead of leaving
> it broken for months. I realize maintainers do not exactly enjoy this
> kind of poking, however when things have been left for long enough I
> believe our options become a bit more limited (in this case, masking
> for removal due to unfixed sec bugs.)

Now, this is an issue I already partly addressed in my post about the five minutes fix myth, but I’d like to point out again that even though we can easily spot some blatant problems with packages, having a package that compiles and passes the obvious, programmatic QA checks does not really tell you much about its health status; indeed, you won’t know whether the package works at all for the final users. Tying into another post of mine (incidentally, someone complained about my self-references to posts… should I stop giving pointers and context?), I have to admit that sometimes it’s impossible to have 100% coverage of packages, among other reasons because some packages need particular hardware, or particular software components set up, to be tested effectively. On the other hand, when such a complex setup isn’t strictly needed, we should expect some level of testing when making changes, minor or otherwise.

Sometimes the mistakes are in the messages logged by the ebuild; at other times the problem is that some important part of the package is missing, for example because the install phase is written by hand in the ebuild, and upstream has added some extra utility that is installed by make install but is obviously ignored by the ebuild (this is actually one of the points that Donnie brought up when I suggested overriding upstream build systems with an eclass: we’d have to triple-check new releases to make sure that no further source files, objects or libraries were added since the previously-packaged version). All these things are almost impossible to identify in a nice, programmatic, scripted way; they need knowledge of the package, checking the release notes, and having an idea of how to test it.

For instance, I’ve been looking into sys-libs/libnss-pgsql today, as I have an interest in it; the ebuild installs the shared library manually (skipping libtool’s relinking phase, by the way). Why did it do that? It takes four steps rather than the single one needed for make install… Well, the reason became obvious (but was not commented upon!) after changing it to use make install: a post-install check actually aborted the merge. The problem was that the package installed the Name Service Switch library in /lib, but also installed the static archive and the libtool .la file, both of which are definitely not needed in /lib. The handwritten install solves the symptoms but not the following problems:

  • it will still build the static archive (non-PIC) version, causing twice the number of compiler calls;
  • it won’t tell upstream that they forgot one thing in their Makefile.am;
  • it’s still wrong, because the libraries it links to are not available in /lib: it won’t work before /usr is mounted, if /usr is on a separate partition (who still does that, nowadays?!); it should be installed under /usr at this point (and yes, you can do that: both GNU libc and FreeBSD – which has a different NSS interface by the way – check both /lib and /usr/lib).
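
For the record, the replacement install phase doesn’t need to be anything fancier than letting make install do its job and then dropping the files that have no business being installed; something along these lines (just a sketch: the exact library name and directory depend on the package and on your libdir):

    src_install() {
        emake DESTDIR="${D}" install || die "emake install failed"
        # the static archive and the libtool file have no place next to an NSS module
        rm -f "${D}"/lib*/libnss_pgsql*.{a,la}
    }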

Incidentally, why does glibc’s default nsswitch.conf use db files for services, protocols, rpc and ethers? Their presence there means that each time you call into glibc to resolve a port name, it makes eight open() syscalls trying to find the database files. That doesn’t sound right to me.
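
For context, the stock file has entries along these lines (from memory, so double-check your own /etc/nsswitch.conf); dropping the leading db so that only files is left avoids the pointless attempts at opening database files that were never generated in the first place:

    ethers:      db files
    protocols:   db files
    rpc:         db files
    services:    db files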

I have patches and a new ebuild; I’ll see about sending them upstream and getting the ebuild committed (by someone else, or by picking up maintainership for it) in the next day or so. In the meantime I have to get back to my work.