If you have followed my blog since I started writing, you might remember my post about imported libraries from last January and the follow-up related to OpenOffice. You might also know that I started some major work toward identifying imported libraries with my collision detection script, and that I postponed it until I had enough horsepower to run the script again.
This is another reason why I’m installing as many packages as possible in my testing chroot. Of course, the primary reason was to test for --as-needed support, but I’ve also been busy checking builds with glibc 2.8, GCC 4.3, and more recently glibc 2.9. On top of that, the build is also providing me with some data about imported libraries.
With this simple piece of script, I’m doing a very rough cut analysis of the software that gets installed, to check for the most commonly imported libraries: zlib, expat, bz2lib, libpng, jpeg, and FFmpeg:
# Start from a clean log for this package.
rm -f "${T}"/flameeyes-scanelf-bundled.log
# Look for one telltale symbol per commonly bundled library in the installed image.
for symbol in adler32 BZ2_decompress jpeg_mem_init XML_Parse avcodec_init png_get_libpng_ver; do
    scanelf -qRs "+${symbol}" "${D}" >> "${T}"/flameeyes-scanelf-bundled.log
done
# Warn only if at least one match was logged.
if [[ -s "${T}"/flameeyes-scanelf-bundled.log ]]; then
    ewarn "Flameeyes QA Warning! Possibly bundled libraries"
    cat "${T}"/flameeyes-scanelf-bundled.log
fi
This checks for some symbols that are usually not present without the rest of the library, and although it gives a few false positives, it does produce interesting results. For instance, while I knew FFmpeg is very often imported, and I expected zlib to be copied into every other piece of software, it’s interesting to learn that expat is used as much as zlib, and that it is imported every time rather than used from the system. This goes for both Free and Open Source Software and for proprietary closed-source software. The difference is that while you can fix F/OSS software, you cannot fix proprietary software.
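To make the log a bit easier to read, each marker symbol can be mapped back to the library it hints at. What follows is only a minimal sketch: the function name bundled_library_for is made up for illustration and is not part of the script above, while the mapping itself simply follows the symbol and library lists used there.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the original script): print the library
# that a given marker symbol most likely belongs to.
bundled_library_for() {
    case "$1" in
        adler32)            echo "zlib" ;;
        BZ2_decompress)     echo "bz2lib" ;;
        jpeg_mem_init)      echo "jpeg" ;;
        XML_Parse)          echo "expat" ;;
        avcodec_init)       echo "FFmpeg" ;;
        png_get_libpng_ver) echo "libpng" ;;
        *)                  echo "unknown" ;;
    esac
}

# Example: label every symbol the scan looks for.
for symbol in adler32 BZ2_decompress jpeg_mem_init XML_Parse avcodec_init png_get_libpng_ver; do
    printf '%s -> %s\n' "$symbol" "$(bundled_library_for "$symbol")"
done
```

A helper like this could be called inside the loop to prefix each match with a library name, which makes the warning easier to triage when several libraries are bundled at once.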
What is the problem with imported libraries? The basic one is that they waste space and memory, since they duplicate code already present in the system. But there is another issue: they create situations where even old, known, and widely fixed issues stay around for months, even years, after they were disclosed. What has preserved proprietary software this well so far is mostly so-called security through obscurity. You usually don’t know that the code is there, and you don’t know in which code path it’s used, which makes it much harder for novices to work out how to exploit those vulnerabilities. Unfortunately, this is far from being a true form of security.
Most people would now wonder: how can they mask the use of particular code? The first option is to build the library into the software, which hides it from the eyes of the most naïve researchers; since the library is never loaded explicitly, its use cannot be identified by watching which libraries get loaded. But of course the references to those libraries remain in the code, and indeed most of the time you’ll find the libraries’ symbols defined inside the executables and libraries of proprietary software, which is exactly what my rough script checks. I could use pfunct from the seven dwarves to get the data out of the DWARF debugging information, but proprietary software is obviously built without debug information, so it would just waste my time. If they used hidden visibility, finding the bundled libraries would be much, much harder.
Of course, finding which version of a library is bundled in an open source software package is trivial, since you just have to look for the header that defines the version, although expat is often stripped of the expat.h header that contains that information. On proprietary software it is quite a bit more difficult.
For this reason I produced a set of three utilities that, given a shared object, find out the version of the bundled library. As they are, they quite obviously don’t work on final executables, but it’s a start at least. Running these tools on a series of proprietary software packages that bundle the libraries caused me some kind of hysteria: lots and lots of software still uses very old zlib versions, as well as old libpng versions. The current status is worrisome.
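As a rough illustration of how a bundled version can be guessed at all, here is a sketch for zlib only. It is not one of the three utilities mentioned above and does not claim to work the way they do. It assumes that the compiled zlib code carries its usual copyright notices, of the form "deflate <version> Copyright ..." and "inflate <version> Copyright ...", and that grep supports the -a and -o options. The function name zlib_version_in is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the zlib version(s) suggested by the
# "deflate <version>" / "inflate <version>" notices found in a file.
# Prints nothing if no such notice is present.
zlib_version_in() {
    # -a treats the binary as text, -o prints only the matching part.
    LC_ALL=C grep -a -o -E '(deflate|inflate) [0-9]+(\.[0-9]+)+' "$1" \
        | awk '{ print $2 }' \
        | sort -u
}

# Usage, when run directly: zlib_version_in /path/to/some/library.so
if [[ "${BASH_SOURCE[0]}" == "$0" && $# -gt 0 ]]; then
    zlib_version_in "$1"
fi
```

Unlike the symbol scan, a string search like this does not depend on the symbol table, but it can be fooled by any unrelated text that happens to match the pattern, so its output is a hint rather than proof.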
Now, can somebody really trust proprietary software at this point? The only way I can trust Free Software is by making sure I can fix it, but there are so many forks and copies and bundles and morphings that evaluating the security of the software is difficult even there; on proprietary software, where you cannot be really sure at all about the origin of the software, the embedded libraries, and stuff like that, there’s no way I can trust that.
I think I’ll try my best to improve the situation of Free Software when it comes to security too. As the IE bug demonstrated, free software solutions like Firefox can be considered working, secure alternatives even by the media, and we should try to play that card much more often.