Looking for symbols? elfgrep to the rescue!

About three years after starting my work on Ruby-Elf I finally implemented one of the scripts I had wanted to write for the longest time: elfgrep. As the name implies, it’s a tool with a grep-like interface to look up symbols defined and used in ELF files.

I avoided writing it for a long time because scanelf (part of pax-utils) already implements a similar, but definitely not identical, feature through the -gs options. The main feature missing in scanelf is the ability to look for multiple symbols at once: it does allow you to specify multiple symbols, but then it only prints the first one found, rather than all of them.

The other night, mru from FFmpeg pointed out another limitation of scanelf to me: it cannot be used to look for symbols based on their version information (on GNU systems). So I finally decided to start writing my own. Thankfully, Ruby-Elf was designed to be easy to extend, if anything, so the original implementation that did just the job it was aimed at only required 83 lines of Ruby code, including my license header.

Right now the implementation is a bit more complex, and so it has more lines of code, but it implements a number of switches analogous to those of grep itself, which makes it a very flexible tool to find both definitions and uses of symbols: you can either look for the library defining a given symbol or for the objects making use of it; you can get the type of the symbols (the output is similar to nm(1)), or you can simply list the files that matched or didn’t match. You can also count symbols without having to go through wc -l, thanks to the -c option, and the list output is suitable for use with xargs -0 as well.

Most of the time, when analysing a library, I end up having to do something like nm | grep; this unfortunately doesn’t work that well when you have multiple files, as you lose track of which file actually matches; elfgrep solves this just fine by prefixing the file’s path to the nm-like output, which makes it terrific for identifying which object file exports a given symbol, for instance.

All in all, I’m very happy with how elfgrep turned out, so I’ll likely try to make a release of Ruby-Elf soonish; but to do so I have to make it a Ruby Gem, just for the sake of ease of distribution; I’ll look into it in the next week or so. In the meantime you can find the sources on the project’s page, and my overlay has an ebuild that installs it from Git until I make a release (I’ll package it in the main tree as soon as there is one!).

If you have any particular comment, patch, request, or anything like that, feel free to send me an email; you’ll find the references above.

Hide those symbols!

Last week I wrote in passing about my old linking-collision script. Since then I have restarted working on it, and I have a few more comments to make.

First of all, you might have seen the 2008 GhostScript bug — this is a funny story; back in 2008, when I started working on finding and killing symbol collisions between libraries and programs, I filed a bug with GhostScript (the AFPL version), since it exported a symbol that was present, with the same name, in libXfont and libt1. I found that particularly critical since these aren’t libraries used in totally different applications: they are all related to rendering data.

At the time, the upstream GS developer (who happens to be one of the Xiph developers, don’t I just love those guys?) asked me to provide him with a real-world crash. Since any synthetic testcase I could come up with would look contrived, I didn’t really want to spend time trying to produce one. Instead I argued the semantics of the problem, explaining why, albeit theoretical at that point, the problem should have been solved. To no avail: the bug was closed at the time, with a threat that anyone reopening it would have their account removed.

It turns out, in 2011, that there is a program that does link together both libgs and libt1: Evince. And it crashes when you try to render a DVI document (through libt1) containing an Encapsulated PostScript (EPS) image (rendered through the GhostScript library). What a surprise! Even though the problem was known, and one upstream developer (Henry Stiles) knows that the proper fix is using unique names for internal functions and data entries, the “solution” was limited to the one colliding symbol, leaving all the others to be found, and to cause problems, in the future. Oh well.

Interestingly, most packages don’t seem to care about their internal symbols, be they libraries or final binaries. For final binaries this is usually not much of a problem, as two binaries cannot collide with one another, but that doesn’t mean a symbol couldn’t collide with one from a library — for this reason, the script now ignores symbols that collide only between executables, but keeps listing those colliding with at least one library.

Before moving on to how to hide those symbols, I’d like to point out that the Ruby-Elf project page has a Flattr button, while the sources are on GitHub (formerly Gitorious) for those who are curious.

Update (2017-04-22): as you may know, Gitorious was acquired by GitLab in 2015 and the service was shut down. So the project is now on GitHub. I also stopped using Flattr a long time ago.

You might now wonder how to hide the symbols; one way that I often suggest is to use the GCC-provided -fvisibility=hidden support — this is obviously not always an option, as you might want to support older compilers, or simply not want to start adding visibility markers to your library. Thankfully there are two other options you can make use of; one is to directly use the version script support from GNU ld (compatible with Sun’s, Apple’s and gold’s, for what it’s worth); basically you can then declare something like:

{
  global:
    func1;
    func2;
    func3;
  local: *;
}

This way only the three named functions will be exported, and everything else will be hidden. While this option works quite nicely, it often sounds too cumbersome, mostly because version scripts are designed to allow assigning multiple versions to the symbols as well. But that’s not the only option, at least if you’re using libtool.

In that case there are, once again, two separate options: one is to provide libtool with a list of exported symbols, similar to the one above but with one symbol per line (-export-symbols SYMBOL-FILE); the other is to provide a regular expression of symbols to export (-export-symbols-regex REGEX), so that you just have to name the symbols correctly to have them exported or not. This loses the advantage of multiple versions for symbols – but even that is a bit hairy, so I won’t go there – but gains the advantage of also working when generating Windows libraries, where you have to list the symbols to export.
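
For completeness, here is what the -fvisibility=hidden route mentioned earlier looks like at the source level; this is a minimal sketch with made-up function names, not code from any real library:

    /* Build as: g++ -shared -fPIC -fvisibility=hidden libfoo.cc -o libfoo.so
       With the hidden default, only symbols explicitly marked as "default"
       visibility end up exported from the shared object. */

    #define API_EXPORT __attribute__((visibility("default")))

    // Internal helper: hidden thanks to -fvisibility=hidden, so it cannot
    // collide with an equally-named function in another library.
    int internal_helper(int x) {
        return x * 2;
    }

    // Public entry point: explicitly re-exported despite the hidden default.
    API_EXPORT int func1(int x) {
        return internal_helper(x) + 1;
    }

The same effect can also be reached the other way around, keeping the default visibility and marking only the internal functions with the "hidden" attribute.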

I should add here that hiding symbols in executables should also reduce their startup time, as the runtime loader (ld.so) doesn’t need to look up a long list of symbols when preparing the binary to be executed; the same goes for libraries. So in a utopian world where each library and program only exports its tiny, required list of symbols, the system should also be snappier. Think about it.

Another good reason to use 64-bit installations: Large File Support headaches

A couple of months ago I wrote about why I made my router a 64-bit install, listing a series of reasons why 64-bit hardened systems are safer to manage than 32-bit ones, mostly because of the feature set of the CPUs themselves. What I didn’t write about that time, though, is the fact that 64-bit installs also don’t require you to deal with the curse of Large File Support (LFS).

It was over two years ago that I last wrote about this, and at the time my motivation was mostly drained by a widely known troll insisting that I got my explanation wrong. Just for the sake of not wanting to repeat the same pantomime, I’d like to thank Lars for actually getting me a copy of Advanced Programming in the Unix Environment, so that I can actually point said troll at the pages where the diagrams he referred to are: 106 to 108. And there is nothing there to corroborate his views against mine.

But now, let’s take a few steps back and look at what I’m talking about altogether.

What is Large File Support? It is a set of interfaces designed to work around the limits imposed by the original design of the POSIX file API on 32-bit systems. The original implementations of functions like open(), stat(), fseeko() and so on were designed using 32-bit data types, either signed or unsigned depending on the use case. This has the unfortunate effect of limiting a number of attributes to that boundary; the most obvious problem is the size of the files themselves: you cannot use open() to get a descriptor for a file that is bigger than 2GB, as the offsets would overflow. The inability of some of your software to process files bigger than 2GB isn’t, though, that much of a problem – after all, not all software can work with such files within reasonable resource constraints – but that’s not the worst problem you have to consider.
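
To make the 2GB boundary concrete, here is a minimal sketch (mine, not taken from any particular package) of how the LFS interfaces get enabled on a 32-bit glibc system; the file path is obviously made up:

    // Defining _FILE_OFFSET_BITS to 64 before any system header switches
    // open(), stat() and off_t to their 64-bit variants; without it, on a
    // 32-bit system, stat() on a file bigger than 2GB fails with EOVERFLOW.
    #define _FILE_OFFSET_BITS 64

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <cstdio>

    int main() {
        std::printf("sizeof(off_t) = %zu bytes\n", sizeof(off_t));  // 8 with LFS, 4 without

        struct stat st;
        if (stat("/tmp/huge-file", &st) == 0)            // hypothetical >2GB file
            std::printf("size: %lld bytes\n", static_cast<long long>(st.st_size));
        else
            std::perror("stat");                         // EOVERFLOW without LFS
        return 0;
    }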

Because of this limit on file size, the new set of interfaces has always been called “large file”, but the name itself is a bit of a misnomer; this new set of interfaces, with extended 64-bit parameters and data fields, is required for operating on large filesystems as well. I might not have expressed it in the most comprehensible of terms two years ago, so let’s go through it from scratch again.

In a filesystem, the files’ data and metadata are tied to structures called inodes; each inode has an individual number; this number is listed within the content of a directory to link it to the files it contains. The number of files that can be created on a filesystem is limited by the number of unique inode numbers that the filesystem is able to cope with — you need at least one inode per file; you can check the status with df -i. This amount is in turn tied both to the size of the data field itself and to the data structure used to look up the location of the inode over the filesystem. Because of this, the ext3 filesystem does not even reach the 32-bit limit. On the other hand, both XFS and ext4, using more modern data structures, can reach that limit just fine… and they are actually designed to overcome it altogether.

Now, the fact that they are designed to support a 64-bit inode number field does not mean that they always will; for what it’s worth, XFS is designed to support block sizes over 4KiB, up to 64KiB, but the Linux kernel does not support that feature. On the other hand, as I said, the support is there to be used in the future. Unfortunately this cannot feasibly be done until we know for sure that the userland software will work with such a filesystem. It is one thing to be unable to open a huge file, it is another not to be able to interact in any way with files within a huge filesystem. This is why both Eric and I, in the previous post, focused first on testing what software was still using the old stat() calls with the data structure carrying a 32-bit inode number field. It’s not about the single file size, it’s a matter of huge filesystem support.

Now, let’s wander back to why I wanted to return to this topic. With my current line of work I discovered at least one package in Gentoo (bsdiff) that was supposed to have LFS support, but didn’t, because of a simple mistake (append-lfs-flags acts on CPPFLAGS, but that variable wasn’t used in the build at all). I thought a bit about it, and there are many ways to sneak in a mistake that causes a package to lose LFS support even if it was there at first. For instance, in a package based on autotools that uses AC_SYS_LARGEFILE to look for proper largefile support, it is easy to forget to include config.h before any other system header, and when that happens, the largefile support is lost.
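
The include-order pitfall is easy to reproduce; the following sketch of mine simulates a config.h that gets included too late (on a 64-bit system off_t is 64-bit regardless, so the demonstration only makes sense on 32-bit):

    #include <sys/types.h>        // off_t is already fixed to its 32-bit layout here...
    #define _FILE_OFFSET_BITS 64  // ...so this definition (think of a config.h included
                                  // after the system headers) arrives too late and is
                                  // silently ignored.
    #include <cstdio>

    int main() {
        // Prints 4 on a 32-bit glibc system: largefile support was lost
        // even though the macro is defined.
        std::printf("sizeof(off_t) = %zu\n", sizeof(off_t));
        return 0;
    }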

To make it easier to identify packages that might have problems, I’ve decided to implement a tool for this in my Ruby-Elf project, called verify-lfs.rb, which checks for the presence of non-LFS symbols, as well as for a mix of both LFS and non-LFS interfaces. The code is available on Gitorious, although I have yet to write a man page, and I still have to add a recursive scan option as well.

Finally, as the title suggests, if you are using a 64-bit Linux system you don’t have to think about this at all: modern 64-bit architectures define the original ABI as 64-bit already, making all the largefile support headaches irrelevant. The same goes for FreeBSD, which implemented the LFS interface as its only interface with version 5, avoiding the whole mess of conditionality.

I’m seriously scared of what I could see if I were to run my script over the (32-bit) tinderbox. Sigh.

C++ name demangling

I’ve been having some off time (well, mostly time I needed to spend on something that kept me away from facebooker craziness, since that was quite literally driving me crazy), and I decided to get more work done on Ruby-Elf (which now has its own page on my site — and a Flattr button as well, like this blog and Autotools Mythbuster, thanks to Sebastian).

What I’m working on right now is support for C++ name demangling, in pure Ruby; the reason is that I wanted to try something “easier” before moving on to implementing the full DWARF specification in Ruby-Elf. And my reason for wishing for a DWARF parser in there is that Måns confirmed my suspicion that it is possible to statically analyse the size of the stack used by a function (well, with some limitations, but let’s not dig into that right now). At any rate I decided to take a stab at it because it would come in useful for the other tools in Ruby-Elf.

Now, while I could assume that most of my readers already know what a demangler is (and thus what mangling is), I’ll introduce it for all the others who would otherwise end up bored by my writing. C++ poses a much harder challenge for linkers and loaders, because symbols are no longer identified just by their “simple” name; C++ symbols have to encode namespaces and class levels, a number of special functions (the operators), and, in the case of functions and operators, the list of parameters (because two functions with the same name but different sets of parameters are valid in C++). To do so, all compilers implement some kind of scheme to translate the symbols into identifiers that use a very limited subset of ASCII characters.

Different compilers use different schemes to do so, sometimes even differing depending on the operating system they are building on (mostly for compatibility with other “native” compilers on that platform). The scheme you commonly find in Linux binaries is the so-called GCC3 mangling, which is used, as the name suggests, by GCC 3 and later, and by ICC on Linux (and, iirc, OS X). The complete definition of the mangling scheme is available as part of the Itanium ABI definition (thanks to Luca who found me the link to that document); you can recognize the symbols mangled under this algorithm by their starting with _Z. Other mangling schemes are described among other documentation — link provided by Dark Shikari.

While standardising on the same mangling algorithm has been proposed many times, compilers make use of the difference in mangling schemes to prevent the risk of cross-ABI linking. And don’t expect compilers to standardise their ABIs anytime soon.

At any rate, the algorithm itself looks easy at first glance; most names are encoded as the length of the name in ASCII followed by the name itself, so an object bar in a namespace foo would be encoded as _ZN3foo3barE (N-E delimit the fully-qualified object name). Unfortunately, when you add more details in, the complexity increases a lot. To avoid repeating the namespace specification time after time when dealing with objects in the same namespace, or repeating the full type name when accepting objects of a function’s own class, or multiple parameters with the same type, for instance, the algorithm supports “backreferences” (registers); name fragments and type names (but not function names) are saved into these registers and then recovered with specific character sequences.
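
To give a feel for the scheme, here are a few made-up declarations together with the names the GCC3 mangling gives their definitions (you can check each of them by piping it through c++filt):

    namespace foo {
      int bar;                    // _ZN3foo3barE      the example above: N...E wraps "3foo 3bar"
      void frob(int);             // _ZN3foo4frobEi    parameter types follow the closing E
      void frob(double, char *);  // _ZN3foo4frobEdPc  overloads differ only in the parameters
    }

    void frob(int);               // _Z4frobi          no namespace, so no N...E pair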

Luckily, Ragel helps a lot in parsing this kind of specification; unfortunately I have reached a point where proceeding further definitely requires a lot more work than I would have expected. The problem is that you have recursion within the state machines: a parameter list might contain… another parameter list (for function pointers); a type might contain a type list, as part of a template… and doing this in Ragel is far from easy.

There are also shorthands used to replace common name fragments, such as std:: or std::allocator, which would otherwise be used so many times that the symbol names would become exceedingly huge. All in all, it’s quite a complex operation. And not a perfect one either: different symbols can demangle to the same string, and not even the c++filt shipping with binutils takes care of validating the symbol names it finds, so for instance _Zdl is translated to “operator delete” even though there is nothing like that in C++: you need a parameter for that operator to work as intended. I wonder if this can be used to obscure proprietary libraries’ interfaces…

Now, since Luca also asked about this, I have to add, as Måns already confirmed, that having too-long symbol names can slow down the startup of a program; in particular it increases the time needed for bindings. Even though, generally speaking, the loader will not compare the symbol name itself, but rather its hash in the hash table, the longer the symbol the more work the hash function has to do. And to give an idea of how long a name we’re talking about, take for instance the following real symbol (an exported symbol, no less) coming from Gnash:

_ZN5gnash13iterator_findERN5boost11multi_index21multi_index_containerINS_8PropertyENS1_10indexed_byINS1_14ordered_uniqueINS1_13const_mem_funIS3_RKNS_9ObjectURIEXadL_ZNKS3_3uriEvEEEEN4mpl_2naESC_EENS5_INS1_3tagINS_12PropertyList8OrderTagESC_SC_SC_SC_SC_SC_SC_SC_SC_SC_SC_SC_SC_SC_SC_SC_SC_SC_SC_EENS6_IS3_iXadL_ZNKS3_8getOrderEvEEEESC_EESC_SC_SC_SC_SC_SC_SC_SC_SC_SC_SC_SC_SC_SC_SC_SC_SC_SC_EESaIS3_EEEi

gnash::iterator_find(boost::multi_index::multi_index_container, mpl_::na, mpl_::na>, boost::multi_index::ordered_unique, boost::multi_index::const_mem_fun, mpl_::na>, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, std::allocator >&, int)

The second block is the same symbol filtered through the binutils demangler, giving the human-readable C++ prototype of the function. If you cannot guess, one of the problems is that the mangling algorithm contains no shorthand for Boost templates, contrary to what it has for the standard library templates. I know I cannot pretend that this relates directly to the problems I see with C, but it shows two things: the first is that C++ has an unseen amount of complexity that even long-time developers fail to grasp properly; the second is that ELF itself doesn’t seem well designed to handle the way C++ is compiled into code; and this is all without even looking at the recent undocumented changes introduced in glibc to make sure that C++ is implemented correctly.
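
To see why the length matters, consider the classic SysV ELF hash function as given in the ELF specification (the newer GNU hash scheme uses a different formula, but it also walks the whole name): the loop below touches every byte of the symbol, so a four-hundred-character monster like the one above costs roughly ten times as much to hash as a forty-character name, and any string comparison on a bucket hit gets proportionally longer too.

    #include <cstdio>
    #include <cstring>

    // SysV ELF hash, straight from the ELF specification.
    static unsigned long elf_hash(const char *name) {
        unsigned long h = 0, g;
        while (*name) {
            h = (h << 4) + static_cast<unsigned char>(*name++);
            if ((g = h & 0xf0000000))
                h ^= g >> 24;
            h &= ~g;
        }
        return h;
    }

    int main() {
        const char *sym = "_ZN5gnash13iterator_findE";  // truncated here for brevity
        std::printf("hash = 0x%08lx after walking %zu bytes\n",
                    elf_hash(sym), std::strlen(sym));
        return 0;
    }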

I wonder if I’ll ever be able to complete this demangler, and then start with the other schemes supported in ELF files…

Shared libraries worth their while

This is, strictly speaking, a non-Gentoo-related post; on the other hand, I’m going to introduce here a few concepts that I’ll use in a future post to explain one Gentoo-specific warning, so I’ll consider this a prerequisite. Sorry if you feel that Planet Gentoo should never cover technical non-Gentoo work, but then again, you’re free to ignore me.

I have, in the past, written about the need to handle shared code in packages that install multiple binaries (real binaries, not scripts!) to perform various tasks, binaries which end up sharing most of their code. Doing the naïve thing, compiling the source code into all of them, or the slightly less naïve thing, building a static library and linking it into all the binaries, tends to increase both the size of the commands on disk and the memory required to fully load them. In my previous post I noted a particularly nasty problem with the smbpasswd binary, which was almost twice the size it needed to be because of unused code pulled in from the static convenience library (and probably even more, given that I never went as far as hiding the symbols and cleaning them up).

In another post I also proposed the use of multicall binaries to handle these situations; the idea behind multicall binaries is that you end up with a single program with multiple “applets”: all the code is merged into a single ELF binary object, and at runtime the correct code path is taken to call the right applet, depending on the name used to invoke the binary. It’s not extremely easy, but not impossible either, to get right, so I still suggest it as the main alternative for handling shared code, when the shared code is bigger in size than each single applet’s code.
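
The dispatching itself is straightforward; a minimal sketch (with made-up applet names) looks something like this, with the package then installing the other command names as symlinks or hardlinks to the one binary:

    #include <cstdio>
    #include <cstring>

    // Two hypothetical applets sharing all the code linked into this binary.
    static int frobnicate_main(int, char **)   { std::puts("frobnicating");   return 0; }
    static int defrobnicate_main(int, char **) { std::puts("defrobnicating"); return 0; }

    int main(int argc, char **argv) {
        // Pick the applet based on the name the binary was invoked with.
        const char *name = std::strrchr(argv[0], '/');
        name = name ? name + 1 : argv[0];

        if (std::strcmp(name, "frobnicate") == 0)
            return frobnicate_main(argc, argv);
        if (std::strcmp(name, "defrobnicate") == 0)
            return defrobnicate_main(argc, argv);

        std::fprintf(stderr, "%s: unknown applet\n", name);
        return 1;
    }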

This does not solve the Samba situation, though: the code of the single utilities is still big enough that a single super-binary would not be very manageable, and a different solution has to be devised. In this case you end up having to choose between static linking (the naïve approach) or using a private shared object. An easy way out here is to try to be sophisticated and always go with the shared object approach; it definitely might not be the best option.

Let me be clear here: shared objects are not a panacea for shared-code problems. As you might have heard already, using shared objects is generally a compromise: you relax problems related to bugs and security vulnerabilities, by using a shared object, so that you don’t have to rebuild all the software using that code — and most of the time you also share read-only memory, reducing the memory consumption of the system — at the expense of load time (the loader has to do much more work), sometimes execution speed (PIC takes its toll), and sometimes memory usage, as counter-intuitive as that might sound given that I just said they reduce memory consumption.

While the load time and execution speed tolls are pretty much immediate to understand, and you can find a lot of documentation about them on the net, it’s less obvious how the same mechanism can both share and waste memory. I wrote extensively about the Copy-on-Write problem, so if you follow my blog regularly you might have guessed the problem already at this point, but that does not fill in all the gaps yet, so let me try to explain how this compromise works.

When we use ELF objects, parts of the binary file itself are shared in memory across different processes (homogeneous or heterogeneous). This means that only those parts of the ELF file that are not modified can be shared. This usually includes the executable code – text – for standard executables (and most of the code compiled with PIC support for shared objects, which is what we’re going to assume), and part (most) of the read-only data. In all cases, what breaks the sharing for us is Copy-on-Write, as that creates private copies of the pages for the single process, which is why writeable data is nothing we care about when choosing the code-sharing strategy (it’ll mostly be the same whether you link it statically or via shared objects — there are corner cases, but I won’t dig into them right now).

What was that above about homogeneous or heterogeneous processes? Well, it’s a mistake to think that the only memory that is shared in the system is due to shared objects: the read-only text and data of an ELF executable file are shared among processes spawned from the same file (what I called, and will keep calling, homogeneous processes). What shared objects accomplish is sharing memory between processes spawned by different executables that load the same shared objects (heterogeneous processes). The KSM implementation (no, it’s not KMS!) in current versions of the Linux kernel allows for something similar, but it’s such a long story that I won’t bother counting it in.

Again, at a first approach shared objects might make you think that moving whatever amount of memory from being shared between homogeneous processes to being shared between heterogeneous processes is a win-win situation. Unfortunately you have to cope with data relocations (a topic I wrote about extensively): a constant pointer is read-only when the code is always loaded at a given address (as happens with most standard ELF executables), but not when the code can be loaded at an arbitrary address (as happens with shared objects): in the latter case it ends up in the relocated data section, which follows the same destiny as the writeable data section: it’s always private to the single process!
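
A small sketch (with made-up tables) shows the difference; if you compile it with -fPIC into a shared object, the first table needs its pointers adjusted to wherever the library is loaded and ends up in .data.rel.ro, private after Copy-on-Write, while the second contains no pointers at all and stays in the shared .rodata:

    // g++ -shared -fPIC tables.cc -o libtables.so

    // An array of pointers to strings: each entry has to be relocated at
    // load time, so the whole table lands in .data.rel.ro.
    const char *const command_names[] = {
        "start", "stop", "status"
    };

    // The same strings as a flat character array: no pointers, no
    // relocations, so it stays in .rodata and is shared between processes.
    const char command_names_flat[][8] = {
        "start", "stop", "status"
    };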

*Note about relocated data: in truth you could ensure that the data relocation is the same among different processes, by using either prelinking (which is not perfect, especially with modern software, which is more and more plugin-based), or methods like KDE’s kdeinit preloading. In reality, this is not something you could, or should, rely upon, because it also goes against the strengthening of security applied by Address Space Layout Randomization.*

So when you move shared code from static linking to shared objects, you have to weigh the two factors: how much code will be left untouched by the process, and how much will be relocated? The size utility, from either elfutils or binutils, will not help you here, as it does not tell you how big the relocated data section is. My Ruby-Elf instead has an rbelf-size script that gives you the size of .data.rel.ro (another point here: you only care about the increment in size of .data.rel.ro, as that’s what gets added as private; .data.rel would be part of the writeable data anyway). You can see it in action here:

flame@yamato ruby-elf % ruby -I lib tools/rbelf-size.rb /lib/libc.so.6
     exec      data    rodata     relro       bss     total filename
   960241      4507    359020     12992     19104   1355864 /lib/libc.so.6

As you can see from this output, the C library has some 950K of executable code, 350K of read-only data (both will be shared among heterogeneous processes) and just 13K (tops) of additional relocated memory compared to static linking. Note: the rodata value does not include only .rodata but all the read-only, non-executable sections; the values of exec and rodata together roughly correspond to what size calls text.

So how does knowing the amount of relocated data help in assessing how to deal with shared code? Well, if you build your shared code as a shared object and analyse it with this method (hint: I just implemented rbelf-size -r to reduce the columns to the three types of memory we have in front of us), you’ll have a rough idea of how much gain and how much waste you’ll get memory-wise: the higher the shared-to-relocated ratio, the better results you’ll have. An infinite ratio (when there is no relocated data at all) is perfection.

Of course the next question is what you do if you have a low ratio. Well, there isn’t really a single correct answer here: you might decide to bite the bullet and go into the code to improve the ratio; cowstats, from the Ruby-Elf suite, helps you do just that. It can actually help you reduce your private sections as well, as many times there are mistakes in there due to missing const declarations. If you have already done your best to reduce the relocations, then your only remaining choice is to avoid using a library altogether: if you’re not going to improve your memory usage by using a library, and it’s something internal only, then you really should look into using either static linking or, even better, multicall binaries.

Important Notes of Caution

While I’m trying to go further on the topic of shared objects than most documentation I have read on the subject, I have to point out that I’m still generalising a lot! While the general concepts are as I put them down here, there are some specific situations that change the picture, making it much more complex: text relocations, position independent executables and PIC overhead are just some of the problems that might arise while trying to apply these general ideas to specific situations.

Still trying not to dig too deep into the topic right away, I’d like to spend a few words on the PIE problem, which I have already described and noted in the blog: when you use Position Independent Executables (which is usually done to make good use of the ASLR technique), you can discard the whole check on relocated code: almost always you’ll get good results if you use shared objects (minus the complications added by overlinking, of course). You would still get the best results by using multicall binaries if the commands have very little code.

Also, please remember that using shared objects slows down the loading process, which means that if you have a number of fire-and-forget commands, which is not too unusual in UNIX-like environments, you will probably get better results with multicall binaries, or static linking, than with shared objects. The shared memory is also something that you’ll probably get to ignore in that case, as it’s only worth its while if you normally keep the processes running for a relatively long time (and thus loaded into memory).

Finally, all I said refers to internal libraries used for sharing code among commands of the same package: while most of the same notes about performance-related up- and downsides hold true for all kinds of shared objects, you have to factor in the security and stability problems when you deal with third-party (or third-party-available) libraries — even if you develop them yourself and ship them with your package: they’ll still be used by many other projects, so you’ll have to handle them with much more care, and they should really be shared.

Complex software testing

Yamato is currently ready to start a new tinderbox run, with tests enabled (and the test-fail-continue feature, so that it does not stop the whole merge when tests do fail); I still have to launch it and I’m still not sure whether I should: besides the quite long tests for GCC, which also fail, the glibc tests not only fail but also don’t seem to fail reliably, stopping the ebuild from continuing. I wonder if this is a common trait of tests.

The main issue here is that without tests it is very difficult to identify whether the software is behaving as it should or not; as I said, not using gems helped me before, and I had plans to test an otherwise untestable piece of software (although that failed miserably). And because of the lack of testing in packages such as dev-ruby/ruby-fcgi, so-called “compatibility patches” get added that don’t really work as they are supposed to.

By having a testsuite you can easily track down issues with concurrency, arch-specific code and other similar classes of problems. Unfortunately, with software that gets complex pretty quickly, and with the need for performance overcoming the idea of splitting code into functional units, testing can get pretty ugly.

I currently have two main projects that are in dire need of testing, both failing badly right now; the first is my well-known Ruby-Elf which, while already having an extensive (maybe too extensive) unit testing suite, lacks some kind of integration testing for the various tools (cowstats, rbelf-size and so on) that can ensure that the results they report are the ones expected of them. The other project is probably one of the most complex projects I have ever worked on: feng.

Testing feng is a very interesting task, since you cannot stop at testing its functional units (which, by the way, do not exist: all the code depends one way or another on another piece of it!); you’ve got to test at the protocol level. Now, RTSP is derived from HTTP, so one could expect that using the methods employed to test HTTP would be good enough… not the case, though: while testing an HTTP server or a web application can be tricky, it’s at least an order of magnitude easier than testing a streaming server. I can write basic tests that ensure the correct behaviour of the server in response to a series of RTSP requests, but they would also have to check that the RTP data being sent is correct, and that RTCP is sent and received correctly.

As I said, it’s also pretty difficult to test the software with unit testing, because the various subsystems are not entirely isolated from one another, so testing the various pieces requires either faking the presence of the other subsystems or heavily splitting the code. They rely not only on functions but also on data structures, and on the presence of certain, coherent data inside them. Splitting the code, though, is not always an option, because it might get ugly to keep good performance out of it, or it might require very ugly interfaces to pass the data around.

Between one thing and another that I have seen lately, I’m really hoping to one day work on some project where extensive testing is a hard requirement, rather than something that I do myself, alone, and that is not essential to delivering the code. Sigh.

Movin!!

For a while I have been quoting songs, anime and other media when choosing posts’ titles; then I stopped. Today, it looks perfectly fine to quote the title of one of the Bleach anime endings, by Takacha, since it suits what my post is about… just so you know.

So, since my blog experienced technical difficulties last week, as you might know, I want to move out of the current vserver (midas), which is thankfully sponsored by IOS for the xine project, to a different server that I can manage just for the blog and a couple more things. I’m now waiting for a few answers (from IOS, to start with) to see where this blog is going to be deployed next time (I’m looking for Gentoo Linux vservers again).

The main problem is that the big, expensive factor in all this is the traffic; midas is currently serving lots of traffic: this blog alone averages over 300 MB/day, which adds up to about 10GB of traffic a month. But the big hits come from the git repositories, which means that a relatively easy way to cut down the traffic expense of the server is to move the repositories out.

For this reason I’ve migrated my overlay back onto Gentoo hardware (layman included), while Ruby-Elf is the first of my projects to be hosted at Gitorious (I’m going to add Autotools Mythbuster soon too).

As for why I decided to go with Gitorious over GitHub, the reason for me is both technical and political. Technical, because I like the interface better; political, both because of the AGPL3 license used by Gitorious and because it does not highlight the “fork it” approach that GitHub seems to have based itself on. On the other hand, I actually had difficulties finding where to clone the unofficial PulseAudio repository to prepare my local copy, and the project interface does show the “Merge Requests” counter pretty well.

At any rate there will be some of my stuff available on GitHub at the end of the day, mostly the things that started or are now maintained within GitHub, like Typo itself (for which I have quite a few changes locally, both bug fixes and behaviour changes, that I’d like to get merged upstream soonish).

This starts to look like a good training for when I’ll actually move out of home too.

Update (2017-04-22): as you may know, Gitorious was acquired by GitLab in 2015 and the service was shut down. Which not only means this post is now completely useless, but I also gave up and joined the GitHub crowd, since that service “won the war”. Unfortunately some of my content from Gitorious has been lost because I wasn’t good at keeping backups.

Security considerations: scanning for bundled libraries

My fight against bundled libraries might soon transcend the implementation limits of my ruby-elf script.

The script I’ve been using to find bundled libraries up to now was not originally designed with that in mind; the idea was to identify colliding symbols between different object files, so as to identify failure cases like xine’s AAC decoder hopefully before they become a nuisance to users, as happened with PHP. Unfortunately the amount of data the script generates because of bundled libraries makes it tremendously difficult to deal with in advance, so it can currently only be used as a post-mortem.

But as a security tool, I already stated it’s not enough, because it only checks for symbols that are exported by shared objects (and often mistakenly by executables). To actually go deeper, one would have to look at one of two options: the .symtab entries in the ELF files (which are stripped out before installing), or the data that the compiler emits for each output file in the form of DWARF sections with the -g flags. The former can be done with the same tools I’ve been using up to now; the latter you can list with pfunct from dev-util/dwarves. Trust me, though: if the current database of optimistically suppressed symbols is difficult to deal with, a search using the DWARF function data is likely to be unmanageable, at least with the same algorithm that I’m using for the collision detection script.

Being able to identify bundled libraries in that kind of output is going to be tremendously tricky; if my collision detection script already finds collisions between executables, like the ones from the MySQL (well, before Jorge’s fixes at least) and Samba packages, because they don’t use internal shared libraries, running it against the internal symbol list is going to be even worse, because it would then find equally-named internal functions (usage() anybody?), statically-linked libraries (including system support libraries) and so on.

So there is little hope of tackling the issue this way, which makes the idea of finding all the bundled libraries in a system beforehand an inhuman task; on the other hand, that doesn’t mean I have to give up on the idea. We can still make use of that data to do some kind of post-mortem, once again, with some tricks though. When it comes to vulnerabilities, you usually have a function, or a series of functions, that are involved; depending on how central the functions are in a library, there will be more or fewer applications using the vulnerable code path. While it’s not extremely difficult to track them down when the function is a direct API (just look for software having external references to that symbol), it’s quite another story with internal convenience functions, since they are called indirectly. For this reason, while some advisories do report the problematic symbols, most of the time that information is just ignored.

We can, though, use that particular piece of information to track down extra vulnerable software that bundles the code. I’ve been doing that on request for Robert a couple of times with the data produced by the collision detection scripts, but unfortunately it doesn’t help much, because it too is only able to check the externally-exported API, just like a search for its use would. How to solve the problem? Well, I could simply not strip the files and read the data from .symtab to see whether the function is defined, and this might actually be what I’m going to do soonish; unfortunately this creates a couple of issues that need to be taken care of.

The first is that the debug data is not exactly small; the second is that the chroots volume is on RAID1, so space is a concern: it’s already 100GB, with just 10% of it free, and if I don’t strip data it’s going to require even more space. I can probably split some of the data out of the volume into a throwaway chroots volume that I don’t have to keep on RAID1. If I then split out the debug data with the splitdebug feature, it would be quite easy to deal with.

Unfortunately this brings me to the second problem, or rather the second set of problems: ruby-elf does not currently support the debuglink facilities, but that’s easy to implement; after all it’s just a section containing the name of the debug file. The second is nastier and relates to the fact that the debuglink section created by Portage lists the basename of the file with the debug information, which is basically the same name as the original with a .debug suffix. The reason why this is not simply left implicit is that if you look up the debuglink for libfoo.so you’ll see that the real name might be libfoo.so.2.4.6.debug; on the other hand it’s far from trivial, since it still leaves something implicit: the path where the file is to be found. By default all tools will look in the same path as the original file, with /usr/lib/debug prepended to it. All well as long as there are no symlinks in the path, but if there are (like on multilib AMD64 systems), it starts to be a problem: accessing a shared object via /usr/lib/libfoo.so will try a read of /usr/lib/debug/usr/lib/libfoo.so.2.4.6.debug, which will not exist (it would be /usr/lib/debug/usr/lib64/libfoo.so.2.4.6.debug). I have to track down and check whether it’s feasible to use a fully canonicalised path for the debuglink; on the other hand, that would assume that the root for the file is the same as the root of the system, which might not be the case. The third option is to use a debug-root-relative path, so that the debuglink would look like usr/lib64/libfoo.so.2.4.6.debug; unfortunately I have no clue how gdb and company would take a debuglink like that, and I have to check.
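
To show where the multilib problem bites, this is a rough sketch of the lookup order that gdb documents for separate debug files (the helper and its names are mine, purely illustrative): the directory used is the one the object was accessed through, which is exactly what goes wrong when /usr/lib is a symlink to lib64.

    #include <string>
    #include <vector>

    // Candidate locations for the separate debug file, given the directory
    // the object was opened from and the basename stored in .gnu_debuglink.
    std::vector<std::string> debug_candidates(const std::string &objdir,
                                              const std::string &debuglink) {
        return {
            objdir + "/" + debuglink,                     // next to the object
            objdir + "/.debug/" + debuglink,              // .debug subdirectory
            "/usr/lib/debug" + objdir + "/" + debuglink,  // global debug root
        };
    }

    // debug_candidates("/usr/lib", "libfoo.so.2.4.6.debug") ends with
    // /usr/lib/debug/usr/lib/libfoo.so.2.4.6.debug, while the file actually
    // installed lives under /usr/lib/debug/usr/lib64/ on a multilib system.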

The problems do not stop here, though; since packages collide with one another when they try to install files with the same name (even when they are not really alternatives), I cannot rely on having all the packages installed in the tinderbox, which actually makes analysing the symbol-collision dataset even worse. So I should at least scan the data before the merge on the live filesystem is done, load it into a database indexed on a per-package, per-slot basis, and then search that data to identify the problems. Not an easy or quick solution.

Nor a complete one, to be honest: the .symtab method will not show the symbols that are not emitted, like inlined functions; while we do want the unused symbols to be cut out, we still need the names of static inlined functions, since if a vulnerability is found there, it has to be findable. I should check whether DWARF data is emitted for those at least, but I wouldn’t be surprised if it wasn’t either. And this also does not cope at all with renamed symbols, or copied code… So there is still a long way to go before we can actually reassure users that all security issues are tracked down when found (and this is not limited to Gentoo, remember: Gentoo is the base tool I use to tackle the task, but the same problems involve basically every distribution out there).

Miracle on the nth try: OpenSolaris on KVM

So after my previous post about virtualisation software I decided to spend some extra time trying out KVM, manually. Having to set the MAC address manually every time is a bit obnoxious, but thanks to an alias I can handle that at least somewhat fine.

KVM is also tremendously faster compared with QEmu 0.10 using kqemu; I’m curious to see how things will change with the new 2.6.29 kernel, where QEmu will be able to use the KVM device itself. At any rate, the speed of FreeBSD in the KVM virtual machine is almost native, and it worked quite nicely. It also doesn’t hog the CPU when it’s idling, which is quite fine too.

As I’ve written, though, OpenSolaris also refused to start; after thinking about it a bit, I considered the amount of memory and… that was it. With the default 128MB of RAM provided by KVM and QEmu, OpenSolaris cannot even start the text-mode installation. Giving it 1 GB of memory actually made it work. Fun.

As Pavel points out in the previous post, though, the default QEmu network card will blatantly fail to work with OpenSolaris; Jürgen is right when he says that OpenSolaris is quite picky about its hardware. At any rate, the default network card for KVM (RTL8169) seems to work just fine. And networking is not laggy like it is on VirtualBox, at all.

I’ve already been working on getting Gentoo Prefix onto it, and then I’ll probably resume my work on getting FFmpeg to build, since I need that for my work on lscube. For now, though, it’s more a matter of having it installed.

Later this week I’ll probably also make use of its availability to work more on Ruby-Elf, and in particular on the two scripts I want to write to help identify ABI changes and symbol collisions inside a given executable, which I promised in the other previous post.

RDEPEND safety

I’m hoping this post is going to be useful for all the devs, and devs-to-be, who want to be sure their ebuilds have proper runtime dependencies. It sprouted from the fact that at least a few developers seem to have been oblivious to the implications of what I’m going to describe (which I outlined briefly on gentoo-core a few days ago, without any response).

First of all, I have to put my hands up and say that I’m going to focus only on binary ELF packages, and this is far from a complete check for proper runtime dependencies. Scripting code is much more difficult to check, while Java is at least somewhat simpler thanks to the Java team’s script.

So you’ve got a simple piece of software that installs ELF executables or shared libraries, and you want to make sure all the needed dependencies are listed. The most common mistake here is to check the link chain with ldd (which is just a special way to invoke the loader, dumping out the loaded libraries). This will most likely show you a huge number of false positives:

yamato ~ # ldd /usr/bin/mplayer
    linux-gate.so.1 =>  (0xf7f8d000)
    libXext.so.6 => /usr/lib/libXext.so.6 (0xf7eec000)
    libX11.so.6 => /usr/lib/libX11.so.6 (0xf7dfd000)
    libpthread.so.0 => /lib/libpthread.so.0 (0xf7de5000)
    libXss.so.1 => /usr/lib/libXss.so.1 (0xf7de1000)
    libXv.so.1 => /usr/lib/libXv.so.1 (0xf7ddb000)
    libXxf86vm.so.1 => /usr/lib/libXxf86vm.so.1 (0xf7dd4000)
    libvga.so.1 => /usr/lib/libvga.so.1 (0xf7d52000)
    libfaac.so.0 => /usr/lib/libfaac.so.0 (0xf7d40000)
    libx264.so.65 => /usr/lib/libx264.so.65 (0xf7cae000)
    libmp3lame.so.0 => /usr/lib/libmp3lame.so.0 (0xf7c37000)
    libncurses.so.5 => /lib/libncurses.so.5 (0xf7bf3000)
    libpng12.so.0 => /usr/lib/libpng12.so.0 (0xf7bcd000)
    libz.so.1 => /lib/libz.so.1 (0xf7bb9000)
    libmng.so.1 => /usr/lib/libmng.so.1 (0xf7b52000)
    libasound.so.2 => /usr/lib/libasound.so.2 (0xf7a9a000)
    libfreetype.so.6 => /usr/lib/libfreetype.so.6 (0xf7a13000)
    libfontconfig.so.1 => /usr/lib/libfontconfig.so.1 (0xf79e6000)
    libmad.so.0 => /usr/lib/libmad.so.0 (0xf79cd000)
    libtheora.so.0 => /usr/lib/libtheora.so.0 (0xf799b000)
    libm.so.6 => /lib/libm.so.6 (0xf7975000)
    libc.so.6 => /lib/libc.so.6 (0xf7832000)
    libxcb-xlib.so.0 => /usr/lib/libxcb-xlib.so.0 (0xf782f000)
    libxcb.so.1 => /usr/lib/libxcb.so.1 (0xf7815000)
    libdl.so.2 => /lib/libdl.so.2 (0xf7810000)
    /lib/ld-linux.so.2 (0xf7f71000)
    libjpeg.so.62 => /usr/lib/libjpeg.so.62 (0xf77ef000)
    librt.so.1 => /lib/librt.so.1 (0xf77e6000)
    libexpat.so.1 => /usr/lib/libexpat.so.1 (0xf77bf000)
    libogg.so.0 => /usr/lib/libogg.so.0 (0xf77b9000)
    libXau.so.6 => /usr/lib/libXau.so.6 (0xf77b4000)
    libXdmcp.so.6 => /usr/lib/libXdmcp.so.6 (0xf77ae000)

In this output, for instance, you can see listed the XCB libraries, and Expat, so you could assume that MPlayer depends on those. It really doesn’t, though: they are just indirect dependencies that the loader has to load anyway. To avoid being fooled by that, the solution is to check the file itself for the DT_NEEDED entries in its .dynamic section. This can be achieved by checking the output of readelf -d, or much more quickly by using scanelf -n:

yamato ~ # scanelf -n /usr/bin/mplayer
 TYPE   NEEDED FILE 
ET_EXEC libXext.so.6,libX11.so.6,libpthread.so.0,libXss.so.1,libXv.so.1,libXxf86vm.so.1,libvga.so.1,libfaac.so.0,libx264.so.65,libmp3lame.so.0,libncurses.so.5,libpng12.so.0,libz.so.1,libmng.so.1,libasound.so.2,libfreetype.so.6,libfontconfig.so.1,libmad.so.0,libtheora.so.0,libm.so.6,libc.so.6 /usr/bin/mplayer 

As you can see here, MPlayer uses neither of those libraries, which means they should not be in MPlayer’s RDEPEND. There is, though, another common mistake here. If you don’t use --as-needed (especially if you’re not forcing it), you’re going to get indirect and misguided dependencies. So you can only trust DT_NEEDED when the system has been built with --as-needed from the start. This is not always the case, and thus you can get polluted dependencies. And given that the linker now silently ignores --as-needed on broken libraries, this is likely to create a bit of a stir.

One of the entries in my ever-so-long TODO list (explicit requests for tasks when donating help, just so you know) is to write a ruby-elf based script that can check the dependencies without requiring the whole system to be built with --as-needed. It would probably be a lot like the script that Serkan pointed me at for Java, but for ELF files.

Once you’ve got the dependencies seen by the loader right, though, your task is not complete yet. A program has more dependencies than it might appear to have, since it might require data files to be opened, like icon themes and similar, but also more important dependencies in the form of other programs or libraries. And that is not always obvious. While you can check whether the software uses the dlopen() interface to dynamically load further libraries, again using scanelf, that is not going to tell you much, and you have to check the source code. A program can also call another by way of the exec family of functions, or through system(). And even if your program does not call any of these functions, you cannot be sure that you’ve got the complete dependencies right without opening it up.
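
As an example of the dlopen() case, the following sketch (the plugin name and entry point are invented) loads a library that will never show up among the DT_NEEDED entries, so neither ldd nor scanelf -n will ever report it:

    #include <cstdio>
    #include <dlfcn.h>   // link with -ldl on older glibc

    int main() {
        // The dependency is just a string here: invisible to DT_NEEDED.
        void *handle = dlopen("libmyplugin.so", RTLD_NOW);
        if (!handle) {
            std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
            return 1;
        }

        // Entry points are looked up by name instead of being linked in.
        void (*init)() = reinterpret_cast<void (*)()>(dlsym(handle, "plugin_init"));
        if (init)
            init();

        dlclose(handle);
        return 0;
    }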

This is because libraries add indirection to these things too. The GModule interface in GLib allows for dynamically loading plugins, and can actually load plugins you don’t see and check, and Qt provides (or used to provide) a QProcess class that allows other software to be executed.

All in all, even for non-scripting programs, you really need to pay attention to the sources to be sure that you’ve got your dependencies right, and you should never ever rely purely on the output of a script. This is another reason why I think most work in Gentoo cannot be fully automated, not just yet at least. At any rate, I’m hoping to provide developers with a usable script one day soonish; at least it’ll be a step closer than things are now.