Ruby-Elf and collision detection improvements

While my main uses of Ruby-Elf lately have been quite different (for instance with the advent of elfgrep, or helping to verify LFS support), the original reason that brought me to write that parser was finding symbol collisions (that’s almost four years ago… wow!).

And symbol collisions are indeed still a problem; as I wrote recently, it is not easy to get upstream developers to look at them, as they are mostly an indication of possible, seemingly random problems in the future.

At any rate, the original script ran overnight, generated a huge database, and then required more time to produce a readable output, all of which happened using an unbearable amount of RAM. Between the ability to run it on a much more powerful box, and the work done to refine it, it can currently scan Yamato’s host system in … 12 minutes.

The latest set of changes, which replaced the “one or two hours” execution time with the current “about ten minutes” (for the harvesting part; two more minutes are required for the analysis), was part of my big rewrite of the script so that it uses the same common class interfaces as the commands that are installed with the gem. The script is still single-threaded (more on that in a moment), but each file analysed now costs three calls to the PostgreSQL backend, rather than something in the ballpark of five plus one per symbol, and this makes it a lot faster.

To achieve this I first of all limited the round-trips between Ruby and PostgreSQL when deciding whether a file (or a symbol) has already been added or not. In the previous iteration I was already optimising this a bit by using prepared statements (which seemed slightly faster than direct queries), but they didn’t allow me to embed the logic into them, so I had a number of select statements, and insert statements depending on their results. That was bad not only because each selection required converting data types twice (from the PostgreSQL representation to C, then from that to Ruby), but also because it required calling into the database each time.

So I decided to bite the bullet and, even though I know it turns the code into a bunch of spaghetti, I’ve moved part of the logic into PostgreSQL through stored procedures. Long live PL/pgSQL.

Also, to make it more solid with respect to parsing errors on single object files, rather than queuing all the queries and then committing them in one big single transaction, I create a separate transaction to commit all the symbols of each object, as well as one when creating the indexes. This allows me to skip over objects altogether if they are broken, without stopping the whole harvesting process.
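To make the idea concrete, here is a minimal sketch of the one-transaction-per-object pattern. It is not the actual harvest.rb code: `FakeDB`, `harvest` and the file names are all hypothetical, and the in-memory `transaction` method only stands in for what a real database driver would provide.

```ruby
# Hypothetical stand-in for a database connection; a real script would
# rely on the driver's own transaction support instead.
class FakeDB
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Run the block; keep its inserts only if it finishes without raising.
  def transaction
    pending = []
    yield pending
    @rows.concat(pending)
  end
end

# Harvest each object in its own transaction, so that one broken file
# is skipped instead of aborting the whole run.
def harvest(db, objects)
  skipped = []
  objects.each do |path, symbols|
    begin
      db.transaction do |tx|
        symbols.each do |sym|
          # A nil symbol simulates a parsing error in this sketch.
          raise ArgumentError, "unparsable symbol in #{path}" if sym.nil?
          tx << [path, sym]
        end
      end
    rescue ArgumentError => e
      warn "skipping #{path}: #{e.message}"
      skipped << path
    end
  end
  skipped
end

db = FakeDB.new
skipped = harvest(db, {
  "libgood.so"   => ["foo", "bar"],
  "libbroken.so" => ["baz", nil],
  "libother.so"  => ["qux"],
})
```

In this sketch nothing from the broken object reaches the stored rows, not even the symbols that parsed fine before the error, while the objects before and after it are committed normally.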

Even after introducing the transaction on symbol harvesting, I found it much faster to run a single statement with all the symbols through PostgreSQL in a transaction. Since I cannot simply run a single INSERT INTO with multiple values (because I might hit a unique constraint, when the symbols are part of a “multiple implementations” object), I at least call the same stored procedure multiple times within the same statement. This had a tremendous effect, even though the database is accessed through Unix sockets!
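As a rough illustration of batching the calls, the sketch below builds one string of SQL that invokes a stored procedure once per symbol. The procedure name `add_symbol`, its two-argument signature and the helper names are assumptions made for the example; the real procedure and the exact shape of the statement may well differ.

```ruby
# Quote a string as an SQL literal by doubling single quotes.
# Assumption: real code should prefer the driver's own escaping routines.
def sql_literal(str)
  "'" + str.gsub("'", "''") + "'"
end

# Build one batch that calls the (hypothetical) add_symbol procedure
# once per symbol, so the whole object costs a single round-trip.
def batch_statement(object_id, symbols)
  symbols.map do |name|
    "SELECT add_symbol(#{Integer(object_id)}, #{sql_literal(name)});"
  end.join("\n")
end

sql = batch_statement(42, ["malloc", "it's_odd"])
puts sql
```

The resulting string would then be sent to the database in one call, inside the per-object transaction, instead of issuing one query per symbol.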

Since the harvest process now takes so little time to complete, compared to what it did before, I also dropped the split between harvest and analysis: analyse.rb is gone, merged into the harvest.rb script. Sooner or later I have to write a man page for it, and get it installed properly as an available tool rather than an external one.

Now, as I said before, this script is still single-threaded; all the other tools, on the other hand, are “properly multithreaded”, in the sense that their code fires up a new Ruby thread for each file to analyse, and the results are synchronised so that the threads don’t step on each other’s toes. You might already know that, at least as far as Ruby 1.8 is concerned, threading is not really implemented and green threads are used instead, which means there is no real advantage in using them; that’s definitely true. On Ruby 1.9, even though the pure-Ruby nature of Ruby-Elf makes the GIL a major obstacle, threading would still improve the situation for harvest.rb, simply by allowing threads to analyse more files while the pg backend gem sends the data over to PostgreSQL (which would probably also be helped by the “big” transactions sent right now). But what about the other tools that don’t use external extensions at all?

Well, threading elfgrep or cowstats is not really any advantage on the “usual” Ruby versions (MRI 1.8 and 1.9), but it provides a huge advantage when running them with JRuby. Since that implementation has real threads, it can scan multiple files at once (both when using asynchronous listing of input files through the standard input stream, and when providing all of them in one single sweep), and then only synchronise to output the results. This of course makes it a bit trickier to be sure that everything is being executed properly, but in general it makes the tools all the sweeter. Too bad that I can’t use JRuby right now for harvest.rb: the pg gem I’m using is not available for JRuby, so I’d have to rewrite the code to use JDBC instead.
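The thread-per-file pattern described above can be sketched roughly as follows. This is not the tools’ actual code: `scan_all` is a hypothetical helper, and counting pattern matches in a string merely stands in for the real per-file analysis.

```ruby
# Analyse every input in its own thread, taking the lock only to
# record the result.
def scan_all(inputs)
  results = {}
  lock = Mutex.new

  threads = inputs.map do |name, data|
    Thread.new do
      # The "analysis" itself runs without holding the lock.
      count = data.scan(/sym/).size
      # Only the shared hash is protected by the mutex.
      lock.synchronize { results[name] = count }
    end
  end

  # Wait for all the threads before handing back the results.
  threads.each(&:join)
  results
end

results = scan_all("a.o" => "sym sym", "b.o" => "nothing", "c.o" => "symsymsym")
```

On an implementation with real threads the analysis parts can run in parallel, and only the short synchronised section is serialised.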

Speaking of option passing, I’ve been removing some features I originally implemented. In the original implementation, argument parsing was asynchronous and incremental, without limits to recursion; this meant that you could provide a list of files preceded by the at-symbol on the standard input of the process, and each of those would in turn be scanned for… the same kind of content. This was already bad because of the possible loops, but it also had a few more problems, among which was the lack of a way to add a predefined list of targets if none was passed (which I needed for harvest.rb to behave more or less like before). I’ve since rewritten the target parsing code to only work with a single-depth search, relying on asynchronous argument passing only through the standard input, which is only used when no arguments are given, either on the command line or as a default of the script. It’s also much faster this way.
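A single-depth expansion of targets might look something like the sketch below. The `expand_targets` helper, its signature and the example paths are assumptions for illustration only, not the interface Ruby-Elf actually uses, and the standard input fallback is left out.

```ruby
require "tempfile"

# Expand targets one level deep: "@list" names a file holding one
# target per line. Entries read from a list are taken literally, so an
# "@" inside a list is never followed (no recursion, no loops).
def expand_targets(args, defaults = [])
  # Fall back to the script's predefined targets when none are given.
  return defaults if args.empty?

  args.flat_map do |arg|
    if arg.start_with?("@")
      File.readlines(arg[1..-1]).map(&:chomp).reject(&:empty?)
    else
      [arg]
    end
  end
end

# Usage with a temporary list file (the paths are only illustrative).
list = Tempfile.new("targets")
list.write("/usr/lib/libfoo.so\n@nested.list\n")
list.close

targets = expand_targets(["/bin/ls", "@#{list.path}"])
fallback = expand_targets([], ["/usr/lib"])
```

Here the nested `@nested.list` entry comes back as a plain target name, which is exactly what removes the risk of loops.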

For today I guess all these notes about Ruby-Elf are enough; in the next few days I hope to provide some more details about the information the script is giving me. They aren’t exactly fun, and they aren’t exactly the kind of things you wanted to know about your system. But I guess this is a story for another day.

Ruby-elf and documentation

After my checklist post I got asked for some documentation about ruby-elf tools like cowstats and missingstatic.

As it turns out I wrote little to no documentation at all, and I relied exclusively on the scripts being self-documenting, for the most part. Probably not a good idea if I want to have a broader audience.

For this reason, I think I’ll start by writing some man pages for the tools, hopefully today or tomorrow, before I get to the hospital again. I’ll also see about actually releasing a version so I can add it to portage too, so that it’s actually available for developers who are interested (for now you can get it from my overlay as dev-ruby/ruby-elf-9999).

I also started working on improving the way cowstats decides whether a symbol is in a copy-on-write section or not. Before, I only used the name of the section and, as it turns out, I used to ignore the TLS sections (no, not the SSL successor, but thread-local storage).

The TLS problem is solved now, but I decided that using the name of the section to decide whether it’s CoW or not is not very reliable. I added code that checks the type and the flags of the sections, to an extent, so that it automatically ignores all the sections containing executable code, and all the read-only sections. It also identifies .bss and equivalent sections by type rather than by name (if I had done this from the start, I would have supported .tbss right away too).
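A check based on type and flags could be sketched like this. The numeric constants are the standard ELF values for the section type and flags; the `classify_section` function and its return values are hypothetical and do not reflect the actual cowstats code or the Ruby-Elf API.

```ruby
# Standard ELF section type and flag values.
SHT_PROGBITS  = 1
SHT_NOBITS    = 8
SHF_WRITE     = 0x1
SHF_ALLOC     = 0x2
SHF_EXECINSTR = 0x4
SHF_TLS       = 0x400

# Hypothetical classifier: decide from type and flags alone whether a
# section can hold copy-on-write data.
def classify_section(type, flags)
  return :ignored if (flags & SHF_ALLOC).zero?      # not loaded at runtime
  return :ignored unless (flags & SHF_EXECINSTR).zero? # executable code
  return :ignored if (flags & SHF_WRITE).zero?      # read-only data

  # .bss, .tbss and equivalents occupy no space in the file.
  type == SHT_NOBITS ? :bss : :data
end

text  = classify_section(SHT_PROGBITS, SHF_ALLOC | SHF_EXECINSTR)
data  = classify_section(SHT_PROGBITS, SHF_ALLOC | SHF_WRITE)
tbss  = classify_section(SHT_NOBITS, SHF_ALLOC | SHF_WRITE | SHF_TLS)
```

Since .tbss differs from .bss only by the extra TLS flag, a type-based check picks it up without needing to know its name.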

On a different note, I forgot to write that while I was hospitalised, my Nokia decided to go crazy and corrupted the fring app I was using to chat from the E61 itself. I think (and on one hand hope) that the MiniSD I was using was broken, because then the rest of the phone would be fine. The problem is that the internal memory is very tiny, and the MiniSD that Nokia gave me with the phone, which I just put back in it, is half full of Nokia’s own software, like the MailForExchange launcher or TravelMate (which I don’t care about). I think I’ll have to pick up a new MiniSD, hoping that will work. Last time I bought a Corsair 1GB; this time I think I’ll stick with a Transcend one, as they never failed me up to now. Interestingly enough, at my supplier the MiniSD card would be pretty cheap (€5) while the shipping costs would be over that price. I should check if they have cheap SD cards too; in the stores around here they are still tremendously expensive (€10 for a 2GB card!).