New Ruby eclasses: successes and setbacks

As with any other project involving non-trivial effort, the new Ruby eclasses (ruby-ng and ruby-fakegem), which I started writing last May, will not bring only successes in the short term; they will also suffer some setbacks from time to time.

In this case, what is working out well is the new list of dependencies: we can easily make sure that all the packages needed for an extension to work are installed for the selected implementations. This really reduces the amount of work that both we and the users have to deal with when supporting multiple implementations.

What has been set back is proper, unmasked support for Ruby 1.9 and JRuby. The reasons for these setbacks are quite varied: we don’t have the latest version of JRuby available, for instance (1.4 versus the 1.3.1 we have), because it fails its own tests; we’re also lacking the latest 1.9 version of MRI (the original Ruby, that is), because Alex found a quite nasty segmentation fault (likely a race condition). On top of that, some gems have broken tests that don’t take Ruby 1.9 behaviour into account, and others are simply broken with anything that is not MRI 1.8.

It’s interesting to note that a lot of people insist that Ruby 1.9 is fine, even production ready, to the point that Daniel Robbins made it the default in Funtoo if I recall correctly, while even some pretty basic packages fail their tests badly when run (or built) against 1.9, including quite a few dependencies of Rails. And even when packages have some kind of support for Ruby 1.9, they might uncover bugs from time to time.

One such case happened with ruby-prof: you might have noticed that it was bumped a couple of times in Gentoo in the last few days; the reason for that is that I was able to get upstream to fix up the tests so that they could be run with Ruby 1.9; while doing that, though, they found a bug in Ruby and after that, another one that Roger is still fleshing out for reporting.

As far as JRuby is concerned, the main problem, as one might guess, is interacting with native extensions: you cannot use native extensions directly in JRuby, obviously, since it’s written in Java. But there are some ways around that, as upstream reports: you can write a pure-Ruby implementation of your extension; you can use the dl extension to load a native library directly and access its functions; or you can write the “native” part (which is usually the performance-bound part) in Java rather than C and have it load a bytecode JAR as if it were a native extension.
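To make the first of those options concrete, here is a minimal sketch of the usual "try the native extension, fall back to pure Ruby" loading pattern. The extension name, the FastSum module and its Native constant are all hypothetical, invented only for illustration; no real gem is being described.

```ruby
# Hypothetical extension name: nothing with this name is assumed to exist,
# so on any interpreter this sketch ends up using the pure-Ruby fallback.
NATIVE_EXTENSION = "hypothetical_fastsum_ext"

module FastSum
  # Pure-Ruby fallback, usable on any implementation (MRI, JRuby, ...).
  module PureRuby
    def self.sum(values)
      values.inject(0) { |total, value| total + value }
    end
  end

  begin
    require NATIVE_EXTENSION
    # A real native extension (C for MRI, a JAR for JRuby) would be
    # expected to define FastSum::Native with the same interface.
    IMPLEMENTATION = Native
  rescue LoadError
    IMPLEMENTATION = PureRuby
  end

  # Callers never need to know which implementation was loaded.
  def self.sum(values)
    IMPLEMENTATION.sum(values)
  end
end

puts FastSum.sum([1, 2, 3])
```

The catch is that the fallback branch only runs where the native extension is missing, so a test suite that is only ever run on MRI never exercises it.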

The problem here is that most developers out there are unlikely to pay equal attention to all the implementations, which often results in the “fallback” versions (pure Ruby and dl) not being properly tested, if at all. And for those who do provide Java-based code for their extensions, it gets even murkier. Some extensions only package in the gem the sources for either MRI (and compatible implementations) or JRuby, so you can’t just use one gem and build for both implementations (this is the case for Redcloth, which is further complicated still; see later on). Others only package the pre-built jar file, and not the sources, in the distributed tarball. Others again, as is the case for rcov, ship with both a pre-built jar and the sources, but trying to build a new jar fails badly. And it doesn’t stop there, as things are complicated by a lot of developers still only using Ruby 1.8.6 rather than 1.8.7 (which presents different failures, for instance with rcov again).

With Redcloth, the case is even worse: like Bluecloth, the modern extension can make use of a Ragel-based parser. It turns out that I’m one of the two maintainers in Gentoo for Ragel (because of feng), so I would have no problem with making sure that it would be able to rebuild the parsers (eventually providing options to choose the method for the state machine generation). Unfortunately the source Ragel files (.rl) are missing from both the downloadable packages (gem and tarball) and the Git repository (which is pretty nasty). I tried contacting the author via Twitter (given that I couldn’t find an email address), but I have had no answer yet in about ten days.

But this is all talking about the setbacks; there are some successes too, which is actually the only reason I can be writing this here, now. Yes, because to make sure that the new fakegem-based ebuilds do work fine in production, my blog is moving to the new ebuilds as they become available. And today in particular, I put into production both the new ActiveRecord (and ActiveSupport) ebuilds and a new package: dev-ruby/pg, which took the place of dev-ruby/ruby-postgres.

So if you can read this post, it means that my setup works, and thus that the ebuilds we’re working on are not totally broken. And this, in my book, counts as success.

The battles and the losses

In the past years I picked up more than a couple of “battles” to improve Free Software quality all over. Some of these were controversial, like --as-needed, and some of them have been just lost causes (like trying to get rid of C++ strict requirements on server systems). All of those, though, were fought with the hope of improving the situation all over, and sometimes the few accomplishments were quite a satisfaction by themselves.

I always thought that my battle for --as-needed support was going to be controversial because it does make a lot of software require fixes, but strangely enough, this has been reduced a lot. Most of the newly released software works out of the box with --as-needed, although there are some interesting exceptions, like GhostScript and libvirt. Among the positive exceptions, there is for instance Luis R. Rodriguez, who made a new release of crda just to apply an --as-needed fix for a failure that was introduced in the previous release. It’s very refreshing to see that nowadays maintainers of core packages like these are concerned with these issues. I’m sure that when I started working on --as-needed nobody would have made a new point release just to address such an issue.

This makes it much more likely that I’ll work on adding the warning to the new --as-needed, and makes it even more necessary for me to find out why ld fails to link PulseAudio libraries even though I’d have expected it to.

Another class of changes that I’ve been working on, and that has attracted more interest than I would have expected, is my work on cowstats which, for the sake of self-interest, formed most of the changes in the ALSA 1.0.19 release as far as the userland part of the packages is concerned (see my previous post on the matter).

On this note, I wish first to thank _notadev_ for sending me Linkers and Loaders, which is going to help me improve Ruby-Elf more and more; thanks! And since I’m speaking of Ruby-Elf, I finally decided its fate: it’ll stay. My reasoning is that, first of all, I was finally able to get it to work with both Ruby 1.8 and 1.9 by adding a single thin wrapper (which is going to be moved to Ruby-Bombe once I actually finish that); most importantly, the code is there, I don’t want to start from scratch since there is no point in that, and I think that both Ruby 1.9 and JRuby can improve by learning from each other (the first by losing the Global Interpreter Lock, the other by speeding up its startup time). And I could even decide to find time to write a C-based extension, as part of Ruby-Bombe, that takes care of byteswapping memory, maybe even using OpenMP.
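For reference, the byteswapping itself can be sketched in plain Ruby with pack and unpack. This is only an illustration of the operation, not code from Ruby-Elf or Ruby-Bombe, and the bswap16/bswap32 names are made up here.

```ruby
# Swap the byte order of a 16-bit unsigned integer:
# pack as big-endian ("n"), then read it back as little-endian ("v").
def bswap16(value)
  [value].pack("n").unpack("v").first
end

# Same idea for a 32-bit unsigned integer ("N" big-endian, "V" little-endian).
def bswap32(value)
  [value].pack("N").unpack("V").first
end

printf("%04x\n", bswap16(0x1122))      # prints 2211
printf("%08x\n", bswap32(0x11223344))  # prints 44332211
```

Doing this one value at a time through a temporary String is exactly the kind of overhead a C-based extension would avoid.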

Also, Ruby-Elf has been serving me well with the collision detection script, which is hard to move to something different since it really is a thin wrapper around PostgreSQL queries, and I don’t really like to deal with SQL in C. Speaking of the collision detection script, I stand by my conclusion that software sucks (but proprietary software stinks too).

Unfortunately, while there are good signs on the issue of bundled libraries, like Lennart’s concerns with the internal copies of libltdl in both PulseAudio (now fixed) and libcanberra (also staged for removal), the whole issue is not solved yet: there are still packages in the tree with a huge number of bundled libraries, like Avidemux and Ardour, and more are screaming to enter (thankfully they don’t always get in). -If you’d like to see the current list of collisions, I’ve uploaded the LZMA-compressed output of my script.- If you want, you can clone Ruby-Elf and send me patches to extend the suppression files, to remove further noise from the file.

At any rate I’m going to continue my tinderboxing efforts while waiting for the new disks, and work on my log analyser again. The problem with that is that I really am slow at writing Python code, so I guess it would be much easier if I were to reimplement in Ruby the few extra functions that I’m using out of Portage’s interface, or find a way to talk to Portage’s Python interface from Ruby. This is probably a good enough reason for me to stick with Ruby: sure, Python can be faster, and sure, I can get better multithreading with C and Vala, but it takes me much less time to write these things in Ruby than it would in any of the other languages. I guess it’s a problem with the mindset.

And on the other hand, if I have problems with Ruby I should probably just find time to improve the implementation; JRuby is enough evidence to show that my beef with the Ruby 1.9 runtime not supporting multithreading is an implementation issue and not a language issue.

The end of Ruby-Elf?

One thing that has always bothered me about Ruby-Elf and its tools (cowstats, the linking collision script and the rest) is that they don’t really make good use of the 8-way system I have as my main workstation. That is not really good, considering that it also means that the cowstats run after each emerge on my main system blocks on a single core, and I don’t even want to try it on the tinderbox as a whole. It also means I cannot replace scanelf with a similar script in Ruby (neither is parallelised, but the C-based scanelf is obviously faster).

To address this problem, I considered moving to JRuby as the interpreter; it already uses native threading with the 1.8 syntax, and it would have been decently good for getting cowstats multithreaded. The problem is that its startup time is considerable, and startup time wasn’t very good to begin with. So I decided to bite the bullet and try Ruby 1.9 to see how it performed.
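To give an idea of what a multithreaded cowstats could look like, here is a minimal worker-pool sketch built on Thread and Queue. The parallel_map helper is hypothetical and not part of Ruby-Elf; in the real tool the block would parse one object file, while here it just squares numbers so that the example stays self-contained.

```ruby
require "thread"

# Run the given block over every item using a fixed number of worker
# threads, returning the results in the original order.
def parallel_map(items, workers = 4, &block)
  queue = Queue.new
  items.each_with_index { |item, index| queue << [item, index] }

  results = Array.new(items.size)
  lock = Mutex.new

  threads = Array.new(workers) do
    Thread.new do
      loop do
        begin
          # Non-blocking pop: raises ThreadError once the queue is empty.
          item, index = queue.pop(true)
        rescue ThreadError
          break
        end
        value = block.call(item)
        lock.synchronize { results[index] = value }
      end
    end
  end

  threads.each { |thread| thread.join }
  results
end

p parallel_map([1, 2, 3, 4]) { |n| n * n }
```

Whether this actually spreads the work over several cores depends on the interpreter: with native threads and no global lock it can, while with green threads or a Global Interpreter Lock the workers still take turns on a single core.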

Besides some slight syntax changes, I immediately started having problems with Ruby 1.9 and Ruby-Elf. The testsuite is still written using Test::Unit, because RSpec didn’t suit my needs well at all, and for that reason I prepared an ebuild (masked for now; remember I hate overlays unless very necessary, I’ll go deeper into that issue in the next weeks hopefully) for the improved test-unit extension (not using gem, as usual). It should work with Ruby 1.8 too, although I found some test failures in the test framework itself with both 1.9 and 1.8.

The next problem was with the readbytes.rb interface I am using, which is gone in 1.9; the stated reason is that the interface is already implemented in the IO class. Unfortunately the only interface that gets near is IO#readpartial, which is not actually the same thing and has a quite different meaning in my opinion; but again, let’s not get anal about that, it could be fixed quite easily.
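One way such a fix could look is a small helper built on IO#read that raises when fewer bytes than requested are available. This is a sketch under my own assumptions: the read_exactly name is made up, and the TruncatedDataError class is modelled on what I remember of the 1.8 readbytes.rb, not copied from it.

```ruby
require "stringio"

# Raised when the stream ends before the requested number of bytes;
# the partial data stays available through #data.
class TruncatedDataError < IOError
  attr_reader :data

  def initialize(message, data)
    super(message)
    @data = data
  end
end

# Read exactly +length+ bytes from +io+ (length is expected to be positive).
# Raises EOFError if nothing is left, TruncatedDataError on a short read.
def read_exactly(io, length)
  data = io.read(length)
  raise EOFError, "end of file reached" if data.nil?
  if data.bytesize < length
    raise TruncatedDataError.new("data truncated", data)
  end
  data
end

io = StringIO.new("abcdef")
puts read_exactly(io, 4)   # prints abcd
```

A second read_exactly(io, 4) on the same stream raises TruncatedDataError carrying the remaining "ef", and a third one raises EOFError.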

What became a big problem are the changes in the String class, which is now encoding-aware and expects each element in it to be a character. While this is tremendously good, since String would then work more like a string than a character array (as is done in the underlying C language), it lacked a parallel ByteArray object for handling sequences of bytes, and a binary file interface in IO. This is a very big deal, because the whole of Ruby-Elf is doing little more than binary file parsing.
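For context, this is roughly the kind of parsing involved, sketched for the first bytes of an ELF header. It assumes a Ruby where IO#read with a length returns raw bytes and String#unpack operates on them; I have not checked it against the exact 1.9 release discussed here, and parse_elf_ident is a made-up name, not Ruby-Elf’s actual API.

```ruby
require "stringio"

# "\x7fELF" built with pack, so that the result is a binary string.
ELF_MAGIC = [0x7f, 0x45, 0x4c, 0x46].pack("C4")

# Parse the 16-byte e_ident block at the start of an ELF file and
# return the word size and byte order it declares.
def parse_elf_ident(io)
  ident = io.read(16)
  raise ArgumentError, "short read" if ident.nil? || ident.bytesize < 16
  raise ArgumentError, "not an ELF file" unless ident[0, 4] == ELF_MAGIC

  # Skip the four magic bytes, then read EI_CLASS and EI_DATA.
  elf_class, data_encoding = ident.unpack("x4CC")
  bits = { 1 => 32, 2 => 64 }[elf_class]
  raise ArgumentError, "unknown ELF class #{elf_class}" if bits.nil?

  { :bits => bits, :little_endian => (data_encoding == 1) }
end

# A fake 64-bit little-endian header, padded with zeros to 16 bytes.
header = ([0x7f, 0x45, 0x4c, 0x46, 2, 1, 1] + [0] * 9).pack("C*")
p parse_elf_ident(StringIO.new(header))
```

With a real file the StringIO would be replaced by a file opened in binary mode, for instance File.open(path, "rb").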

Now, to be honest, I didn’t spend too much time running through the changelogs of Ruby 1.9 to identify all the changes, but since the changes seem to be quite huge by themselves, and I could even get simpler binary file handling in PHP than what I’ve seen in the two hours I spent trying to force Ruby 1.9 into submission, I’m afraid to say that Ruby-Elf is going to stagnate and I’ll end up looking at a different language to implement the whole thing.

Luca suggested (as usual) that I look at Python, but I don’t really like Python that much myself. While the forced indentation may help to make the code more readable, take a look at Brian’s (ferringb) code and you will never again say that Perl is the only SSL-based language (sorry Brian, but you know that I find your code quite encrypted). I’m sincerely considering the idea of moving to C#, given that the whole Mono runtime adds less startup overhead than Java/JRuby would.

Or I could go for the alternative route and just try to write it in Objective C.