Bundler and JRuby — A Ruby Rant

This post is supposed to be auto-posted on Saturday, March 3rd, since I’m testing the new Typo 6 — I used to write posts in advance and let them be posted, but then the support broke, I hope the new version supports it again. If not, it might be a good excuse to fork Typo altogether.

So after a quick poll on Twitter it seems like people are not really disliking my series of Ruby rants, so I think I’ll continue writing a bit more about them. Let me start with making it clear that I’m writing this with the intent of showing the shortcoming of a development environment – and a developers circle – that has closed itself up so much lately that everything is good as long as some other Ruby developer did it before. This is clearly shown on some of the posts defending bundler in the reddit comments about one of my previous posts.

For what concerns Bundler, I don’t think I’m blaming the wrong technology. Sure, Bundler has simplified quite a bit the handling of dependencies and that’s true for Gentoo developers as well (it’s still better than the stupid bundling done by the Rails components in the 2.3 series), but the problem here is that it’s way too prone to abuse, especially due to the use of “one and only one” dependencies. I could show you exactly what I mean by simply pointing out that the bourne I already complained about depends on a version of mocha (0.10.4) that is broken on Ruby 1.9 (neither 0.10.3 nor 0.10.5 suffers from this problem, and bourne works fine with either, but they still depend on the only broken one because that was out when they released bourne-1.1.0), but I’ll show you exactly how screwed the system is with a different issue altogether.
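To make the “one and only one” problem concrete, here is a minimal sketch using the requirement classes that ship with RubyGems. The helper name allowed_versions is made up for the illustration, and the requirement strings are examples of the two styles rather than a copy of bourne’s actual gemspec.

```ruby
require 'rubygems'

# Hypothetical helper: return the candidate versions that satisfy
# all of the given requirement strings.
def allowed_versions(candidates, *requirements)
  req = Gem::Requirement.new(*requirements)
  candidates.select { |v| req.satisfied_by?(Gem::Version.new(v)) }
end

releases = %w[0.10.3 0.10.4 0.10.5]

# An exact pin accepts a single release, even if that one is the broken one.
exact = allowed_versions(releases, '= 0.10.4')

# A pessimistic constraint accepts the whole 0.10.x series from 0.10.3 on.
series = allowed_versions(releases, '~> 0.10.3')

# A known-bad release can be excluded without pinning to one version.
without_broken = allowed_versions(releases, '~> 0.10.3', '!= 0.10.4')
```

With the exact pin, a packager has no choice but to ship the broken release; the other two forms leave room to pick a working one.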

Can you vouch that Bundler works on JRuby just as well as it works on MRI 1.8 and 1.9?

If you just answered “Yes”, you have a problem: you’re overconfident that widely used Ruby gems work just because others use them. And this kind of self-validation is not going to let you understand why my work (and Hans’s, and Alex’s) is important.

Let’s take a step back again; one of the proclaimed Ruby strengths is always considered to be the fact that most developers tie long and complex tests in their code so that the code is behaving as expected when changing dependencies, implementation and operating systems. While this can build confidence on one’s code, the sheer presence of these tests is not going to be enough, as many times the developer is the only one running them, and he or she might not have many different operating systems to try it on (multiple implementations are somewhat solved by using RVM and similar projects). Of course it is true that Travis is making it more accessible to try your code on as many different configurations as possible, but it’s still no magic wand.

So there are tests in the libraries, and that’s good — tests are also supposed to be installed by RubyGems, and it should be possible to run them through the gem package manager itself (although I can’t find an option to do that right now), but sometimes you end up with one file missing here or there; in some cases it’s datafiles that the tests use (rather than the test files themselves), or it’s a configuration file that is required for them to run — lately, the nasty and ironic part is that it’s the Gemfile file that is missing from the .gem packaging, making it impossible to run the tests from there, especially when dependencies are only listed there.

An aside, for those of you who might want to find me at fault; I only own a single RubyGems package: Ruby-Elf and it does not come with tests. The reason is that the test data itself is quite big, and most of the Gem users wouldn’t care about the tests anyway (especially since I couldn’t find how to run them). For this reason I decided that the .gem package is only used for execution, and not for packaging. The tarball is always released and tagged so you should use that for packaging; which is exactly what dev-ruby/ruby-elf does.

Okay now you’re probably confused: what has it to do with JRuby that some gems lack a Gemfile (which is used by bundler)? Nothing really, I was just trying to give you an idea of how tricky it can be for us to make sure that the gems we package actually work. I’m now going back to Bundler.

So Bundler comes with its own set of tests — and most importantly it’s designed to be tested against different versions of RubyGems library which is very important for them since they hook deep into the RubyGems API — to the point one wonders if it wouldn’t make more sense to merge Bundler’s dependency rigging and RubyGems loading code in one library, and the bundle and gem commands on a package manager tool project. This is good although we might not want to test it with a number of RubyGems libraries but just the one we have installed. Different target, same code to leverage though.

Anyway, Bundler’s tests work fine in Gentoo for Ruby 1.8, 1.9 and Ruby Enterprise alike… but they can’t work with JRuby, and it’s not just a matter of Gentoo packaging. The problem is that Bundler’s Rakefile insists on running tests related to the manual and documentation as well. It’s not too bad, indeed the bundle documentation is installed as man pages, even though these man pages are not installed in the standard man directory, so while they are displayed on bundle --help they are not available to man bundle. I still find it nice: it’s not too different from what I’ve done with Ruby-Elf, where the tools’ help is only available as man pages (but the Gentoo packaging makes them available to man as well).

These man pages are generated through a tool that could probably interest me as well — the current man pages for Ruby-Elf are generated through DocBook, which does make them more well organized, but is still one hefty dependency. Anyway the tool is ronn which is a Ruby gem that builds man pages out of Markdown-written text files. The problem is that rather than leaving it possible for the man pages to be generated beforehand, the Rakefile wants to make sure ronn can be called by the current Ruby implementation.

Which should be easy: since ronn is written in Ruby, just make it possible to install it for JRuby. It would be easy indeed if not for the library it uses for Markdown parsing and output generation; it’s not the usual BlueCloth that we’re mostly used to, instead it uses rdiscount which is another (faster?) implementation of Markdown, which is based once again on a C native extension. Being a C extension, that is not available to JRuby, which means that you can’t use ronn with JRuby as it is now.
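As a rough sketch of the alternative, a Rakefile could check whether the generator can be loaded by the running interpreter and only wire up the man page work when it can. Everything here is hypothetical: the helper names and the task names are mine, not Bundler’s.

```ruby
# Hypothetical helper: true if the named library can be loaded by the
# Ruby implementation that is running this file.
def library_available?(name)
  require name
  true
rescue LoadError
  false
end

# Hypothetical task selection: only include the man page task when the
# generator is usable, so that the tests can still run without it.
def tasks_to_run(generator = 'ronn')
  library_available?(generator) ? [:man, :spec] : [:spec]
end
```

On an implementation without C extensions the require fails with a LoadError, and the sketch falls back to running only the tests against man pages generated beforehand.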

Since both BlueCloth and rdiscount are based on C extensions, one would wonder how you make it possible for JRuby to handle Markdown. The answer comes with another gem, which I have learnt about while bumping Radius (a support gem for Radiant): it is called kramdown and is a pure-Ruby, yet fast, implementation of Markdown. For basic usage it’s identical to BlueCloth, to the point that my monster switched this week from BlueCloth to kramdown with just a two-line change (one line in the Gemfile, the other in the code).

So I reported to Charles that Bundler can’t have its tests running with JRuby, and he obviously suggested that I poke the Bundler developers to see if they can help out to fix this up. Unfortunately the answer is something like you can patch it up but we won’t do the work for you — which is understandable to a point.

The problem with this is that this means nobody has ever tested Bundler on JRuby; if you’re using it there, then you’re assuming it works because everyone else is using it, but there might well be a completely unknown, nasty bug that is just waiting to eat one user’s work on a corner case.

If you do care about JRuby, one very nice thing you could do here is getting in touch with ronn’s upstream, and convert it to use kramdown; then it should be possible to have a fully functional Bundler, relying solely on JRuby. Until that’s done, Gentoo’s JRuby will lack Bundler, and that also means it’ll lack a long list of other software.

Ruby-Elf and collision detection improvements

While the main use of Ruby-Elf for me lately has been quite different – for instance with the advent of elfgrep or helping verifying LFS support – the original reason that brought me to write that parser was finding symbol collisions (that’s almost four years ago… wow!).

And symbol collisions are indeed still a problem, and as I wrote recently they don’t get very easy on the upstream developers’ eyes, as they are mostly an indication of possible aleatory problems in the future.

At any rate, the original script ran overnight, generated a huge database, and then required more time to produce a readable output, all of which happened using an unbearable amount of RAM. Between the ability to run it on a much more powerful box, and the work done to refine it, it can currently scan Yamato’s host system in … 12 minutes.

The latest set of changes that replaced the “one or two hours” execution time with the current “about ten minutes” (for the harvesting part, there are two more minutes required for the analysis) was part of my big rewrite of the script so that it used the same common class interfaces as the commands that are installed to be used with the gem as well. In this situation, albeit keeping the current single-threaded design (more on that in a moment), each file analysed consists of three calls to the PostgreSQL backend, rather than being something in the ballpark of 5 plus one per symbol, and this makes it quite a bit faster.

To achieve this I first of all limited the round-trips between Ruby and PostgreSQL when deciding whether a file (or a symbol) has been already added or not. In the previous iteration I was already optimising this a bit by using prepared statements (that seemed slightly faster than direct queries), but they didn’t allow me to embed the logic into them, so I had a number of select and insert statements depending on the results of those, which was bad not only because each selection would require converting data types twice (from PostgreSQL representation to C, then from that to Ruby), but also because it required to call into the database each time.

So I decided to bite the bullet and, even though I know it makes it a bunch of spaghetti code, I’ve moved part of the logic into PostgreSQL through stored procedures. Long live PL/pgSQL.

Also, to make it more solid in respect to parsing error on single object files, rather than queuing all the queries and then commit them in one big single transaction, I create single transactions to commit all the symbols of an object, as well as when creating the indexes. This allows me to skip over objects altogether if they are broken, without stopping the whole harvesting process.
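The per-object transaction idea can be sketched without a real database. FakeStore below is a stand-in I made up for the illustration (it is not the pg gem’s API): its transaction method only keeps the rows written inside the block if the block completes without raising, which is the behaviour that lets a broken object be skipped.

```ruby
# Stand-in for a database connection (hypothetical, for illustration only).
class FakeStore
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Rows are collected in a scratch list and only kept if the block
  # finishes without raising, mimicking commit and rollback.
  def transaction
    pending = []
    yield pending
    @rows.concat(pending)
  end
end

# Harvest every object in its own transaction; a parse error (simulated
# here by a nil symbol) discards that object only and moves on.
def harvest(store, objects)
  skipped = []
  objects.each do |path, symbols|
    begin
      store.transaction do |tx|
        symbols.each do |sym|
          raise ArgumentError, "unparsable symbol in #{path}" if sym.nil?
          tx << [path, sym]
        end
      end
    rescue ArgumentError
      skipped << path
    end
  end
  skipped
end
```

With one big transaction the first broken object would throw away everything harvested so far; here only the rows of that object are lost.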

Even after introducing the transaction on symbols harvesting, I found it much faster to run a single statement through PostgreSQL in a transaction, with all the symbols; since I cannot simply run a single INSERT INTO with multiple values (because I might hit a unique constraint, when the symbols are part of a “multiple implementations” object), at least I call the same stored procedure multiple times within the same statement. This had a tremendous effect, even though the database is accessed through Unix sockets!
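Calling the same stored procedure several times within one statement might look roughly like the sketch below, which only builds the SQL text. The procedure name add_symbol and its two arguments are assumptions made for the example, not the actual schema, and the quoting helper is deliberately naive: real code should rely on the escaping or parameter binding offered by the database driver.

```ruby
# Naive SQL literal quoting by doubling single quotes (illustration only).
def sql_literal(value)
  "'" + value.to_s.gsub("'", "''") + "'"
end

# Build one SELECT that invokes the (hypothetical) add_symbol procedure
# once per symbol, so the whole object costs a single round-trip.
# Returns nil when there is nothing to send.
def batched_symbol_statement(object_id, symbols)
  return nil if symbols.empty?

  calls = symbols.map do |name|
    "add_symbol(#{Integer(object_id)}, #{sql_literal(name)})"
  end
  "SELECT #{calls.join(', ')};"
end
```

For an object with two symbols this produces one statement with two procedure calls, instead of two separate queries with their own round-trips and type conversions.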

Since the harvest process now takes so little time to complete, compared to what it did before, I also dropped the split between harvest and analysis: analyse.rb is gone, merged into the harvest.rb script for which I have to write a man page, sooner or later, and get installed properly as an available tool rather than an external one.

Now, as I said before, this script is still single-threaded; on the other hand, all the other tools are “properly multithreaded”, in the sense that their code fires up a new Ruby thread per each file to analyse and the results are synchronised not to step on each other’s feet. You might know already that, at least for what concerns Ruby 1.8, threading is not really implemented and green threads are used instead, which means there is no real advantage in using them; that’s definitely true. On the other hand, on Ruby 1.9, even though the pure-Ruby nature of Ruby-Elf makes the GIL a main obstacle, threading would improve the situation by simply allowing threads to analyse more files while the pg backend gem would send the data over to PostgreSQL (which would probably also be helped by the “big” transactions sent right now). But what about the other tools that don’t use external extensions at all?

Well, threading elfgrep or cowstats is not really any advantage on the “usual” Ruby versions (MRI 1.8 and 1.9), but it provides a huge advantage when running them with JRuby: as that implementation has real threads, it can scan multiple files at once (both when using asynchronous listing of input files with the standard input stream, and when providing all of them in one single sweep), and then only synchronise to output the results. This of course makes it a bit more tricky to be sure that everything is being executed properly, but in general makes the tools all the sweeter. Too bad that I can’t use JRuby right now for harvest.rb, as the pg gem I’m using is not available for JRuby; I’d have to rewrite the code to use JDBC instead.
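The thread-per-file pattern with synchronised results is simple enough to sketch with the standard library alone. The method name scan_all is hypothetical and the block stands in for whatever analysis a tool performs; this is not the actual Ruby-Elf code.

```ruby
# Run the given block once per path, each in its own thread, and collect
# the results under a mutex so the threads don't step on each other.
def scan_all(paths, &scanner)
  results = {}
  lock = Mutex.new

  threads = paths.map do |path|
    Thread.new(path) do |p|
      value = scanner.call(p)                  # the per-file work runs unlocked
      lock.synchronize { results[p] = value }  # only the bookkeeping is serialised
    end
  end

  threads.each(&:join)
  results
end
```

The same code runs on every implementation; whether the per-file work actually happens in parallel depends on the interpreter’s threading model.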

Speaking about options passing, I’ve been removing some features I originally implemented; in the original implementation, the arguments parsing was asynchronous and incremental, without limits to recursion; this meant that you could provide a list of files preceded by the at-symbol as the standard input of the process, and each of that would be scanned for… the same content. This could have been bad already for the possible loops, but it also had a few more problems, among which there was the lack of a way to add a predefined list of targets if none was passed (which I needed for harvest.rb to behave more or less like before). I’ve since rewritten the targets’ parsing code to only work with a single-depth search, and relying on asynchronous arguments passing only through the standard input, which is only used when no arguments are given, either on command line or by default of the script. It’s also much faster this way.
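A single-depth version of that targets logic could look like the following sketch. The method name, the keyword arguments and the exact precedence are my assumptions based on the description above, not the code that ships with Ruby-Elf.

```ruby
# Expand the list of targets:
# - command-line arguments win, then the script's default targets;
# - standard input is only read when both are empty;
# - an "@file" argument is replaced by the lines of that file, and the
#   lines themselves are never expanded again (single depth, no loops).
def expand_targets(args, defaults: [], input: $stdin)
  args = defaults if args.empty?
  return input.each_line.map(&:chomp).reject(&:empty?) if args.empty?

  args.flat_map do |arg|
    if arg.start_with?('@')
      File.readlines(arg[1..-1]).map(&:chomp).reject(&:empty?)
    else
      [arg]
    end
  end
end
```

An entry such as “@nested” found inside a list file is returned as a plain target, which is what rules out recursive loops.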

For today I guess all these notes about Ruby-Elf would be enough; on the other hand, in the next days I hope to provide some more details about the information the script is providing me.. they aren’t exactly funny, and they aren’t exactly the kind of things you wanted to know about your system. But I guess this is a story for another day.

Rubygems… UNHACKED!

I wrote yesterday about the difficulty of removing the Rubygems hacks we’ve been using — well, today I got good news: I was able to mostly remove them. I say mostly because there are still a couple of things that need to be fixed, with the help of upstream (all my changes are available on my Rubygems fork in the gentoo branch):

  • there is no way in the original sources to change the location of the system configuration file; for Windows it’s found in the registry, for everyone else it’s /etc; I’ve had to change the sources to allow for overriding that with the operating system defaults;
  • there is one test that actually only works if there is no alternate default set installed, as it checks that the binary directory of the gems is the same as the one for Ruby; that is no longer the case for us;
  • JRuby … pretty much fails a whole bunch of tests; some are not really its fault, for instance it lacks mkmf/extconf since there are no C-compiled extensions; others are caused by different problems, such as tests’ ordering or the huge difference in handling of threading between the original implementations and JRuby;
  • I had to patch up a bit the Rakefile so that it can be used without Rubyforge support, which was a requirement for Ruby Enterprise (well at least for my system; the problem is that it still does not build with OpenSSL 1.0, so I have a non-OpenSSL Ruby Enterprise install… and that means that Rubyforge can’t load, and even so the Rubyforge plugin for Hoe);
  • documentation fails to build, not sure on why or how, but it does, when using the rake docs command; I’ll have to check out why and see if I can get it fixed.

But now back to the important good news: you can now safely use the gem command as root! Gems installed with sudo gem install foo will install in /usr/local rather than directly in /usr, no longer colliding or clashing with the Portage-installed packages (which are officially supported). Obviously, both are searched, but the local variant would take precedence, and the user’s home gems get even higher priority for search. This also means that if you don’t care about Gentoo support for the Ruby extensions, you can simply use gem and be done with it.

Now moving a bit back to not-too-good news and next steps. The not-too-good news is that since this dropped all the previously-present hacks, including a few needed to get gem from within Portage, this new version is not compatible with the old gems.eclass. Well, it’s relatively not good news, and partially good news; this is one of the final nails in that eclass’s coffin; we have now a very good reason to get rid of all the remaining packages; soon.

*Note: I have said before that we still lack a way to properly build bindings that are part of bigger packages; luckily none of those require gems.eclass, they rather use the old ruby.eclass which does not require rubygems at all. So it’s fine to deprecate and get rid of the uses of that right now even though we might have more work to do before ruby-ng takes over everything.*

Next steps for what concerns the rubygems package itself, if it was up to me, would be to drop the four gem18, gem19, gemee18 and jgem commands: all the ruby-fakegem.eclass binary wrappers right now install a single script, which by default uses the currently-configured Ruby interpreter, and you simply have to use ruby19 -S command to start it with a different interpreter. But gem itself is not treated this way and we rather have four copies of it with different names, and different shebangs, which sounds like a waste. To change this, among other things, we need to change the eselect-ruby module (which I sincerely would avoid touching if possible).

Further step: supporting multiple Ruby implementations with the current system is mostly easy… but we have no way to change the current interpreter on a per-user or per-session basis; this is something that, as far as I can tell, we could actually take out of the Python book (or maybe the Java book, both have something like that, but the Python failures actually teach us one very important thing: we cannot use a script, we have to use a binary to do that job, even if it’s a very stupid one, as older Linux and other non-Linux systems will fail if you chain interpreters). Again, an eselect-ruby task… I would probably wait for Alex, unless there is enough interest for me to work on it.

Then let’s see, we’ve got to fully migrate out of the old-style virtual/ruby dependencies into =dev-lang/ruby-1.8* so that we can actually implement a new-style virtual that proxies the ssl USE flag, and then start using that; we’ve got to implement a general way to handle one-out-of-multiple target handling for packages that use Ruby as an embedded interpreter rather than build libraries for it; then there is again the problem of finding a way to build bindings within larger packages (or move some of them to build the Ruby extensions by themselves — I remember obexftp could use something like that), there is the fake-specification-generation to be fixed so that it works with bundler, and there are some (corner?) cases where Ruby 1.9 complain about missing fields in our generated files. So as you can see there is a long way still, but we’re going up there, step by step.
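To give an idea of what a generated specification has to contain, here is a minimal sketch that builds one with the RubyGems library and serialises it. This is only an illustration: the eclass does not necessarily generate its files this way, the helper name fake_gemspec is made up, and the placeholder values for summary and authors are assumptions about which fields are worth filling in.

```ruby
require 'rubygems'

# Hypothetical helper: build a minimal specification for a package that
# was installed outside of the gem command, and return it as Ruby source.
def fake_gemspec(name, version, summary = "#{name} (system package)")
  spec = Gem::Specification.new do |s|
    s.name    = name
    s.version = version
    s.summary = summary          # placeholder text, not upstream's summary
    s.authors = ['unknown']      # placeholder, filled in to avoid an empty field
    s.files   = []
  end
  spec.to_ruby
end
```

The returned text is what would be written into the specifications directory, so that the library lookup finds the package as if it had been installed as a gem.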

And before I leave, I’d like to thank Zeno Davatz for funding the un-hacking of rubygems — if that wasn’t the case, I’d probably have had to post this next week, or month. And if there are other people wanting Ruby 1.9 available faster (or improved 1.8 support), I’m still available to be hired, you can contact me and provide me with an idea of what you’re looking for. I’m available both to implement particular feature pieces, porting of the web of dependencies of given gems, or even just on a retainer basis so that you can have “priority” support for what concerns the general extension packaging in Gentoo.

Hacking is easy…

un-hacking definitely not.

Since both Zeno and Jeremy reported a few things about rubygems lately, I ended up deciding to take a look at the rubygems code and see if it was possible to un-hack it so that, from one side, it follows upstream more closely, and from the other, it actually works in Gentoo without risks.

Now, Gentoo currently has a number of hacks over Rubygems (the library, and the package manager) for two main reasons: supporting the Portage-based install of gems in the old manner (as in calling gem from within the eclass), and supporting multiple Ruby implementations to be installed at the same time.

For what concerns the first case, thanks to RubyNG we no longer need the hacks at all, and so we can drop them (and fix the dependencies so that the new, unhacked version is not brought in with any of the old versions).

And luckily for me, even the multi-Ruby support is now getting reduced; in particular now 1.8 and 1.9 (from 1.9.2_rc2 onward) install in the same sub-tree (/usr/lib/ruby) rather than in two separate ones. Less fortunately, Ruby Enterprise still installs in a quite different tree (/usr/lib/rubyee) so it still requires changes.

More importantly, now the installation to /usr is handled through Portage, so Rubygems would have to install on a different tree, such as /usr/local; to do that, it requires some changes around the paths and.. doesn’t that sound like hacking and patching? Well, I have to say this about the current Rubygems developers: they thought about it already!

So indeed, we can install our own default configuration replacement that can be used to set up default search paths and so on so forth. Good! Almost perfect, almost because it still has one /etc hit hardcoded that we’d have to change for prefix support, but I have a patch to be sent upstream that should be acceptable to them as well as usable for us. By using this, we can also avoid two implementation-dependent patches… and I have to tell you this: implementation-dependent patches suck, and will suck even more when we’ll have filesystems capable of runtime data de-duplication — and for archives.
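For the curious, the hook in question is a file named rubygems/defaults/operating_system.rb, which RubyGems loads if it finds it on the load path. A sketch of what such a file could contain follows; the directory layout is an assumption made for the example and not necessarily what the Gentoo package installs.

```ruby
require 'rubygems'
require 'rbconfig'

# Sketch of a rubygems/defaults/operating_system.rb (paths are assumed).
module Gem
  class << self
    # Where the gem command installs by default: a tree separate from
    # the one the system package manager writes to.
    def default_dir
      File.join('/usr/local/lib/ruby/gems', RbConfig::CONFIG['ruby_version'])
    end

    # Where gems are searched for, highest priority first: the user's
    # home, then the local tree, then the system tree.
    def default_path
      [user_dir,
       default_dir,
       File.join('/usr/lib/ruby/gems', RbConfig::CONFIG['ruby_version'])]
    end
  end
end
```

Since the overrides live in a single file that is the same for every interpreter, no implementation-dependent patch to the library itself is needed for this part.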

For the next part of the complexity issues, you have the current install phase, that actually relies on the Rubygems’s own install code, which then requires a number of directories to be created beforehand and so on, so forth. The solution now is simply to hand down the installation to ruby-ng.eclass which simplifies it a lot.

But since I’m doing radical changes to the Rubygems installation we had before, I thought it would have been a good idea to do something that up to now has been ignored: adding the tests; Rubygems is said to need itself to actually work… in truth what it needs is Rake… and a few more gems that do require it to be installed. Oh well, once again we can do that nowadays, so I wired up the tests; beside one test (the one checking for bindir) being totally bogus, as it fails as soon as you rewire that path, the tests worked not-too-fine-but-well-enough on Ruby 1.8, and 1.9. JRuby was another story altogether.

The problem with all the implementations’ tests is that they are actually incompatible with a few gems, such as YaRD (because it tries to replace RDoc, and Rubygems tests RDoc) and Test::Unit 2 (that’s actually very very common). Given that I actually care about running these tests, the solution has been to simply… block on the presence of those two. This is something that we’ll be doing on more ebuilds until we can work out a solution; most people wouldn’t have Test::Unit 2 merged for development (my suggestion on how to write tests? Use RSpec! Or Tryouts 2… both have the same implementation on both Ruby 1.8 and 1.9, while minitest and test-unit both have different, not-entirely-compatible, implementations), so it’s acceptable.

For what concerns JRuby in particular, though, the problem runs much deeper; so many tests fail that it’s not even funny, but it seems like the same is true for the vanilla version; at least two of the failures I found are related to the code that JRuby injects (with the same support that allows us to load Gentoo-specific configuration), which changes the original behaviour of the library, possibly trying to improve it, but as it is, most likely breaking the intended behaviour. Having to back-patch this is definitely a bit tricky.

But the unhacking does not stop here, since I had to install Ruby-Enterprise, and it does not build with SSL enabled, I ended up noting that there are a number of packages that require the interpreter to be built with SSL support, and right now we have no way to express that. I guess we’ll get a new-style virtual/ruby this week.

Alas.

Ruby-NG: Too Frequently Asked Questions

Okay, I’d sincerely like to stop blogging about Ruby and actually work on making it better, but it seems like one particular user is trying to start up fires all around the Ruby Team to push for his agenda… without seeming to care at all about the results. So here comes a list of common questions and mostly-official answers.

I forgot one of the most important questions! Added afterwards!

Why is Ruby 1.9 masked? It’s stable upstream!!!! The exclamation marks are there because that’s how we receive them oftentimes. Sure, Ruby upstream seems to declare that Ruby 1.9 branch is “stable”, by their standards already, which I’ll have to tell you all, are not very solid. To be precise, they mark 1.9.1 stable, while 1.9.2 is still “development” and is currently a Release Candidate. While the code for 1.9 sounds more solid, the changes between 1.9.1 and 1.9.2 are not trivial, and wouldn’t have been considered for most other projects worth of “minor versions”. To make a comparison, they really look like the difference that there would be between GCC 4.5 and GCC 4.6, while the changes between 1.8.7 and 1.9.1 are like those between GCC 4 and GCC 5.

What we’re actually targeting for unmasking at some point in the hopefully not-so-distant future, is 1.9.2, not 1.9.1, so no, our target is not considered stable upstream yet.

Just unmask 1.9.1 then! Unfortunately, that’s not feasible. The reasons are long and are (partly) on our side: we’ve not had a proper way to handle extensions’ packaging until I created Ruby-NG in May 2009 – all the previous ebuilds used various series of hacks and dirty tricks to allow side-by-side installation of Ruby packages, but none actually worked properly. Even worse for the gem-based packages. From May 2009, we’ve been refining techniques, eclasses and porting packages, but the work is still not complete. There will be much more to deal with before unmasking. And that means that our most likely candidate is going to be 1.9.2 rather than 1.9.1.

This does not mean that we’ve not tried targeting 1.9.1; I actually did a lot of porting to 1.9.1… and now that we’ve turned to look at 1.9.2 I had to do further porting.. wasting twice the time to identify and fix issues. Trust me it’s not funny.

Why doing the work? Leave it to Rubygems! Haha, very funny. I’ve written a long time ago about the reasons why we don’t consider the Rubygems package manager fit for production use. Even though a number of people find it just perfect, we’ve got our reservations, and feel like Gentoo should aim to provide higher-quality packaging to our users, since we actually can.

In particular you can manage a remote system with binary packages without having to keep around the development tools thanks to our packaging, as you’d be building the binary gems together with normal Gentoo packages, and abiding to the settings you provide to Portage.

Of course you can use the standard gem command, but if you do so, particularly as root, you “void the warranty” and the Gentoo Ruby team won’t respond for any problem you might experience.

Why do I see so many tests failing? I didn’t see them at all before! You broke something! Not really; by default gems don’t execute tests; sure there is an option in the Rubygems package manager that provides support for running tests, but the number of gems that actually specify their test dependencies, and the way to run the tests, is risible. And with our previous packaging system, the tests simply weren’t executed. So all the failures you might experience with the new ebuilds based on Ruby-NG, are for the vast part coming directly from upstream.

To be fair to most upstreams, some tests are either designed to work on older Ruby versions, on other operating systems (such as Mac OS X that is case-insensitive — bundler had a problem related to that!), or with a particular series of installed gems. Which means that they might not have seen the failure at all. And some failures are actually coming from our environment; in particular lately the tinderbox is reporting me a spree of failures happening when test-unit 2.1.1 is installed for Ruby 1.8 (and would for Ruby EE if I was testing that).

We’re doing our best to actually fix the test failures, sending them upstream, as you can see from my own GitHub page that has forks for a number of Ruby projects for which I sent one or two patches to fix tests. This, though, is often a non-trivial yet boring work and we’re just four guys trying hard to get results.

Why is package #{foo} not available for 1.9? Upstream says it works! This usually has Rails (2.3) in place of #{foo}… yes upstream sometimes says their package works with 1.9, and most times they are right, but some other times, they are definitely not. Rails is a good example of the latter. It “works” in the sense that you can use it for some of its features, it does not work as in “everything is fine and smooth”. Even its own tests, for what concerns Rails, are totally bogus on 1.9. It gets worse for binary extensions (bindings).

There is a forked gem/patch to get #{bar} binary extension to build with 1.9, why do you insist it’s not working on 1.9? This gets fun, and I have actually written explicitly about it — sometimes the patched versions do exactly what it’s written on the box: they build on 1.9, but that says nothing about them working at all. Since a lot of symbols (functions and macros) have gone from 1.9, even fixing the extension to build might still entail using symbols that are no longer available, through implicit declarations. The original Ruby 1.8 and 1.9 build systems leave that be, and depending on the settings of the system this will lead to either non-loadable extensions, or runtime suicide of the Ruby process, because of missing symbols.

Personally, I don’t like having timebombs in my packages, so under the assumption that I’d rather have a build-time failure – or no package at all – than one that can kill my software stack at runtime, I’ve made our Ruby packages inject -Wl,--no-undefined so that similar problems are caught at build-time already. At least two packages that were patched for Ruby 1.9 support in the overlay before failed this very test and were thrown out of the Ruby 1.9 compatibility.

Why can’t I merge the packages directly if I have FEATURES=test (or USE=doc) enabled? Since we’re actually running tests, we’ve got to depend on the packages that are used by the tests themselves; to be on the safe side, when there are optional tests we try to push all of them to be run during the test phase. Unfortunately because of the nature of the gems themselves this way too often leads to circular dependencies; this is why I call it a dependency web rather than a dependency tree, nowadays.

When merging Ruby 1.9, it’s telling me that Rake and Rubygems are being dropped, why is that? Ruby 1.9 comes with a bundled copy of Rake and Rubygems, but those are quite older, and a number of packages required much newer versions of either, or both. We could allow for an override for those, but it would be much harder than simply removing them and adding the dependencies over the normal Rake and Rubygems that install for 1.9 as well. Actually, the same destiny is there for JRuby that bundles even more packages.

Portage is reporting that the package has been built without respecting LDFLAGS, why’s that? Well, while I’m the first person who’d want packages to always respect flags, Ruby packages are very tricky. Sometimes we can rewrite CFLAGS/LDFLAGS directly on the emake command line, but there are many cases where that is not possible, so the actual LDFLAGS used for rebuilding Ruby itself are used. This is one of the things where we could use some help, or some more dedicated time.

What about JRuby? You started so well on it and are now slowing down, why? Mostly because a number of issues upstream required newer JRuby versions altogether; then they fractured support in an even higher number of further packages with mixed Java and Ruby code. I’m no Java expert to begin with, and keeping up with the development is not something I have very much time to tackle. The other members of the team seem highly uninterested in JRuby so far, thus you get what there is time for.

The package #{fnord} is installing itself as a gem, but it’s using a GitHub tarball rather than the gem file, why? Once again it has to do with the fact that gems are not required to provide or execute tests; while a few gems feel safe enough to let users run their tests, there are many that simply do not provide tests with their gem files. To solve this, we go around it, download the code directly from GitHub, and use that for the packaging. Thankfully, GitHub got their shit together and it is now feasible for us to keep using their download service. That is, if the upstream developers actually bother tagging their releases. Sigh!

How can I help to get Ruby 1.9 support in? Again there is a dedicated blog post from me that can be summarized as “we need more testing of the packages, reporting bugs upstream where it makes sense, and, where it makes sense, more tests.”

There are a number of issues that need to be tackled, you can find most on Bugzilla — simply open the page and look through the bugs. Make sure that the test failures get fixed upstream, and not simply blacklisted, if possible.

I need Ruby 1.9 available right now as I’m using it in production! Tough luck, because “right now” is definitely not going to happen. Soon, maybe; right now, no. As I said there are a number of things to iron out; on the other hand, you can either help as stated before, or find someone who can be paid to help. I’m up for hire, and I’m repeating this because I’m growing tired of the few people who actually make a living out of this expecting the four of us to abide by their agenda.

I’m using Funtoo and… Stop there! The Gentoo Ruby team does not and will never support Funtoo users. Daniel decided to go on with Ruby 1.9 at a point where neither the upstream projects nor the eclasses were designed to. Whatever he did after that, we have no intention to sort out.

More questions?

Ruby 1.9 vs Python 3

In my previous post where I declared myself up for hiring by those who really really want Ruby 1.9 sooner than we’re currently planning to release it, I’ve said that the Ruby team doesn’t want to “Pull a Python 3”. I guess that I should explain a bit what I meant just there.

Ruby 1.9 and Python 3 are, conceptually, actually similar: while Python 3 makes a much wider change in syntax as well as behaviour, both require explicit, often non-trivial, porting of the software to work. Thus, they both need to be slotted, installed side-by-side with the older, more commonly used alternative, and so do the libraries and programs.

There is more similitude between the way the two are handled than you’d expect, mostly because the Python support for that has been partly copied out of Ruby NG stripped of a few features. These features are, for the most part, what I’d say protect us from pulling a Python 3.

As it is, installation of Python 3-powered packages is done once Python 3 is installed; and Python 3 is installed, unless explicitly masked, on every system, stable or not, because of the way Portage resolves dependencies. In my case, I don’t care about having it around, so it’s masked on all my systems (minus the tinderbox, for obvious reasons). You cannot decide whether a given package is installed for 2.6, 2.7 or 3.1, and you can only safely keep around one Python of the 2.x series, as packages will only install for that one, which is going to be fun, because 2.7 seems to break so many things.

Ruby packages are instead coordinated through the RUBY_TARGETS variable, which allows us (and you) to choose for which implementations (if supported) a given package gets installed; you can even tweak it package-per-package via package.use! This actually makes the maintenance burden quite a bit higher on our side, because we have to make sure that the whole dependency tree is up-to-date with a given target; on the other hand it allows us to be sure that the packages are available, and it would scream at us if they weren’t (or rather Mr Bones would).

Most importantly, we don’t need no stinkin’ script like python-updater to add or remove an implementation; since the implementations are user-chosen via a USE-expanded variable (RUBY_TARGETS as I said), what you otherwise do with python-updater (or even perl-cleaner) is done through… emerge -avuDN world.

Even so, I’ll admit, there is one thing that at least python-updater seems to take into consideration and that for now we can’t cater for: packages that use the Ruby interpreter itself rather than binding a library to be usable via Ruby; as I said in the post I linked above, it’s one of the few cases that still needs to be ironed out before Ruby 1.9 can be unmasked. Again you can either wait or hire somebody to do the dirty job for you.

A note about the “stinkin’ script” notion: one of the reasons why I dislike the python-updater approach is that it lists a few “manual” packages to be rebuilt. The reason for that is the old Python bug that caused packages to link the Python interpreter statically. The problem has since been fixed, but the list (which is very limited compared to what the tinderbox found at the time) is still present.

It is not all. I said at the start that right now Python 3 is installed unconditionally by default on all systems; we’re going to do double- and triple-work to make sure that the same won’t happen with Ruby 1.9 until we’re ready to switch the defaults. Switching the defaults will likely take a much longer time; we’re going to make 1.9 stable first, and start stabling packages supporting that… from there on, we’d be considering removing packages that are 1.8-only.

Well, to be honest, we’re going to consider switching some packages that won’t work with 1.9 (or JRuby) and that neither have any use nor are maintained upstream. For good or bad, a lot of the packages in the tree have been added by the previous team members, and they, like us, often did so when they had a personal interest in the package… those packages are oftentimes no longer maintained and are dead in the water, but we still carry them around.

Anyway, once again, the road is still bumpy, but it’s not impossible; I’m not sure if we can get to unmasking it before end of the summer as I was hoping to, but we’re definitely on track to provide a good user experience for Gentoo users who develop in Ruby, and most of the time, we can even provide a better upstream experience.

Derailing

I’m still not sure why I work with Rails: whether it’s really for the quality of some projects I found, such as Typo, or because I hate most of the alternatives even more (PHP, Python/Turbogears/ToscaWidgets), or because I’m a masochist.

After we all seemed to settle in with Rails 2.3.5 as the final iteration of the Rails 2 series, and were ready to face a huge absurd mess with Rails 3 once released, the project decided to drop a further bombshell on us in the form of Rails 2.3.6, and .7, and finally .8 that seemed to work more or less correctly. These updates weren’t simple bugfixes, because they actually went much further: they changed the supported version of Rack from 1.0.1 to 1.1.0 (it changes basically the whole internal engine of the framework!), and the versions of tmail and of i18n. They also changed the tzinfo version, but that’s almost pointless when considered in Gentoo since we actually use it unbundled and keep it up-to-date when new releases are made.

But most likely the biggest trouble with the new Rails version is implementing an anti-XSS interface compatible with Rails 3; this caused quite a stir because almost all Rails applications needed to be adapted for that to work. Typo for instance is still not compatible with 2.3.8, as far as I know! Rails-extensions and other Rails-tied libraries also had to be updated, and when we’ve been lucky enough, upstream kept them compatible with both the old and the new interfaces.

At any rate, my current job requires me to work with Rails and with some Rails extensions; and since I’m the kind of person who steps up to do something more even though it’s not paid for, I made sure I had ebuilds, that I could run tests with, for all of them. This actually turned out to be useful more than once, and as it happens, today was another of those days when I’m glad I’m developing Gentoo.

The first problem appeared when it came time to update to the new (minor) version of oauth2 (required to implement proper Facebook-connected login with their changes last April); for ease of use with their Javascript framework, the new interface uses almost exclusively the JSON format; and in Ruby, there is no shortage of JSON interpreters. Indeed, beside the original implementation in ActiveSupport (part of Rails) there is the JSON gem, which provides both a “pure ruby” implementation and a compiled implementation (to be honest, the compiled, C-based implementation was left broken in Gentoo for a while by myself; I’m sorry about that, and as soon as I noticed I corrected it); then there is the one I already discussed briefly together with the problems related to Rails 2.3.8: yajl-ruby. Three is better than one, no? I beg to differ!

To make it feasible to choose between the different implementations as wanted, the oauth2 developers created a new gem, multi_json, that allows switching between the different implementations. No, it doesn’t even try to give a single compatible exception interface, so don’t ask, please. It was time to package the new gem then, but that caused a bit of trouble by itself: beside the (unfortunately usual) trouble with the Rakefile demanding RSpec to even just declare the targets (so also to build documentation) or the spec target having a dependency over the Jeweler-provided check_dependencies, the testsuite failed on both Ruby 1.9 and JRuby quite soon; the problem? It forced testing with all the supported JSON engines; but Ruby 1.9 lacks ActiveSupport (no, Rails 2.3.8 does not work with Ruby 1.9, stop asking), and JRuby lacks yajl-ruby since that’s a C-based extension. A few changes later, the testsuite reports pending tests when the tested engine is not found, as it should have from the start. But a further problem appears in the form of oauth2 test failures: the JSON C-based extension gets identified and loaded but the wrong constant is used to load the engine, which results in multi_json crapping on itself. D’oh!
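To make the engine probing a bit more concrete, here is a minimal sketch of the approach, under my own assumptions: the method names, the labels and the preference order are invented for the example and are not multi_json's real interface. It tries each library in turn and treats a LoadError as "not installed for this implementation" rather than as a failure.

```ruby
# Candidate engines as [library to require, label], in order of preference.
# The labels are illustrative only; they are not multi_json's real constants.
ENGINES = [
  ['yajl', :yajl],
  ['json', :json_gem],
  ['active_support', :active_support]
].freeze

# Return the label of the first engine whose library can be loaded,
# or nil when none of them is installed.
def first_available_engine(candidates = ENGINES)
  candidates.each do |library, label|
    begin
      require library
      return label
    rescue LoadError
      next # not installed for this implementation: try the next one
    end
  end
  nil
end

# A test helper can use the same probe to mark an engine's tests as
# pending instead of failing the whole run.
def engine_status(library)
  require library
  :available
rescue LoadError
  :pending
end
```

With something like this, a run on JRuby would report the yajl tests as pending and carry on with the engines that can actually be loaded.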

This was already reported on the GitHub page; on the other hand I resolved to fix it in a different, possibly more complete, way. Unfortunately I didn’t get it entirely right, because the order in which the engines are tested is definitely important to upstream (and yes, it seems to be more of a popularity contest than an actual technical contest, but never mind that for now). With that fixed, another problem with oauth2 appeared, and that turned out to be caused by the ActiveSupport JSON parser; while the other parsers seem to validate the content they are provided with, AS follows the “Garbage-in, Garbage-out” idea: it does not raise any exception if the content given is not JSON. This wouldn’t be so bad if it weren’t that the oauth2 code actually relied on such an exception to choose between JSON and Form-Encoded parameters. D’oh! One more fix.
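For the sake of illustration, this is roughly what "choose between JSON and Form-Encoded" looks like when the parser does validate its input. It is a sketch using the standard json and uri libraries, not the oauth2 code; the method names are hypothetical.

```ruby
require 'json'
require 'uri'

# Decode "a=1&b=2" style parameters into a Hash.
def form_decode(body)
  Hash[URI.decode_www_form(body)]
end

# Parse a response body as JSON when it is a JSON object, otherwise
# fall back to form-encoded parameters. This only works if the JSON
# parser raises on garbage instead of silently accepting it.
def parse_token_response(body)
  parsed = JSON.parse(body)
  return parsed if parsed.is_a?(Hash)
  form_decode(body)
rescue JSON::ParserError
  form_decode(body)
end
```

A parser that never raises makes the rescue branch unreachable, which is exactly the failure mode described above.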

Speaking about this, Headius, if you’re reading me you should consider adding a pure-Java JSON parser to multi_json just for the sake of pissing off the MRI/REE guys… and to provide a counter-test of a not-available engine.

At least, the problems between multi_json and oauth2 had the decency to happen only when non-standard setups were used (Gentoo is slightly non-standard because of auto_gem), so it’s not entirely upstream’s fault to not have noticed them. Besides, kudos to Michael (upstream) who already released the new gems while I was writing this!

Another gem that I should be using and was in need of a bump was facebooker (it’s in the ruby-overlay rather than in main tree); this one was already bumped recently for compatibility with Rails 2.3.8, but keeping itself compatible with the older release, luckily. Fixing the testsuite for the 1.0.70 version has been easy: it was forcing Rails 2.3.8 in the lack of multi_rails, but I wanted it to work with 2.3.5 as well, so I dropped that forcing. With the new release (1.0.71) the problem became worse because the tests started to fail. I was afraid the problem was in dropped 2.3.5 compatibility but that wasn’t the case.

First of all, I properly worked out the problem of forcing the 2.3.8 version of Rails by making it abide by RAILS_VERSION during testing; this allows me not to edit the ebuild for each new version, as I just have to tinker with the environment variables to get them right. Then I proceeded with a (not too short) debugging session, which finally catapulted me to a change in the latest version. Since one function call now fails with Rack 1.1.0 for non-POST requests, the code was changed to ignore the other requests; the testsuite on the other hand tested with both POST and GET requests (which makes me assume that it should work in both cases). The funny part was that the (now failing) function above was only used to provide a parameter to another method (which then checked some further parameters)… but that particular argument was no longer used. So my fix was to remove the introduced exclusion for non-POST, remove the function call, and finally remove the argument. Voilà, the testsuite now passes all green.
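The RAILS_VERSION trick boils down to something like the following sketch. RAILS_VERSION is the variable named above; the helper name and the "any version" fallback are my own assumptions, not the code that went into facebooker.

```ruby
# Work out which Rails version requirement the test helper should activate.
# With RAILS_VERSION unset (or empty) any installed version is accepted.
def rails_requirement(env = ENV)
  version = env['RAILS_VERSION'].to_s.strip
  version.empty? ? '>= 0' : "= #{version}"
end

# In a test helper this would be used roughly as:
#   gem 'rails', rails_requirement
```

The ebuild can then export RAILS_VERSION=2.3.5 or RAILS_VERSION=2.3.8 before running the tests, without touching the sources again.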

All the fixes are in my GitHub page and the various upstream developers have been notified about them; hopefully new releases of all three will soon provide the same fixes to those of you who don’t use Gentoo for installing the gems (like, say, if you’re using Heroku to host your application). On the other hand, if you’re a Gentoo user and want to gloat about it, you can tell Rails developers working on OS X that if the Ruby libraries aren’t totally broken, it is also because we thoroughly test them, more than upstream does!

Always a better Ruby

To make Gentoo a much better platform for Ruby development, I’ve started working last year on the Ruby-NG eclasses which provide a way to install Ruby extensions for multiple Ruby implementations in parallel (leaving to the user the choice for what to install them — unlike Python). I say “eclasses” because one is general and another is used to install RubyGems-based packages, with “fake” specifications that sidestep problematic dependencies and other similar issues.

Now, when I started implementing this, my idea was to add support for Ruby 1.9 and JRuby (both of which were missing before), but the result was suitable for Ruby Enterprise as well, which Alex has been working on lately. The end result is that, for standalone Ruby extensions, the eclasses were well received, and more than half the tree now uses the new eclasses:

Update (2016-04-29): This used to include live graphs, but these graphs are now lost, I’m sorry.

What we haven’t yet experimented too much with is using the new eclasses to support bindings that are part of bigger packages, like obexftp which is still broken. I guess this is another reason why you should split foreign language bindings rather than keep them monolithically inside the single package. I was talking about this with Hans today and I think this is one of the things we should work on soon, if we want to deprecate the old eclasses.

As it is, just a handful of simple Ruby extensions are missing to be migrated before we can “safely” unmask Ruby 1.9 (I say “safely” because I expect the tinderbox to go crazy once Ruby 1.9 is unmasked and selected, but that’s beside the point now).

Now, back to Ruby 1.8 and Enterprise. Since I had to fix the two of them for BerkDB 5.0, I decided to backport the patch I made for Ruby 1.9 to enable --no-undefined when linking extensions. Interestingly enough, this showed up the problem (already fixed by Alex) with the OpenSSL bindings in the upstream package. Remind you of something?

Enabling the --no-undefined flag on all the Ruby versions available, means that we can be sure that the extensions built will work as intended on all of them, and that a patch from one version won’t break it on another. Well, it does not give us 100% safety, but it at least increases it. Without this change, adding a call to a newly-introduced function could produce a non-working extension without warning, but for an abort at runtime.

Unfortunately this does not happen without consequences and false positives; ruby-gstreamer is an example of this: it fails because of the undefined symbols in the extension; the extension is not broken (but the ebuild is), it simply needs another extension to provide those symbols before it is loaded. I think this is a very rare situation and I’d rather deal with it on a case-by-case basis than leave all the undefined references as “fine”. As for the ebuild being broken: the problem is that the extension needs ruby-glib at runtime and we currently don’t depend on it at all.

The next steps are obviously to run the tinderbox with all the Ruby implementations enabled and see how it works out, so that maybe we can improve the lines on this graph:

Update (2016-04-29): This used to include live graphs, but these graphs are now lost, I’m sorry.

To improve this situation, I tried to solve the test-unit problem. Ruby 1.9 ships with a reduced test-unit implementation (which is what is also available as minitest in Gentoo, for 1.8, JRuby and EE); since most testsuites need the full-blown test-unit interface, there is a test-unit gem to provide it for Ruby 1.9. It’s not entirely API compatible, but it comes very near to that. After this, another implementation was created, test-unit-2, which is even less API compatible but provides enhanced features, and works on (almost) all implementations – it fails on JRuby, maybe because of a JRuby bug.

Unfortunately, auto-gem loading causes test-unit-2 to be loaded on all the implementations, if installed, which is the reason why we’re keeping it masked. I still haven’t found a proper solution to deal with this; the best choice I can see now is to just depend on the 1.x series of test-unit (only available for Ruby 1.9) by default; depend on test-unit-2 if the package needs it; and block test-unit-2 if the package fails tests with it installed. This should cover most of the needs of our users.

Finally, a request if somebody feels like playing a bit around with Unix commands, to improve the way we currently install the Ruby-NG based ebuilds. Since we install for up to four targets at the same time, most of the time we install multiple copies of the same files. They can easily become a problem. While I know there is a (very incomplete) work for btrfs to support live data de-duplication, it would be very nice if we could, at some point, reduce the waste due to this, without relying on the filesystem.

I’m afraid I have no knowledge on how to do that, but if we could just run some pass of software after the install is complete (we can easily hook stuff like that up in the eclass) we could then use hardlinks between the files that are identical rather than having to install them multiple times.
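To give an idea of what such a pass could look like, here is a minimal sketch in Ruby. It assumes that all the copies live under one directory tree on a single filesystem (hard links cannot cross filesystems); the method name is made up and nothing like this exists in the eclass yet.

```ruby
require 'digest'
require 'find'

# Replace byte-identical regular files under root with hard links to the
# first copy found. Returns the number of files that were re-linked.
def hardlink_duplicates(root)
  seen = {}
  linked = 0
  Find.find(root) do |path|
    stat = File.lstat(path)
    next unless stat.file? # skip directories, symlinks, devices...

    # Size and mode are part of the key, since hard links share metadata.
    key = [stat.size, stat.mode, Digest::SHA256.file(path).hexdigest]
    original = seen[key]
    if original.nil?
      seen[key] = path
    elsif File.lstat(original).ino != stat.ino
      # Link under a temporary name first, then rename over the duplicate,
      # so the path never disappears if something fails halfway.
      tmp = "#{path}.dedup-tmp"
      File.link(original, tmp)
      File.rename(tmp, path)
      linked += 1
    end
  end
  linked
end
```

Running it a second time on the same tree should find nothing left to link, since the duplicates already share an inode with the first copy.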

Anyway, this is enough for now, news will follow, and please let us know if an extension that “worked” before now fails to build for undefined symbols… we’ll have to deal with them, one way or the other!

Ruby-NG: The Ruby’s eye (or, sometimes it’s a positive day)

I have ranted and ranted and ranted about Ruby packages not being good enough for packaging, I also have ranted about upstream developers not even getting their own testsuite cleared up, or being difficult to work with. I have complained about GitHub because of the way it allows to “fork” packages too easily. I’m not going to retract those notes, but… sometimes things do turn out pretty well.

In the past days I’ve been working toward adding Rudy in tree, as I don’t want to keep on building my own slow, hard, and boring scripts to deal with EC2, and I’m spending more time understanding how to get EC2 working than writing the code I’m paid to write. As I wrote before, this is another of those compound projects that is split in a high number of small projects (some literally one source file per gem!). It worried me to begin with, but on the other hand, the result is altogether not bad.

Not only were the first fixes I had to apply to amazon-ec2 applied, and a new version released, the very night I sent them upstream (and added them to Gentoo, now gone already), but Delano (author of Rudy – and thus of lots of its dependencies) also quickly applied most of my changes to get rid of the mandatory requirement for hanna, even on some packages I hadn’t sent them for yet, and released them again. Of course the job is far from finished, as I haven’t reached Rudy itself yet, but the outcome starts to look much nicer.

I also have good words for GitHub right now, since it makes it very easy and quick to take the code from another project, patch it up and send it to the original author to be merged (and hopefully re-released). This also works fine with patches coming from other contributors, like Thomas Enebo from JRuby who sent me a fix (or “workaround” if you prefer, but it’s still a way to achieve the wanted result in a compatible way) to make newer matchy work properly with JRuby. On the whole, I have to say I’m getting quite positive about GitHub, but I’d very much like it if they allowed me to reply to the messages I receive by mail, rather than having to log in on the system. I positively hate multiple mail systems, Facebook’s as well as GitHub’s, as well as most forums’.

And for the shameless plug and trivia time, I have more repositories in my GitHub page than items in my wishlist…

Anyway, back to work now!

Ruby-NG: Bin Man (or, the binwrapper problems)

One of the problems that we definitely need to hash out before we start marking as stable the ebuilds based on the new Ruby eclasses is the handling of the current “binwrappers”. I dislike the name sincerely — while they are obviously in the bin directory for the gems, they are definitely not binaries, but rather executable scripts. Sigh, let it be for now though.

RubyGems already creates a wrapper by itself, so that it calls the correct (latest) binary for a given gem. On the other hand we don’t use that wrapper, but a different one that can be generated by the ebuild with much more stable targets. The end result is generally pleasing, as we can use the same wrapper for any implementation the gem is installed for. But here starts the trouble.

Right now, this all works fine only if the gem is installed for every implementation, or at least every installed implementation. This again is mostly correct for most users, as they will only have Ruby 1.8 installed. It starts being a bit different for JRuby, as not all of those scripts can be launched through that (but on the other hand, since we cannot set JRuby up as default Ruby implementation with eselect, it shouldn’t be much of a problem). It will become a problem when we have Ruby 1.9 fully supported in Gentoo, as setting it up as default Ruby provider for the system will cause most of the scripts, installed only for Ruby 1.8, to fail.

The problem described above is about a package that lacks an implementation, but conversely the problem also applies when a package is available only for one implementation. Take for instance the (for now unpackaged) Duby, a strongly-typed Ruby-inspired scripting language developed by JRuby developer Charles Nutter. It will only ever be available for JRuby (minus possible reimplementations) as it generates Java Bytecode that JRuby can execute. It also has a duby script, but the ebuild I have here installs a broken wrapper: it calls into /usr/bin/ruby, but of course that cannot ever be JRuby, and problems ensue.
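One possible direction, sketched here only to frame the discussion, is to have the wrapper know which implementations the gem was installed for and fall back to one of them when the system default is not in that list. The implementation names, the preference order and the method name below are all assumptions of mine, not what the eclass does today.

```ruby
# Pick the interpreter a wrapper script should run.
# installed_for: implementations the gem was installed for, e.g. ['jruby']
# default:       the system default interpreter (what /usr/bin/ruby points at)
# The names and the preference order are illustrative.
PREFERENCE = %w[ruby18 ruby19 jruby].freeze

def pick_interpreter(installed_for, default)
  return default if installed_for.include?(default)

  PREFERENCE.find { |impl| installed_for.include?(impl) } or
    raise ArgumentError, 'gem is not installed for any known implementation'
end
```

For a JRuby-only package like Duby this would pick jruby no matter what the system default is, while a gem installed only for Ruby 1.8 would keep working after the default is switched to 1.9.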

Another problem is what Hans tried to solve some time ago: when multiple slots of the same gem are installed, and they all install the same named commands, how do you choose between them? Most of the times, you install them slotted, so you got cmd-${SLOT} named commands around, but you also need to have a way to just call cmd and have it work. Hans worked on eselect-gem for that reason: it’s a generic approach to the same thing that eselect-rails does. Right now, we’re not integrating well with that, so we might need to find a way to handle that.

One of the reasons why I’m now writing all this about the wrappers, is that I’d love for people to comment (after looking at the implementation, possibly, as I’d seriously love to avoid noise due to users wishing ponies, or detractors just saying that RubyGems is perfect — it’s not), with possible approaches we can take. So, comments welcome! And you might want to use the pre HTMLish tag to submit code via the comments, so that it won’t be screwed up by the formatting. You can also use at-symbols for inline code keywords, like I’ve done in the post.