Test comprehensiveness versus replicability — A Ruby Rant

I guess that “A Ruby Rant” could become a regular column on this blog, given how many of my posts over time have been Ruby rants, but let’s not dig further.

I’m trying my best to package the new dependencies introduced with Radiant 1.0.0rc4 so that we can update the package in Portage (given that Radiant 0.9 is in tree already). This is proving quite difficult; even though Radiant upstream helped me out by replacing the old dependency on highline with a modern one after the issue was fixed upstream, there are a few new gems that require hours and hours of work to package.

The first big issue comes with cocaine — this gem is developed by ThoughtBot, the Rubygems.org designers; one would expect highly professional development from them, but that’s far from the case. The gem requires another ThoughtBot-developed library for testing, bourne, which in turn requires mocha — but not just any mocha: up to yesterday it required strictly version 0.9.5; now it requires strictly version 0.10.4, which is an improvement but still not kosher. Why does this happen?

@flameeyes It hooks deep into mocha and breaks on internal changes. We’d love an internal api for mocha, but there isn’t one.

So here you can see one huge issue with Ruby development: since it’s very hard to actually make internal interfaces internal, due to monkey-patching and scope games, people will end up relying on things that they shouldn’t be relying upon. And they pretend that’s a good way to solve the issue because there is no better way. Damn.

Okay, strike one. I have put cocaine aside for now (and that was a good idea, seeing how prodding them about it has gotten them to at least update bourne’s dependency to a version that is not quite as ancient), and worked on another dependency: compass, which is yet another CSS framework.

Testing this particular package has proven quite difficult, because it has a long list of extra dependencies that you won’t find listed in the gemspec at all, but only in the Gemfile. This list includes, among others, compass-validator, a gem only ever required by compass, which in turn requires compass (again without listing it in the gemspec)… and even if this sounds fishy enough, it doesn’t begin to cover how fishy that gem is.

The compass-validator gem bundles a series of prebuilt Java libraries, up to and including W3C’s CSS Validator, and executes Java at runtime. No, this has nothing to do with JRuby, and even when using JRuby it wouldn’t load the Java classes, but would still execute Java. Thanks to Krzysztof, at least I’ve been able to get the gem to not bundle so much Java code; instead it relies on dev-java/css-validator and java-config to call into the right commands. Still nasty, but at least usable.

But this is not yet the nasty problem: the Gemfile lists autotest, which is just a dummy gem for ZenTest (so why not list ZenTest directly? bah!), and then proceeds with two different ways to wrap gems specific to OS X: autotest-fsevent is under a RUBY_PLATFORM check, while rb-fsevent is under a group. The problem is that Bundler always installs all the groups, even though it doesn’t load the ones you exclude.

Okay, so the gem has some specific codepaths for OS X; does that matter much? Probably not, as long as you can still ignore them and run the tests on Linux. But there is more trouble ahead when I reach the terrific livereload gem, which is described this way:

LiveReload is a Safari/Chrome extension + a command-line tool that: 1. Applies CSS and JavaScript file changes without reloading a page. 2. Automatically reloads a page when any other file changes (html, image, server-side script, etc).

Okay, so it requires Chrome at test time… not so surprising, as another of the gems I use (best_in_place) indirectly uses selenium-webdriver, which uses Firefox. The problem is that when you go to the source repository for that gem, you’re told that it’s deprecated (while compass is still actively developed!) in favour of… a graphical Mac/Windows application.

So okay, I’m all for covering as much as possible with tests, but how does that help if you make it impossible for anyone but you to actually run the tests, because you tie them to one specific platform, and one that is very unlikely to be used as a production server?!


Working outside the bubble

One of the most common mistakes a developer can make is to never look or work outside their usual environment. By never looking outside their own little sandbox, they risk losing sight of the improvements happening outside their world, improvements they could easily factor in. This is why I always look at what Fedora, Debian and Ubuntu do, for instance.

Given my involvement with Ruby packaging, one of the blind spots I should try to avoid is Ruby itself; unfortunately, looking at Python packaging is not something I’d be keen on doing anytime soon, given the state of python.eclass (please guys, rewrite it, sooner rather than later!). But at least tonight I spent some time looking at Perl modules’ ebuilds.

The main reason I went to look at those is that a user (Grant) asked me to add Google’s AdWords library for Perl to the main tree, but Perl is something I have wanted to look at for a while, since I want to set up RT for my customers and the new versions require a few more dependencies.

At any rate, looking at Perl ebuilds is not too unnatural for me: while the fact that there is a single Perl implementation makes it much easier for them to implement the phases, the rest of the setup is not too different from what we have in Ruby land.

What seems more common there is that they also set HOMEPAGE by default if none is given, since a lot of modules only have a homepage as part of CPAN; the Ruby world differs only in that most projects are managed on GitHub, which makes that their natural default homepage.

Having finally taken a good look at the g-cpan tool, I have to say that all the attempts at a g-gem tool or similar are quite off target: instead of creating a tool that creates and installs ebuilds for the pre-packaged extensions, we should have coordinated with RubyGems upstream to provide a few more details — such as a way to make explicit the license the extension is released under, which CPAN does and RubyGems doesn’t (you can also see Mandriva struggling with the same issue in the final questions of my FOSDEM talk) — and at that point we could have just created a tool that prepared a skeleton for our own ebuilds, rather than something fully automated like what has been tried before.

Anyway, I really like the idea of trying to package something you’re not used to: it makes it easy to spot what is different and what is similar in the approaches.

Gems make it a battle between the developer and the packager

It is definitely not a coincidence that whenever I have to dive into Gentoo Ruby packaging I end up writing a long series of articles for my blog that should have the “Rant” tag attached, until I finally decide that it’s not worth it and I should rather do something else.

The problem is that, as I have said many times before (and I guess the Debian Ruby team agrees as well), the whole design of RubyGems makes it very difficult to package gems properly, and at the same time provides developers with enough concepts to make the packaging even more tricky than it would be merely due to the format.

As the title says, for one reason or another, RubyGems’s main accomplishment is simply to pit extensions’ developers and distributions’ packagers against each other, with the former group insisting on doing things “fun”, and the latter on doing things “right”. I guess most of the members of the former group also never tried managing a long-term deployment of their application outside of things like Heroku (which is paid to take care of that).

And before somebody tells me I’m being mean by painting developers as childish with their concept of fun: it’s not my fault if, within an hour of tweeting a shorter version of the first paragraph of this post, two people told me that “development is fun”… I’m afraid that for most people that’s what matters: it being fun, not reliable or solid…

At any rate… even though, as we speak, nobody has expressed interest (via flattr) in the packaging of the Ruby MongoDB driver that I posted about yesterday, I started looking into it (mostly because I’m doing another computer recovery for a customer and thus had some free time on my hands while I waited for the antivirus to complete, dd_rescue to copy data over, and so on and so forth).

I was able to get some basic gems for bson and mongo working, which were part of the hydra repository I noted, but the problems started when I looked into plucky, which is the “thin layer” used by the actual ORM. It is not surprising that this gem is also “neutered” to the point of being useless for Gentoo packaging requirements, but there are more issues. First of all, it required one more totally new gem to be packaged – log_buddy, which also required some fixes – a dependency not listed on the RubyGems website (which is proper, if you consider that the tests are not executable from the gem file); but most importantly, it relied on the matchy gem.

This is something I have already had to deal with, as it was in another long list of dependencies last year or the year before (I honestly forget). This gem is interesting: while the package is dev-ruby/matchy, it was only ever available as person-specific gems on Gemcutter: jnunemaker-matchy and mcmire-matchy; the former is the original (0.4.0), while the latter is a fork that fixed a few issues. The main problem: jnunemaker-matchy is available neither as a tarball nor as a git tag.

For the package that originally required matchy for us (dev-ruby/crack), mcmire’s fork worked quite well, and indeed it was just a matter of telling it to use the other gem for it to work. That’s not the case for plucky: even though jnunemaker hasn’t released any version of matchy in two years, it only works with his version of matchy. Which meant packaging that one as well, for now.

Did I tell you that mcmire’s version works with Ruby 1.9, while jnunemaker’s doesn’t? No? Well, I’m telling you now. Just so you know, with 2012 almost upon us, this is a big deal.

And no, there is nothing newer than 0.4.0 yet. Two years after release. The code has stagnated since then.

Oh, and plucky’s tests will fail depending on how Ruby decides to sort a Hash’s keys array. Array comparison in Ruby is (obviously) ordered.

Then you look at the actual mongo_mapper gem that is the leaf of the whole tree… and you find out that running the tests without Bundler fixing the dependencies is actually impossible (due to the three versions of i18n that we have to allow side-installation of). And the Gemfile, while never declaring a dependency on the official Mongo driver (it gets it through plucky), looks for bson_ext (the compiled C extension, which in Gentoo was not going to exist, since it’s actually installed by the same bson package — I’ll have to create a fake gemspec for it just so the dependency can be satisfied).

And this actually brings us to a different problem as well: even though plucky has been updated (to version 0.4.3) in November, it still requires series 1.3 of the Mongo driver. Version 1.4.0 was released in September, and we’re at version 1.5.2.

And I haven’t even named the SystemTimer gem, which is declared a requirement during development (but not by the gem of course, since you’re not supposed to run tests there) only for Ruby 1.8 (actually only for mri18 — what about Ruby EE?), and which lacks any indication of a homepage on the RubyGems website…

I love Ruby. I hate its development.

Changes to the netatalk ebuild

You might remember that a few weeks ago I bought a new system to use as a local fileserver, keeping among other things a complete copy of the distfiles mirrors (and more than that) to use with the tinderbox and the other boxes. But that system is not limited to serving (via NFS) the distfiles for my systems; it’s also set up to serve as the main storage point for customers’ data as well as my own, including my music library (which is primarily managed by iTunes, but is consumed by Banshee as well) — before this, an rsync task copied the iTunes library over to Yamato to then serve it to yet another box… you probably see how duplicating this data is not something I’m happy about.

Well, it turns out that Samba can’t get me anything over 50Mbit of bandwidth over a gigabit network when copying the data over — not nice (and this is without considering that each time I copy files over with Windows 7 I have to answer “Try again” as it reports a network-busy error…). So I decided to try Netatalk again, which I used a long time ago and whose ebuild I partly wrote myself in the past — the result has been satisfying: 200Mbit, which is still not gigabit, but it’s four times as fast as Samba.

Now, since my two OS X systems (okay, three with my mother’s) are all running Lion, what I needed was the 2.2 series (2.2.1 release) with Avahi support; not a problem, I just needed the ~arch version for now. But after using it for a week or so, I started seeing a number of issues that needed to be addressed. The most obvious was that restarting Avahi didn’t cause the AFP daemon to restart, which in turn meant that OS X was unable to connect to the fileserver after a reboot. D’oh.

The Netatalk ebuild seems to have been maintained on life support for a while, simply bumped without checking out all of its internals, so I decided to give it a look. The cause of the issue I hit was obvious: the upstream-provided init script wasn’t designed to integrate well with Gentoo. This is common: upstream projects provide init scripts that do work with Gentoo but don’t follow our guidelines — following them would be especially difficult since we lack good documentation on how to properly write init scripts for the Gentoo init system.

Thankfully, after an afternoon toying with the init scripts and the ebuild, I was able to get a new revision of the ebuild in tree that, while causing a few changes in behaviour, should be much easier to deal with in the future. Since there isn’t any official documentation for Netatalk in Gentoo, it seemed a good idea to document the changes here, so that if somebody is confused by the new ebuild, or has comments on what I have done, this can be used as a reference.

The most noticeable difference from the old design is that the new ebuild installs split init scripts for the services, rather than a single /etc/init.d/netatalk script. This is important for two main reasons: it no longer risks leaving daemons running if they were enabled earlier and stopped later, and it allows checking each daemon’s status individually.

In the default configuration (the details of which I’ll get to in a moment), the netatalk ebuild installs two services: afpd and cnid_metad; the former is the actual file server daemon, the latter is its backend: it provides the CNID metadata for the volumes, and is basically a huge database process. Having the two separate is handy: you no longer need to reload the database when changing the configuration file, which could be a waste of time if you have huge volumes, or volumes with a huge number of files. And most importantly, only afpd talks to the network and needs Avahi support; everything else is, well, backend.
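With split scripts, runlevel configuration becomes per-daemon; a sketch of what I would expect the setup to look like on a default install (service names as described above, commands being the usual OpenRC tools):

```shell
# Enable the file server and its CNID backend at boot...
rc-update add afpd default
rc-update add cnid_metad default
# ...and query each daemon's status individually, something the old
# monolithic /etc/init.d/netatalk script couldn't do.
/etc/init.d/afpd status
/etc/init.d/cnid_metad status
```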

The new init scripts don’t rely on the old /etc/netatalk/netatalk.conf configuration file; this happens by design, as the services don’t really share many settings: most of the variables were used to tell which services to start, and the rest were mostly used to pass custom options to the daemons themselves. In the case of afpd almost all the settings passed were configurable in the afpd.conf setting file as well, and the end result is that using that configuration is the suggested method, instead of passing the options via the init script.

The only option that is shared across different services is related to the AppleTalk protocol, which is what I’ll be talking about now. While most users of Netatalk will likely only need the AFP service, which is the file server itself, the package also implements some other services to be used in AppleTalk-based networks. For those who don’t know the details: AppleTalk was Apple’s own local network protocol, and was, and sometimes still is, used on Ethernet networks with a “wrapper” called EtherTalk.

Apple discontinued support for this technology with the Snow Leopard (10.6) release; and even Leopard itself preferred TCP/IP over AppleTalk whenever possible, so the uses of that protocol (or to be precise, of DDP) are pretty rare. One of my customers still has a Linux box configured with EtherTalk, because their Mutoh large-format printer was configured using it… but even that’s going to change at some point in the future. With this in mind, there is the other noticeable difference from before.

While netatalk previously installed everything for AppleTalk users by default, there is now an appletalk USE flag that needs to be enabled: without it, the atalkd service is not installed, nor are the a2boot, timelord and papd services, which are used, respectively, for network boot, time synchronisation and print-server hosting (after all, CUPS is now developed by Apple itself). This is another good reason why the services are now split: those three services are only installed if you want AppleTalk support, and they all depend on the atalkd service; afpd, on the other hand, will use atalkd only if you configure it to do so, which is not the default for most systems.

But I said that AppleTalk users have shared options between services, and that is true: the name and zone for the atalkd service need to be configured; this is now done through /etc/conf.d/atalkd (and you have to mirror the setting in the extra parameters for afpd if you want it to run over AppleTalk). By default it’ll export the host’s short name as the AppleTalk host name and use no zone, which should do the trick for almost every user out there who is still stuck with AppleTalk, from what I can tell.
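For reference, a hypothetical /etc/conf.d/atalkd matching those defaults — the variable names here are my guesses for illustration, not necessarily the ones the ebuild installs; check the comments in the installed file itself:

```shell
# Hypothetical sketch of /etc/conf.d/atalkd (variable names assumed).
# The defaults described above: the host's short name, and no zone.
#ATALK_NAME="${HOSTNAME%%.*}"   # AppleTalk host name (short host name)
#ATALK_ZONE=""                  # leave empty to use no zone
```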

And to complete the discussion of AppleTalk, there is one catch: the Netatalk package needs kernel support for the protocol stack; this means the ebuild should check the kernel configuration and give you hints on what to enable and why. Unfortunately I have no real idea which settings need to be enabled, so I didn’t add any check for now. If you wish to send a patch to do so, it’ll be very welcome.
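For whoever wants to write that patch: the usual mechanism is linux-info.eclass, which compares CONFIG_CHECK against the running kernel’s configuration. I would guess the DDP stack is behind the ATALK kernel option, but that guess is exactly the part that needs verifying:

```shell
# Hypothetical ebuild fragment; linux-info.eclass reads CONFIG_CHECK
# and warns the user at setup time. ATALK is my guess for the option
# gating the AppleTalk/DDP stack, and would need to be confirmed.
inherit linux-info

pkg_setup() {
	if use appletalk; then
		CONFIG_CHECK="~ATALK"
		check_extra_config
	fi
}
```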

But not all changes are related to the services: new USE flags were introduced to deal with access control lists (ACLs) and userspace quotas, and the old XFS USE flag has been removed: it made sense back in the early days of kernel 2.6, when not everybody had the same set of Linux kernel headers on their system, but nowadays that is only legacy, so support for XFS-style quotas is enabled whenever quotas are enabled, just as it was supposed to be. Very basic support for LDAP sharing of users and groups is also present now, but like Kerberos it’ll need a complex testing network to verify that it actually works as intended, so if you notice anything wrong, please report it. And if you only need AFP to work with modern OS X installations, feel free to disable the ssl USE flag, as it only adds the old DHX 1 user access module (UAM), which is replaced by the libgcrypt-based DHX2 on modern systems.

The change I’m probably most happy about, though, is the replacement of the statically-linked libatalk library with a shared object, which is used both by the utilities and as an exported interface (I’m not sure whether it should have been exported, but right now it is, so there…). This cuts the size of the package from over 8MB down to 4MB, debug information included, which also means a smaller memory footprint when you have more than one service started (and you always have at least two, possibly three, at any time).

Unfortunately this change has brought at least one issue, which is now fixed: since static linking is non-transitive but also not influenced by --as-needed, external library dependencies were expressed not on the common library but on the tools themselves; this became a problem with the shared object, and for a little while the tcpd USE flag couldn’t produce a final link… I fixed that one, but there might be other issues with other combinations of USE flags — I guess this is why we call it “testing”…

Anyhow… if you have suggestions to provide or bugs to report, please don’t refrain from leaving a comment here or opening a bug in our bugzilla — for a while at least, I’ll be the dedicated netatalk maintainer… again… Next week’s tasks include sending the changes upstream so that we won’t need to keep patching this forever.

On releasing Ruby software

You probably know that I’ve been working hard on my Ruby-Elf software and its tools, which include my pride elfgrep, and that they are now available in the main Portage tree, so it’s just an emerge ruby-elf away. To make the package easier to install, manage and use, I wanted to follow Ruby packaging best practices as much as possible, taking into consideration both those installing it as a gem and those installing it with package managers such as Portage. This gave me a few more insights on packaging that had previously escaped me.

First of all, thankfully, RubyGems packaging is starting to be feasible without needing a bunch of third-party software; whereas a lot of software used to require Hoe or Echoe to even run tests, some of it is pulling back and simply using the standard Gem-provided Rake task for packaging; this is also the road I decided to take with Ruby-Elf. Unfortunately Gentoo is once again late in the RubyGems game, as we still have 1.3.7 in use rather than 1.5.0; this is partly because we’ve been hitting our own roadblocks with the upgrade to Ruby 1.9, which is really proving a pain in our collective … backside — you’d expect that in early 2011 all the main Ruby packages would work with the 1.9.2 release just fine, but that’s still not the case.

Integrating the Rubyforge upload, though, is quite difficult, because the Rubyforge extension itself is broken and no longer works out of the box — the main problem being that it tries to use the “Any” specification for CPU, which exists no more, having been replaced by “Other”; you can trick it into using that by changing the automated configuration, but it’s not a completely foolproof system. The whole extension seems pretty much outdated and hastily written (if there is a problem when creating the release slots or uploading the file, the release is left in a halfway state).

For what concerns mediating between a simple RubyGems packaging and still providing all the details needed for distributions’ packaging, while not requiring all users to install the required development packages, I’ve decided to release two very different packages. The RubyGem only installs the code, the tools, and the man pages; it lacks the tests, because there is a lot of test data that would otherwise be installed without any need for it. The tarball, on the other hand, contains all the data from the git repository, plus the gemspec file (which is needed, for instance, in Gentoo for fakegem installs to work properly). In both cases, there are two types of files that are included in the two distributions but are not part of the git repository: the man pages and the Ragel-generated demanglers (which I’m afraid I’ll soon have to drop and replace with manually-written ones, as Ragel is unsuitable for totally recursive patterns like the C++ mangling format used by GCC3 and specified by the IA64 ABI); by distributing these directly, users are not required to have either Ragel or libxslt installed to make full use of Ruby-Elf!

Speaking of the man pages: I love the many tricks I can pull off with DocBook and XSLT; I don’t have to reproduce the same text over and over when the options, or bugs, are the same for all the tools – I have a common library to implement them – I just need to include the common file and use XPointer to tell it which part of the file to pick up. Also, it’s quite important to me to keep the man pages updated, since I took a page out of the git book: rather than implementing the --help option with a custom description, the --help option calls up the man page of the tool. This works out pretty well, mostly because this particular gem is designed to work on Unix systems, so the man tool is always going to be present. Unfortunately in the first release it didn’t work out all too well, as I didn’t consider the proper installation layout of the gem; this is now fixed and works perfectly even if you use gem install ruby-elf.
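The trick can be sketched in shell terms — a hypothetical wrapper, not Ruby-Elf’s actual (Ruby) code — the point being simply that --help defers to man instead of duplicating the documentation:

```shell
#!/bin/sh
# Hypothetical sketch of the git-style --help: instead of printing a
# hand-written summary, open the tool's own man page. The expansion
# ${0##*/} strips the directory part, leaving just the tool's name.
case "$1" in
	--help|-h) exec man "${0##*/}" ;;
esac
```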

The one problem I still have is that I have not yet signed the packages themselves; the reason is actually quite simple: while it’s trivial with OpenSSH to proxy the ssh-agent connection, so that I can access private hosts when jumping from my frontend system to Yamato, I currently can’t find any way to proxy the GnuPG agent, which I need in order to sign the packages; sure, I could simply connect another smartcard reader to Yamato and move the card there to do the signing, but I’m not tremendously happy with such a solution. I think I’ll be writing some kind of script for that; it shouldn’t be very difficult to do with ssh and nc6.

Having now released my first proper Ruby package, and my first gem, I hope to be able to do a better job at packaging, and at fixing others’ packages, in Gentoo.

Maintaining backports with GIT

Last week I wrote about the good feeling of merging patches upstream – even though since then I don’t think I have gotten anything else merged… well, besides the bti fixes that I sent Greg – so this week let’s start with the opposite problem: how can you handle backports sanely, and have a quick way to check what was merged upstream? Well, the answer, at least for software that is managed upstream with GIT, seems quite easy to me.

Note: yes, this is a more comprehensive rehashing of what I posted last December, so if you’ve been following my blog for a long time you might not be extremely surprised by the content.

So let’s start with two ideas: branches and tags. For my system to work out properly, you need upstream to have tagged their releases; so if the foobar project just released version 1.2.3, we need to have a tag available called foobar-1.2.3, v1.2.3, or something along those lines. From that, we’ll start a new “scratch branch”; it is important to note that it’s a scratch branch, because that means it can be force-pushed and might require a complete new checkout to work properly. So we have something like the following:

% git clone git://git.foobar.floss/foobar.git
% cd foobar
% git checkout -b 1.2.3-gentoo v1.2.3

This gives us the 1.2.3-gentoo branch as the scratch branch, and we’ll see how that behaves in a moment. If upstream fails to provide tags you can also try to track down which exact commit a release corresponds to – tricky, but not unfeasible – and replace v1.2.3 with the actual SHA-1 hash of the commit or, even better as you’ll guess by the end of the post, tag it yourself.
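Tagging it yourself is just as quick; a sketch, with deadbeef standing in as a placeholder for whatever commit you identified as the release:

```shell
# Tag the commit you identified as matching release 1.2.3 (deadbeef is
# a placeholder hash; compare the tree against the release tarball to
# find the real one), then branch from the new tag as usual.
git tag v1.2.3 deadbeef
git checkout -b 1.2.3-gentoo v1.2.3
```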

The idea of using a scratch branch, rather than an actual “gentoo branch”, is mostly a matter of simplicity for me. Most of the time I make more than a couple of changes to a project I’m packaging – mostly because I find it easier to fix possible minor autotools issues before they spread throughout the package, and to other packages as well – but the scratch branch carries just the actual fixes I want applied to the packaged version; cleanups, improvements and optimisations I send upstream, waiting for the next release. I didn’t always do it this way, I admit… I changed my mind when I started maintaining too many packages to follow all of them individually. For this reason I usually have either a personal or a “gentoo” branch where I make the changes destined for the master branch, which get sent upstream and merged, and a scratch branch to handle the patches. It also makes adding a custom patch or a backport to a specific version no different (do note, I’ll try to use the word “backport” whenever possible, to stress the importance of getting the stuff merged upstream so that it will hopefully be present in the future).

So let’s say that in the upstream repository there have been a few commits fixing corner-case crashes that, incidentally, always seem to apply to Gentoo (don’t laugh, it happens more often than you’d think). The commits have the short hashes 1111111, 2222222 and 3333333 — I have no imagination for hashes, so sue me.

% git cherry-pick 1111111
% git cherry-pick 2222222
% git cherry-pick 3333333

Now you have a branch with three commits: cherry-picked copies (with different hashes) of the commits you need. At this point, what I usually do is tag the current state (in a few paragraphs you’ll understand why), so that we can get the data out properly. The way you name the tag depends vastly on how you will release the backport, so I’ll get to that right away.

The most common way to apply patches in Gentoo, for good or bad, is adding them to the files/ subdirectory of a package; to be honest this is my least preferred way unless the patches are really trivial, because it means they will be sent down the mirrors to all users, whether they use the software or not; also, given that you can use GIT for patch storage and versioning, it duplicates the effort. With GIT-stored patches, the easiest route is to create a files/${PV}/ subdirectory and store there the patches as exported by git format-patch — easy, yes; nice, no: since, as I’ll explain, you’ll be picking the patches again when a new version is released, they’ll always have different hashes, and thus the files will always differ even when the patch itself is the same. This not only wastes time, it makes the files non-deduplicable and also gets around the duplicated-files check. D’oh!

A more intelligent way to handle these trivial patches is to use a single, combined patch; while patchutils has a way to combine patches, it’s not really smart. GIT, on the other hand, like most other source control managers, can provide you with diffs between arbitrary points in the repository’s history: you can thus use git diff to export a combined, complete patch in a single file (though lacking history, attribution and explanation). This helps quite a lot when you have a few, or a number of, very small patches, one or two hunks each, which would cause too much overhead in the tree. Combining bigger patches this way can also work, but then you’re more likely to compress the result and upload it to the mirrors, or to some storage area, and add it to SRC_URI.

A third alternative, which also requires you to have a storage area for extra distfiles, is using a so-called “patchset tarball”, as a lot of packages already do. The downside is that if you have a release without any patch tarball at all, it becomes less trivial to deal with. At any rate, you can just put in a compressed tar archive the files created, once again, by git format-patch; if you add them under a subdirectory such as patches/ you can then use the epatch function from eutils.eclass to apply them sequentially, simply pointing it at the directory. You can then use the EPATCH_EXCLUDE variable to skip one patch without re-rolling the entire tarball.
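The ebuild side of that can be sketched like this — a hypothetical src_prepare(), assuming the tarball unpacks into ${WORKDIR}/patches:

```shell
# Hypothetical src_prepare() applying a patchset tarball with epatch.
# EPATCH_SUFFIX matches git format-patch's .patch files; EPATCH_FORCE
# lets epatch accept names that don't follow its own arch-prefixed
# convention (see the note below about its original tarball format).
src_prepare() {
	EPATCH_SUFFIX="patch" \
	EPATCH_FORCE="yes" \
	epatch "${WORKDIR}"/patches
}
```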

Note: epatch itself was designed to use a slightly different patchset tarball format, which included specifying the architecture, or all to apply to all architectures. This was mostly because its first users were the toolchain-related packages, where architecture-dependent patches are very common. For the rest of the software, using conditional patches is usually discouraged, and mostly frowned upon; the reason being that it’s much more likely you’ll make a mistake when conditionality is involved — and that’s nothing new, since it was the topic of an article I wrote over five years ago.

If you export the patches as multiple files in filesdir/, you’re not really going to have to think much about naming the tag; for the other two cases you have multiple options: tie the name to the ebuild release, tie it to the CVS revision, and so on. My personal preference is to use a single incremental, non-version-specific number for patch tarballs and patches, and to mix that with the upstream release version in the tag; in the example above, it would be 1.2.3-gentoo+1. This is, though, just a personal preference.

The reason is simple to explain, and I hope it makes sense to others than me; if you tie it to the release of the ebuild (i.e. ${PF}), like the Ruby team did before, you end up in trouble when you want to add a build-change-only patch – take for instance the Berkeley DB 5.0 patch; it doesn’t change what is already installed on a system built with 4.8, it only allows building anew with 5.0; given that, bumping the release in the tree would just waste users’ time – while using the CVS revision will create quite a few jumps (if you use the revision of the ebuild, that is), as many times you change the ebuild without changing the patches. Removing the indication of the upstream version is also useful, albeit rarely, when upstream does not merge any of your patches, and you can simply reuse the same patchset tarball for the next release; this comes in handy especially when security releases are made.

At this point, as a summary you can do something like this:

  • mkdir patches; pushd patches; git format-patch v1.2.3..; popd; tar jcf foobar-gentoo-1.tar.bz2 patches — gets you a patchset tarball with the patches (similarly you can prepare split patches to add to the tree);
  • git diff v1.2.3.. > foobar-gentoo-1.patch — creates the complete patch that you can either compress, upload to the mirrors, or (if very small) put directly in the tree.

Now, let’s say upstream releases version 1.2.4, and integrates one of our patches. Redoing the patches is quick with GIT as well.

% git checkout -b 1.2.4-gentoo
% git rebase v1.2.4

If the changes are compatible, the new patches will apply just fine, and will be updated so that they no longer apply with fuzz; any patch that was already merged upstream will count as “empty” and will simply be removed from the branch. At that point, you can just reiterate the export as described above.

When pushing to the repository, remember to push the various gentoo branches explicitly, and make sure to push --tags as well. If you’re a Gentoo developer, you can host such a repository on git.overlays.gentoo.org (I host a few of them already; lxc, libvirt, quagga …); contributors who are not developers can probably ask for similar repositories to be hosted there as well.
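To make the branch-and-tag push concrete, here is a tiny self-contained sketch (repository names, file contents and the mirror path are all made up) that creates a gentoo branch plus an incremental tag and pushes both to a bare “mirror” repository:

```shell
set -e
cd "$(mktemp -d)"

# upstream history with a release tag
git init -q work && cd work
git config user.email demo@example.com
git config user.name demo
echo upstream > file; git add file; git commit -qm "import"; git tag v1.2.3

# our patch branch, tagged with the incremental gentoo scheme
git checkout -qb 1.2.3-gentoo
echo fix >> file; git commit -qam "gentoo fix"; git tag 1.2.3-gentoo+1

# push the branch explicitly, and the tags as well
git init -q --bare ../mirror.git
git push -q ../mirror.git 1.2.3-gentoo --tags

# the mirror now carries both the branch and the two tags
git --git-dir=../mirror.git tag
```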

I hope this can help other developers dealing with GIT-bound upstreams to lighten their load.

Really want Ruby 1.9 generally available? Read on.

Gentoo currently does not offer Ruby 1.9 to users directly; there are a number of reasons for that, which can be summed up in what Alex described as “not pulling a Python 3 on our users”. Right now, there are next to no packages that need Ruby 1.9, and a lot that do not even work with it. While a minority nowadays, a few won’t even work if it’s installed together with 1.8, let alone configured as the primary provider for Ruby.

Alex, Hans and I have been working for a long time to find a solution, and since last year the definitive solution seems to be Ruby NG, which I originally started in May 2009 after having trouble keeping this very blog alive on the previous vserver — which nowadays only hosts the xine bugzilla.

The road has been uphill from there, as the three pages of posts tagged with RubyNG on this blog can document; trouble with the ideas and implementations, compatibility problems, a huge web of dependencies between packages, various fixes: all of it makes the road to Ruby 1.9 quite difficult for us packagers. At the same time, we’ve been doing our best to ensure that users are given proper software, of good quality. Maybe it’s because I’m deeply involved with QA, maybe it’s because I’m not writing production software daily, but I still think that we shouldn’t be handing out half-assed software just for the sake of it.

That means that most of the time we either don’t add support for Ruby 1.9, or we go deep into fixing the underlying issues to make sure that the software will work upstream, and not just in Gentoo (as otherwise there could be nasty surprises, like some I got, where an application works perfectly fine locally, where software is installed through Portage, and fails on Heroku, which uses plain RubyGems). You can tell how much of a PITA that can be by looking at my GitHub page — it lists mostly Ruby packages that I had to “fork” (branch, actually) to get the fixes in; mostly they have been merged upstream, though sometimes they are dead in the water.

All of this makes the situation quite complex; while I sort-of enjoy working with Ruby and these things, I also noted that it takes a very long time to get the whole dependency web tested and fixed… and it’s the sort of time that, in my personal free time, I just don’t have. I have been packaging (and thus testing and fixing) a few packages that I triaged for some job tasks, and some that I’m still using, on paid work time, but that can’t cover every package out there. I guess the same goes for Alex, Hans and Gordon.

What’s the bottom line? Well, Hans in particular has been doing huge work porting the ebuilds from the old gems.eclass to ruby-fakegem.eclass, so that they can be installed when Ruby 1.9 is present without messing it up, even though they wouldn’t work with it. This makes the day we can get it unmasked much nearer. But there are quite a few cases where we can’t just drop the old version so easily, and it mostly relates to non-gem bindings and the usage of Ruby as a scripting engine (rather than adding support for a library to Ruby itself). And this is without counting further issues, like bundler not working all too well because it lacks dependency information, or getting RubyGems to refuse to mess with the Portage-installed gems altogether (which is now much more feasible than before, since we no longer use the gem command from within Portage to install the stuff).

So what can you do to get this sooner? You can help out by making sure packages work with Ruby 1.9; when they have been positively tested not to work on that version, they are usually marked as such in the ebuild itself; for my part, I always note the problems with a Unicode right-pointing arrow, so running an fgrep command on the tree for “ruby19 →” should give you a very good idea of how many problems (and how many different kinds of problems) there are out there.
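As a self-contained sketch of that search (the mock tree and ebuild contents below are made up; on a real system you would point the command at your Portage tree instead):

```shell
set -e
cd "$(mktemp -d)"
mkdir -p tree/dev-ruby/good tree/dev-ruby/broken

# one package that works, one annotated with the arrow convention
echo '# plain ebuild, no known issues' > tree/dev-ruby/good/good-1.0.ebuild
echo '# ruby19 → tests fail with encoding errors' \
	> tree/dev-ruby/broken/broken-1.0.ebuild

# fixed-string, recursive search; -l lists only the affected files
grep -rlF 'ruby19 →' tree
```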

You have no idea where to start with this thing? There is another option: hire me. Well, I would have liked to say “hire us”, but it turns out both Alex and Hans are not looking to be hired for this, while a project of mine is delivered this week, and then I have some extra time for the next few months. I wouldn’t mind being paid to work full-time on getting Ruby 1.9-ready packages in the tree. I’m a registered freelancer in Italy, so I have a European VAT ID and I can issue proper invoices, so it’s going to be all clear in the books. If you’re interested, you can contact me to discuss pricing and the amount of work you’re looking for.

Just please, stop harassing the team because we’re not as fast as you’d like us to be… we’re already doing a hell of a job in a hell of a hurry!

Ruby-NG: Package in a Bottle (or, learn how to write a new Ruby ebuild)

I have to say that in the months we’ve been working on the new eclasses, I never went on to describe properly how to use them. My hope was to write this documentation straight into the next-generation development manual for Gentoo, but since that project is far from coming to fruition, I’ll just rely on my blog for a little while longer.

As described in my blog posts, the idea behind the “new” eclasses (they have been in the tree for a few months already by now) is to be able both to handle “proper” Gentoo phases for packaging gems, and at the same time to manage dependency and support tracking for multiple Ruby implementations (namely Ruby 1.8, Ruby 1.9 and JRuby right now). How can we achieve this? Well, with two not-too-distinct operations: first of all we avoid using RubyGems as a package manager – we still use, in some cases, the gem format, and we always use the loader when it makes sense – and then we leverage the EAPI=2 USE-based dependencies.

Why should we not use RubyGems’ package management for our objective? With the old gems.eclass we used to encapsulate the install operation from RubyGems inside our ebuilds, but it was all done at once, directly in the install phase of the ebuild. We couldn’t have phases (and related triggers) such as prepare, compile, test and install. In particular we had no way to run tests for the packages at install time, which is one of the most useful features of Gentoo as a basis for solid systems. There are also other problems related to the way packages are handled by RubyGems, including dependencies that we might want to ignore (like runtime dependencies injected by build-time tools), and others that are missing from the specification. All in all, Portage does the job better.

For what concerns the USE-based dependencies: when we merge a package for a set of implementations (one, two, three or any other number), we need its dependencies (at least the non-optional ones) installed for the same set of implementations, otherwise it cannot work (this is a rehashing of the same-ABI, any-ABI dependency problem I wrote about one and a half years ago). To solve this problem, our solution is to transform the implementations into USE flags (actually, they are RUBY_TARGETS flags, but we handle them exactly like USE flags thanks to USE_EXPAND); at that point, when one is enabled for a package, the dependencies need to have the same flag enabled (we don’t care if a dependency has a flag enabled that is not enabled in the first package, though).

This actually creates a bit of a problem, though, as you end up having two sets of dependencies: those that are used through Ruby itself (same-ABI dependencies) and those that are not (any-ABI dependencies), such as the C libraries being wrapped, the tools invoked at runtime through system calls, and so on and so forth. To handle this, we ended up adding extra functions that handle the dependencies: ruby_add_bdepend and ruby_add_rdepend, both of which “split the atoms” (yeah, this phrase sounds nerdy enough), appending the USE-based dependencies to each. They also have a second interface, in which the first parameter is a space-separated (quoted) list of USE flags the dependency is conditional on.
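A hypothetical fragment using these functions could look like the following (the atom names are examples, not any real package’s dependencies):

```shell
# ebuild fragment -- illustrative only
USE_RUBY="ruby18 ruby19"
inherit ruby-fakegem

# same-ABI runtime dependency: roughly expands to
#   ruby_targets_ruby18? ( dev-ruby/mime-types[ruby_targets_ruby18] )
#   ruby_targets_ruby19? ( dev-ruby/mime-types[ruby_targets_ruby19] )
ruby_add_rdepend "dev-ruby/mime-types"

# second interface: the first parameter is the list of USE flags
# the dependency is conditional on
ruby_add_bdepend "test" "dev-ruby/rspec"

# any-ABI dependencies (C libraries, external tools) stay in plain RDEPEND
RDEPEND="${RDEPEND} sys-libs/zlib"
```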

This is not the only deviation from the standard syntax that ruby-ng.eclass causes; the other is definitely more substantial: instead of defining the standard src_(unpack|prepare|compile|test|install) functions, we have two sets of new functions to define: each_ruby_$phase and all_ruby_$phase. This ties into the idea of supporting multiple implementations, as there are actions that you want to take in almost the same way for each of the supported implementations (such as calling up the tests), and others that you want to execute just once (for instance generating, and installing, the documentation). So you get one each and one all function for each phase.

There are more subtle differences of course: in the each type of functions, ${RUBY} is the command to call the current implementation, while in the all functions it’s set to the first available implementation (this is important, as we might not support the default implementation of the system). The end result is that you cannot call either scripts or commands directly; you should, instead, use the ${RUBY} -S ${command} form (for commands in the search path, like rake, at least), so that the correct implementation gets called.
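Putting the two points together, a minimal skeleton of the new phase functions might look like this (the each_ruby_/all_ruby_ names and the ${RUBY} -S form are as described above; the rake tasks and the doc/ directory are package-specific guesses):

```shell
# run once per enabled implementation; ${RUBY} points at the right one
each_ruby_test() {
	${RUBY} -S rake test || die "tests failed"
}

# run a single time, with ${RUBY} set to the first available implementation
all_ruby_compile() {
	${RUBY} -S rake rdoc || die "documentation generation failed"
}

all_ruby_install() {
	dohtml -r doc/* || die "documentation install failed"
}
```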

Oh and of course you cannot share the working directory between multiple implementations, most of the time, especially for the compiled extensions (those written in C). To solve this problem, at the end of the prepare phase, we create an implementation-private copy of the source directory, and we use that in the various each functions; to be on the safe side, we also keep a separate source directory for the all functions, so that the results of one build won’t cause problems in the others. To avoid hitting performance too much, we apply two tricks: the first is to use hardlinks when copying the source directories (this way the actual content of the files is shared among the directories, and only the inodes and metadata are duplicated); the second is to invert the order of the all/each calls in the prepare phase.

While in all other phases all is executed after the implementation-specific functions, all_ruby_prepare is executed before the other prepare functions… which are preceded by the copying, of course. This means that the changes applied during the all_ruby_prepare function are made on the single generic directory, which is then copied (hardlinked) to the others.

So this covers most of the functionality of ruby-ng.eclass, but another tightly-related eclass was added at the same time: ruby-fakegem.eclass. Like the name lets you guess, this is the core of our ditching RubyGems as a package manager entirely. Not only does it give us support for unpacking the (newer) .gem files, but it also provides default actions to deal with testing, documentation and installation; and of course, it provides the basic tools to create fake RubyGems specifications, as well as to wrap gem-provided binaries. An interesting note here: all the modern .gem files are uncompressed tarballs, containing a compressed metadata YAML file and a compressed tarball with the actual source files; in the past, a few gems instead used base64/MIME encoding to stick the two component files together. For ease of maintenance, and for sanity, we’ve decided to only support the tarball format; the older gems can either be fixed, worked around or replaced.
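To see the format for yourself, this sketch builds a mock .gem the same way (the names and contents are invented): a plain, uncompressed tar holding a gzipped metadata file and a gzipped tarball of the sources:

```shell
set -e
cd "$(mktemp -d)"
mkdir -p data

# the two components: compressed YAML metadata, compressed source tarball
printf 'name: demo\nversion: 1.0.0\n' > metadata
echo "puts 'hello'" > data/demo.rb
gzip -c metadata > metadata.gz
tar -C data -czf data.tar.gz demo.rb

# the .gem itself is just an *uncompressed* tar of the two
tar -cf demo-1.0.0.gem metadata.gz data.tar.gz

tar -tf demo-1.0.0.gem
```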

The boilerplate code in ruby-fakegem assumes that most gems will have their documentation generation, and their tests, handled by means of rake; this is indeed the most common situation, even though it’s definitely not universal among projects. As I said before, Ruby’s motto is definitely “there are many ways to skin a cat”, and there are so many different testing frameworks, with different task names, that it’s not possible to have the exact same code work for all the gems unless you parametrise it. The same goes for the documentation building, even when the framework is almost always the same (RDoc; although there are quite a few packages using YARD nowadays, and a few using Hanna — which we don’t have in the tree, nor will support, as it requires a specific, older version of the RDoc gem). The result is that we have two variables to deal with that: RUBY_FAKEGEM_TASK_TEST and RUBY_FAKEGEM_TASK_DOC, which you can set in the ebuild (before inheriting the eclass) to call the correct task.

Now, admittedly this goes a bit beyond the normal ebuild syntax, but we found it much easier to deal with common parameters through variables set before the inherit step, rather than having to write the same boilerplate code over and over… or have to deduce it directly from the source code (which would definitely have wasted much more time). Together with the two variables above we have two more to handle documentation: RUBY_FAKEGEM_DOCDIR, used to tell the eclass where the generated documentation is placed, so that it can be properly installed by the ebuild, and RUBY_FAKEGEM_EXTRADOC, which provides a quick way to install “read me” files, change logs and similar standalone documentation files.

Finally, there are two more variables that handle further installation details. RUBY_FAKEGEM_EXTRAINSTALL is used to install particular files or directories from the sources to the system; this is useful when you have things like Rails or Rudy wanting to use, at runtime, some of the example or template files they ship with; these are simply installed in the tree as if they were part of the gem itself. RUBY_FAKEGEM_BINWRAP is the sole glob-expanded variable in the eclass, and tells it to call the “binary wrapper” (not really binary, but rather a script wrapper; the name is due to the fact that it refers to the bin/ directory) for the given files, defaulting to all the files in the bin/ directory of the gem; it’s here to be tweaked because in some cases, like most of the Rudy dependencies, the files in the bin/ directory are not really scripts that are useful to install, but rather examples and other things that we don’t want to push into the system paths. It also comes in handy when you want to rename the default scripts for whatever reason (like when they are actually slotted).
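Pulling the variables together, the head of a hypothetical fakegem ebuild could read as follows (every value here is illustrative, not taken from a real package):

```shell
# hypothetical mygem-1.0.0.ebuild head -- values are examples only
RUBY_FAKEGEM_TASK_TEST="spec"          # the gem uses rspec, not test
RUBY_FAKEGEM_TASK_DOC="rdoc"
RUBY_FAKEGEM_DOCDIR="doc"              # where the rdoc task leaves its output
RUBY_FAKEGEM_EXTRADOC="README.rdoc CHANGELOG"
RUBY_FAKEGEM_EXTRAINSTALL="templates"  # runtime data shipped with the gem
RUBY_FAKEGEM_BINWRAP="mygem"           # only wrap this script from bin/

inherit ruby-fakegem
```

Note that all of these have to be set before the inherit line, since the eclass reads them while being sourced.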

What I have written here is obviously only part of the process that goes into making ebuilds for the new eclasses, but it should give enough details for other interested parties to start working on them, or even porting them. Just one note before I leave you to re-read this long and boring post: for a lot of packages, the gem does not provide documentation, or a way to generate it, or tests, or part of the data files needed for the tests to run. In those cases you really need to use a tarball, which might come straight out of GitHub, if the repository is tagged, or might require you to toy with commit IDs to find the correct commit. Yup, it’s that fun!

Mistakes to make your Gem a PITN

Because of course we all have pains in the neck and nowhere else. And let me warn you: the timeframe in this post is messed up, because I wrote it over the past week or so, trying to avoid ranting every day; so when I say “today” or “yesterday”, it’s quite relative.

So I’m still going on with the ports of the dev-ruby ebuilds to ruby-ng; I gave priority to the packages that are needed for Typo (which is the software I use for this blog) so I could also test them properly first-hand. The results aren’t excessively bad, I’d say; actually, the space wasted on my server was reduced. I had to make a fix to my remove-3rdparty Typo branch to be able to use the new will_paginate gem (since the new version on Gemcutter does not have the mislav- prefix like the one from GitHub had), but the result is, after all, not bad at all.

As I said before, there are quite a few common problems with gems that I’d like to point out, so that future Ruby gem developers can try to avoid repeating these mistakes:

  • missing licensing information, or too much licensing information: while Gentoo is definitely not Debian, nor are we Fedora, and we thus tend to have a much more “relaxed” approach in terms of licensing for all the packages falling outside the system set (since we don’t redistribute binaries but only sources), it’s not really nice when a package lacks any kind of licensing information, or when the LICENSE file and the README file provide conflicting licensing information;
  • missing tests, specs, or missing data files: I found quite a few gems that don’t package their tests (or specs, if they use RSpec), or that package the spec files but not the data files they use; this is quite a problem since it makes the package unverifiable; it gets even nastier when the files are not even in the upstream repository!;
  • bogus build-time dependencies: this almost happened today with addressable (even though, kudos to its author, addressable itself is quite nice to work with!), as Gemcutter reports a build-time dependency on launchy … and indeed it’s there, in the declaration; on the other hand, the only thing it’s used for is opening the rcov output in a browser, and it’s not even needed to run the Rakefile; the bottom line is that I didn’t need to push launchy and configuration into the main tree (both had the previously noted problems) even though the gem declares them as needed — this is one reason why I don’t think we should “auto-generate” ebuilds;
  • single-version dependencies: if you only allow a precise version of another gem to be used with yours, you’re going to play catch-up ad infinitum to make sure that your gem stays usable; Hans was bitten by this with cucumber, and I sidestepped it in actionwebservice for Typo, since the gem required Rails 2.3.3 (or 2.3.4, depending on the version) while I wanted to run the last secure one (2.3.5);
  • fork the code like bunnies: this is probably caused by GitHub’s bad habit of proposing that you fork code continuously; the already-mentioned actionwebservice is quite problematic from that point of view: the 2.3.3 version was released by datanoise, 2.3.4 by dougbarth, and we now have two possible gems for 2.3.5;
  • use an unmaintained documentation system: while RDoc is almost standard in the Ruby world, I found two packages last night (sinatra and rack-test) that use an alternative, haml-based system: Hanna, which unfortunately only works with an older version of RDoc; for this reason, the documentation for neither is built and installed.

There are probably more common mistakes that you’ll have to look out for in the future, but this at least is a first list, although it’s probably rehashing most of the stuff I have ranted about in the past.

And one important announcement here: if you care about Ruby 1.9, you should know that we’re currently in a huge mess: a lot of code fails tests with Ruby 1.9, so I’ve been dropping support for it wherever it cannot be tested. The same happens with JRuby, though often for different reasons. And testing all the packages to re-add Ruby 1.9 where possible is going to take quite a bit of time. For this reason, you really should either try to help out by testing it yourself, or find a way to support us through the job (I can be hired for the task).

ModSecurity, antispam and books

You probably remember that I wrote quite a bit about my use of ModSecurity to handle antispam for the blog’s comments, as it allows me to verify the User-Agent header as well as keep a few extra tricks up my sleeve without enabling forced registration, captchas or forced comment moderation. Among other things, it allowed me to also disable the 60-day limit for comments on posts: now all the posts have commenting enabled indefinitely.

But I think I already ranted about the lack of good documentation for ModSecurity: while it’s definitely powerful, it also has a few rules that are definitely draconian, and that makes it almost impossible to use without fiddling in most use cases. Part of this, in my opinion, has to do with what ModSecurity was designed for in the first place: putting a stop to vulnerabilities in broken PHP code. I’m not the one singling out PHP here; they did: more than a couple of rules are designed to work around common PHP code errors. While this can probably be considered good enough, it shows its problems when used with Rails (for instance, the “duplicate parameter” rules break Rails pretty badly). For this reason, in Gentoo I disable some of the worst rules by default (you can still get the originals by using the vanilla USE flag).

Now, earlier this month, before my one-week vacation, Packt Publishing asked me to review a book (which they published last week) on the subject: ModSecurity 2.5 by Magnus Mischel. I’m still reading through it, given my usual time constraints (and a few unusual ones, including my birthday yesterday), but I can say something about it already: give it a read.

It starts from the very basics of how ModSecurity works, and that is very good, as it’s exactly what the original documentation lacks. At the start I actually had the wrong impression that it was going to take too much of a “newbie” angle, but indeed there are some very basic tricks that might not be obvious at all, even if you’ve been roaming through the ModSecurity documentation for a while.

You could say that reading this book has been pretty helpful to both me and Gentoo: on one side, I’m understanding how to improve the antispam rules so that they can be published and made available for others to use (I’m considering publishing my own rule set, not only for the antispam, but also as a measure of protection against marketing crawlers that waste everybody’s bandwidth); on the other side, there has been at least one dependency (on mod_unique_id) that I didn’t know about, but which is now fixed in the ebuild you can find in the tree.

Bottom line is, if you’re planning to do any serious work with ModSecurity, this is definitely a must-read text. Kudos to Magnus; his work is definitely quality work. You can get the book and PDF directly from Packt, or get it from Amazon (associate link) if you prefer.

And thanks again to Packt (and Magnus) for the opportunity to improve the Gentoo packaging: I now know of a couple more things I should be looking at fixing in the near future.