Gentoo binhosts notes

I’ve been meaning to write about this for a while, and since the Gechi asked me about something related, I thought this was as good a moment as any. You probably all know that I run a tinderbox that tries to ensure that the packages in Gentoo build, and work at least as intended. But as the description page says, the same name has been used by other projects, such as the official Infra one, which provides logs, reverse dependency information, and binary packages for various architectures.

Given that, some people have asked me whether I can provide binaries of the packages built by my tinderbox; the answer is “not a chance”, for a few reasons that are more complex than a first glance would suggest. And that is mostly aside from the basic problem that Gentoo has very shabby support for binary packages, as both Fabio and Constanze could easily tell you.

Among these problems, the first one at hand is licenses (and here the expert would probably be Matija, as he’s quite interested in getting them right). Not only does the tinderbox accept almost all the licenses available in Portage, so that it can merge as many packages as possible, but it also does not turn the bindist USE flag on. That flag is used to make sure that the result is actually distributable: it disables features that would link GPL-incompatible libraries into GPL software, it disables features that are known to be patented and shouldn’t be used, and it makes sure that trademarks are respected (as with Firefox). Both of these issues make a good deal of the generated binaries non-redistributable.
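To make the two conditions above concrete, here is a minimal sketch of the check a binhost would have to run on every package before publishing it. Everything in it is an assumption made for illustration: the function name, the short whitelist of licenses, and the idea that the LICENSE value has already been flattened to a plain list (real LICENSE strings can contain USE-conditional and any-of groups, which this ignores). It is not something Portage or the tinderbox provides.

```python
# Illustrative whitelist only; a real binhost would need a reviewed list.
REDISTRIBUTABLE = frozenset({"GPL-2", "GPL-3", "LGPL-2.1", "BSD", "MIT"})


def is_redistributable(licenses, use_flags, iuse):
    """Return True if a built package looks safe to publish.

    licenses  -- flattened list of license names the package was built under
    use_flags -- set of USE flags that were enabled for the build
    iuse      -- set of USE flags the ebuild declares
    """
    # Every license involved must allow redistribution of binaries.
    if not set(licenses) <= REDISTRIBUTABLE:
        return False
    # If the ebuild offers a bindist flag, the build must have enabled it.
    if "bindist" in iuse and "bindist" not in use_flags:
        return False
    return True
```

A package that declares bindist but was built with the flag off, which is the tinderbox’s situation, is rejected even when its license is fine.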

But that’s not all: even when the license would let me redistribute the binaries, copyleft licenses require me to redistribute the sources as well. Just pointing at the Gentoo mirrors is not a good option there; I would have to provide all the downloaded and used distfiles, and all the ebuilds as used to build that binary. You might guess it’s neither an easy task nor one that requires just a bit of online space.

Now, even after you tackle all these issues (enabling the bindist USE flag, accepting only true open source licenses that allow redistribution, and providing a way to publish all the sources together with the binaries), you’re still not set. The remaining problems are technological ones, tied to the way my tinderboxing scripts are designed to work. Even without counting the fact that there are so many flag combinations that you have to limit yourself to a sane subset, actually building every package in the tree gives me a number of headaches, starting with the automagic dependencies that would make the binary packages unusable.

On a side note, I’ve been thinking for a while about finding a way to set up dependency verification, to ensure at least that no automagic dependencies enter the ELF files; unfortunately this is not as straightforward as I’d like it to be. New-style virtuals mean that the dependency is hidden under a second layer of packages, which in turn makes it difficult to pinpoint the error conditions. Think of the PyCURL trouble that I pointed at a few weeks ago.
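As a rough illustration of what such a verification could look like, the sketch below compares the libraries an ELF file links against with the dependencies its ebuild declares, expanding virtuals one level so that a provider hidden behind a virtual is not reported as an error. The function names and the sample data are hypothetical, and the inputs (the needed sonames, the soname-to-package map, the virtual providers) are assumed to have been collected beforehand by some other tool; this is not an existing Portage feature.

```python
def expand_virtuals(declared, virtual_providers):
    """Return the declared dependencies plus every provider of each virtual."""
    expanded = set()
    for dep in declared:
        expanded.add(dep)
        # A virtual stands for any one of its providers.
        expanded.update(virtual_providers.get(dep, ()))
    return expanded


def find_automagic_deps(needed, soname_owners, declared, virtual_providers):
    """Map each undeclared package to the sonames that pulled it in."""
    allowed = expand_virtuals(declared, virtual_providers)
    undeclared = {}
    for soname in sorted(needed):
        owner = soname_owners.get(soname)
        if owner is None:
            continue  # owner unknown: nothing can be concluded
        if owner not in allowed:
            undeclared.setdefault(owner, []).append(soname)
    return undeclared


# Illustrative data only, not taken from a real package.
needed = {"libcurl.so.4", "libjpeg.so.8", "libgnutls.so.26"}
owners = {
    "libcurl.so.4": "net-misc/curl",
    "libjpeg.so.8": "media-libs/libjpeg-turbo",
    "libgnutls.so.26": "net-libs/gnutls",
}
declared = {"net-misc/curl", "virtual/jpeg"}
providers = {"virtual/jpeg": {"media-libs/jpeg", "media-libs/libjpeg-turbo"}}

report = find_automagic_deps(needed, owners, declared, providers)
```

With this data only net-libs/gnutls is reported, since the JPEG library is covered through the virtual. The hard part in practice is building those input maps reliably, which is exactly where the extra layer of virtuals gets in the way.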

What would actually work to produce decently working binary packages is the method that Patrick has been using for tinderboxing: build each package one by one, with only its own declared dependencies merged in. After all, this is the method that the Debian- and RPM-based distributions follow. This has the opposite side effect of possibly failing if an indirect (transitive) dependency is missing, which often happens with pkg-config, but at least you wouldn’t find automagic dependencies in the binary packages themselves.

And as I noted in passing above, there are a number of problems related directly to the way Portage itself manages binary packages: it does not have a properly clean way to track the dependencies that are actually needed (ABI-wise, that is), and let’s not even get into the tracking of Python dependencies; and the format itself is not really flexible enough, which causes headaches when trying to deal with special features like file-based capabilities.

So, good luck if you want to provide public binhosts; for the most part, I have neither the time nor the will to even think about them.

Ruby-NG: Bin Man (or, the binwrapper problems)

One of the problems that we definitely need to hash out before we start marking the ebuilds based on the new Ruby eclasses as stable is the handling of the current “binwrappers”. I sincerely dislike the name: while they obviously live in the bin directory of the gems, they are definitely not binaries, but rather executable scripts. Sigh, let it be for now though.

RubyGems already creates a wrapper by itself, which calls the correct (latest) binary for a given gem. We don’t use that wrapper, though, but a different one that can be generated by the ebuild with much more stable targets. The end result is generally pleasing, as we can use the same wrapper for any implementation the gem is installed for. But here the trouble starts.

Right now, this all works fine only if the gem is installed for every implementation, or at least for every installed implementation. Again, this is fine for most users, as they will only have Ruby 1.8 installed. It starts being a bit different for JRuby, as not all of those scripts can be launched through it (but on the other hand, since we cannot set JRuby up as the default Ruby implementation with eselect, it shouldn’t be much of a problem). It will become a problem once we have Ruby 1.9 fully supported in Gentoo, as setting it up as the default Ruby provider for the system will cause most of the scripts, installed only for Ruby 1.8, to fail.

The problem described above arises when a package lacks support for one implementation, but the converse problem applies when a package is available only for one implementation. Take for instance the (for now unpackaged) Duby, a strongly-typed, Ruby-inspired scripting language developed by JRuby developer Charles Nutter. It will only ever be available for JRuby (barring possible reimplementations), as it generates Java bytecode that JRuby can execute. It also has a duby script, but the ebuild I have here installs a broken wrapper: it calls into /usr/bin/ruby, but of course that can never be JRuby, and problems ensue.
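One possible direction, sketched here purely as a starting point for discussion, is to have the wrapper choose its interpreter from the list of implementations the gem was actually installed for, instead of always calling /usr/bin/ruby. The sketch is in Python only to show the selection logic; the function name, the implementation labels and the interpreter paths are assumptions, and this is not what the current eclass wrapper does.

```python
def pick_interpreter(default_impl, installed_for, interpreters):
    """Choose the interpreter a wrapper script should run under.

    default_impl  -- implementation currently selected as the system Ruby
    installed_for -- implementations the gem is installed for, in order
                     of preference
    interpreters  -- map of implementation name to interpreter path
    """
    # Prefer the system default when the gem supports it.
    if default_impl in installed_for and default_impl in interpreters:
        return interpreters[default_impl]
    # Otherwise fall back to the first supported implementation present.
    for impl in installed_for:
        if impl in interpreters:
            return interpreters[impl]
    raise LookupError("no usable Ruby implementation for this gem")


# Assumed names and paths, for illustration only.
interpreters = {
    "ruby18": "/usr/bin/ruby18",
    "ruby19": "/usr/bin/ruby19",
    "jruby": "/usr/bin/jruby",
}

# A JRuby-only package such as Duby, on a system defaulting to Ruby 1.8.
duby_interpreter = pick_interpreter("ruby18", ["jruby"], interpreters)
```

The same logic would cover the Ruby 1.9 case: a script installed only for Ruby 1.8 keeps running under 1.8 even after the system default changes.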

Another problem is what Hans tried to solve some time ago: when multiple slots of the same gem are installed, and they all install identically named commands, how do you choose between them? Most of the time you install them slotted, so you get cmd-${SLOT} named commands around, but you also need a way to just call cmd and have it work. Hans worked on eselect-gem for that reason: it’s a generic approach to the same thing that eselect-rails does. Right now we’re not integrating well with that, so we might need to find a way to handle it.
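To illustrate the question rather than answer it, here is a small sketch of how a plain cmd could be resolved to one of the slotted commands: use an explicit selection when there is one, otherwise fall back to the highest installed slot. The function names are hypothetical, slots are assumed to be dotted numbers, and nothing here reflects how eselect-gem or eselect-rails are actually implemented.

```python
def slot_key(slot):
    """Turn a dotted slot such as '2.10' into a tuple for numeric ordering."""
    return tuple(int(part) for part in slot.split("."))


def resolve_command(name, installed_slots, selected=None):
    """Return the slotted command name that a plain `name` should run."""
    if not installed_slots:
        raise LookupError("no slot of %s is installed" % name)
    if selected is not None:
        # An explicit choice wins, but it has to be installed.
        if selected not in installed_slots:
            raise LookupError("slot %s of %s is not installed" % (selected, name))
        slot = selected
    else:
        # No choice made: default to the highest slot, compared numerically.
        slot = max(installed_slots, key=slot_key)
    return "%s-%s" % (name, slot)
```

Comparing slots as tuples of numbers matters here, because a plain string comparison would rank 2.3 above 2.10.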

One of the reasons why I’m now writing all this about the wrappers is that I’d love for people to comment with possible approaches we can take (after looking at the implementation, if possible, as I’d seriously love to avoid noise from users wishing for ponies, or from detractors just saying that RubyGems is perfect; it’s not). So, comments welcome! And you might want to use the pre HTMLish tag to submit code via the comments, so that it won’t be screwed up by the formatting. You can also use at-symbols for inline code keywords, like I’ve done in the post.