This Time Self-Hosted

Ruby and Automagic Dependencies

I wrote last week about one real-world case of an automagic dependencies screwup (and I started the automagic fixing guide a long time ago). One would expect that I would rather not see automagic dependencies anywhere, but that wouldn’t be quite correct, since the whole topic is trickier than that.

Indeed, in Ruby (as well as Perl, Python and almost all dynamic languages), automagic dependencies are very much part of the game: you load code and enable features conditionally, depending on whether the needed libraries are present on the system. There is, though, one very important distinction to make here: automagic dependencies in dynamic languages are resolved at runtime, not at build time. This means that the outcome depends on the running system’s configuration rather than the build system’s.
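
To make the runtime part concrete, here is a minimal sketch of the pattern in Ruby. The helper name optional_require, the constants and the made-up library name are all assumptions for illustration; real gems usually just inline the begin/rescue around the require call.

```ruby
# Hypothetical helper: try to load an optional library at runtime and
# report whether it is available, instead of failing when it is not.
def optional_require(name)
  require name
  true
rescue LoadError
  false
end

# 'json' ships with standard Ruby installs; the second name is made up
# for this example and is not expected to exist anywhere.
HAVE_JSON  = optional_require('json')
HAVE_BOGUS = optional_require('no_such_library_for_this_example')

# The feature list is decided by what the *running* system has installed.
def features
  list = [:core]
  list << :json_support if HAVE_JSON
  list << :bogus_support if HAVE_BOGUS
  list
end
```

Nothing here is decided when the package is built: installing or removing the optional library later changes what features returns on the next run.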

This distinction is particularly important. When an ebuild needs to rely on a feature that is enabled automagically at build time, you have nearly no way to do so: you have to check for its presence and abort if it is missing, which is bad. With runtime-automagic dependencies, you just need to add the library that provides the feature as a dependency, and you’re done.

Let’s see a common example in Ruby packaging. The echoe gem provides some common packaging tasks, wrapping around features like rcov coverage and rspec testing. You usually need Echoe at build time whenever you need to access the Rakefile (so for doc generation, some testsuites, and building native extensions). If you need to execute the RSpec-based testsuite, you obviously need RSpec as well: you depend on both, and Echoe “enables” the conditional support for running that testsuite. Done: you don’t need to reinstall Echoe, and thus you don’t need a USE flag on Echoe. And if you just needed to rebuild the documentation, well, you won’t be needing RSpec anyway!

_Note this is a simplification; Echoe up to the 4.3 release didn’t automagically depend on RSpec, but simply required it as soon as the package using Echoe had spec files available. I’ve fixed this for Gentoo and the fix has been upstream since 4.3.1, so that we don’t have to depend on RSpec even for documentation building. And, actually, we’ve started using RSpec directly rather than Rake to run the easiest testsuites, because sometimes the Rakefiles are so full of shitty dependencies that it’s easier to ignore them than to fix them. When I do so, I usually also patch the Rakefile upstream so that in the future we can migrate away from that custom code and back to generic Rake calls._

Another example: the Storable gem allows the use of JSON data structures, via either the JSON gem or the YAJL-Ruby library. By default it requires neither, and without one of them it won’t support JSON data; if you’ve got a package that needs the JSON support in Storable, rather than having a USE flag on Storable you should just depend on both Storable and either of the two JSON libraries. Easy as pie, even though it might sound a bit cumbersome.
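
A rough sketch of what such an either-or automagic dependency can look like follows. This is not Storable’s actual code: the JsonSupport module and its methods are hypothetical, and the only library calls assumed are require 'yajl' with Yajl::Encoder.encode from YAJL-Ruby, and JSON.generate from the JSON gem.

```ruby
# Hypothetical module, for illustration only (not taken from Storable).
module JsonSupport
  # Pick whichever JSON library happens to be installed, preferring
  # YAJL-Ruby; end up with nil when neither can be loaded.
  BACKEND =
    begin
      require 'yajl'
      :yajl
    rescue LoadError
      begin
        require 'json'
        :json
      rescue LoadError
        nil
      end
    end

  # True when some JSON library was found at load time.
  def self.available?
    !BACKEND.nil?
  end

  # Serialize obj with the detected backend, or fail loudly if the
  # feature was requested on a system that has neither library.
  def self.dump(obj)
    case BACKEND
    when :yajl then Yajl::Encoder.encode(obj)
    when :json then JSON.generate(obj)
    else raise NotImplementedError, 'no JSON library installed'
    end
  end
end
```

A package that calls something like JsonSupport.dump is the one that has to list a JSON library among its dependencies; the library doing the detection needs no flag of its own.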

The reason why I stress that these things shouldn’t have USE flags is that we want USE flags to be, generally, absolute: if I built Storable with USE=-json I’d expect it not to support JSON. But as it is, it might very well support it, if either JSON or YAJL-Ruby were installed as a dependency of another package. Patching the code to forcefully disable it would be silly, and blocking the packages that could be used would just increase complexity.

So for most Ruby libraries, we have to accept automagic dependencies as part of the game. But is that alright? Does it not add problems? I actually know it does, and it’s one of the main reasons why Ruby-NG ebuilds have, unfortunately, changed so much since I started working on them last year.

The first problem shows up when you look carefully at the automagic Rakefiles: so many of them enable testsuites only if RSpec is present, documentation building only if a given documentation system is present (RDoc2, Hanna, …), GemSpec building only if Jeweler is present, and so on. This is good because it means we don’t have to depend on, say, the documentation libraries when we’re not building documentation, but it also creates situations in which we can’t be sure that our ebuild works: what if a gem we haven’t installed is installed on the user’s system? It might not even be in our tree, but only in some overlay. What if, instead, something is missing and we forgot to add a dependency? Does the documentation building switch off cleanly if the needed libraries are not found?

I guess the most common situation up to now involves gems using Jeweler. Like Echoe, Jeweler is middleware used for packaging purposes; unlike Echoe, we don’t usually need it at build time because it only takes care of the proper Gems tasks (and we, generally, don’t care about the gem metadata). We’ve been able, up to now, to stay away from packaging Jeweler at all, which is good given that it’s quite complex to package. I guess we have to thank Jeweler’s usage examples, which show how to make it properly optional.

Unfortunately, even though rake -D will complete properly when you don’t have Jeweler installed, and documentation building usually works just as fine, it fails as soon as you try to run the test or spec task (the testsuites). Why? Because the Rakefile adds a prerequisite task to the testsuite: check_dependencies, provided by Jeweler, which is used to make sure that the gem is generated with the correct dependency information. It’s usually easy to fix: in Gentoo we can just sed away the line (or run the testsuite manually), and upstream we move the prerequisite declaration into the conditional block that only runs when Jeweler is actually found. But easy or not, it’s bothersome.
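
The upstream fix can be sketched as the following Rakefile fragment. It’s an illustration under assumptions, not any specific gem’s Rakefile: the gem name is made up, the test task body is a placeholder, and the only Jeweler pieces assumed are Jeweler::Tasks.new and the check_dependencies task it defines.

```ruby
require 'rake' # only needed when this is loaded outside of the rake command

# Placeholder testsuite task; a real Rakefile would set up its
# test or spec runner here.
task :test do
  puts 'running the testsuite'
end

begin
  require 'jeweler'
  Jeweler::Tasks.new do |gem|
    gem.name = 'example' # hypothetical gem name
  end
  # Declared inside the block, so the prerequisite only exists when
  # Jeweler does. Leaving this line outside the begin/rescue is what
  # breaks "rake test" on systems without Jeweler.
  task :test => :check_dependencies
rescue LoadError
  warn 'Jeweler not available, gem packaging tasks are disabled'
end
```

With Jeweler missing, the test task has no prerequisites and runs on its own; with Jeweler installed, check_dependencies still runs first, as upstream intended.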

Less of a hassle, but more of an open question, is how to handle optional dependencies in testsuites. Whenever a package has automagic dependencies, its testsuite may handle them in one of two opposite ways: it might require that all the optional dependencies are available, so that they all get tested, or it might exclude the tests for those packages that couldn’t be found. The former option leaves us packagers with no choice; the latter leaves it up to us to decide whether we want to test the deepest or stay shallow.
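
The two approaches can be sketched in plain Ruby as follows. Everything here is hypothetical (the test group names, the made-up library name, the select_tests helper); it only illustrates the difference between requiring every optional dependency and skipping what is missing.

```ruby
# Hypothetical mapping from a test group to the optional library it exercises.
OPTIONAL_TESTS = {
  'json backend'  => 'json',
  'bogus backend' => 'no_such_library_for_this_example'
}.freeze

# True when the named library can be loaded on the running system.
def library_available?(name)
  require name
  true
rescue LoadError
  false
end

# strict: true  -> every optional dependency must be present (first method)
# strict: false -> groups whose library is missing are skipped (second method)
def select_tests(strict:)
  plan = { run: [], skipped: [] }
  OPTIONAL_TESTS.each do |group, lib|
    if library_available?(lib)
      plan[:run] << group
    elsif strict
      raise LoadError, "#{lib} is required to run '#{group}'"
    else
      plan[:skipped] << group
    end
  end
  plan
end
```

In strict mode the packager has to pull in every optional library just to run the tests; in the lenient mode, how deep the testing goes depends on which dependencies the ebuild chooses to add.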

The obvious choice for extra safety is to always test the deepest; but this is not free of problems. What happens if the packages are not available? The automagic, conditional code might not work well in the opposite situation (which is just as bad as the code failing). What if the packages are not available for a given implementation? While we can (and do, for some packages) add dependencies just for one or two implementations, it’s cumbersome. And most importantly, adding all the optional packages to the dependencies makes the dependency tree bigger and much more complex. For instance, Rack would use almost all the available Ruby-based webservers during its test phase. Do you think it’d be okay to add all of them as dependencies? (And that is not to mention that some of those depend on Rack!)

I hope you can now see why it is still quite a problem for us to properly package Ruby libraries, and why, even if it’s just one or two packages, I often end up working a whole day on Ruby ebuilds to make sure they work as intended… It also shows in the number of repositories I have on GitHub (none of which I started; they are all “forks” of packages I fixed for Gentoo).

Comments 2
  1. Would it be possible to create a build-time wrapper around rubygems that only resolves gems that have been given in DEPEND (and their dependencies)? Maybe this wouldn’t work so well in practice, but it’s just a thought.
