Bye bye, Mongo

It was not even six months ago that I written about my monster — the web app that used Mongo and sent PDF to be printed to my home server over IPv6 Internet. But still, the monster is now dead; I’m now considering whether I want to unleash the sources to the world or whether I should just bury it altogether.

What’s going on? Well, from one side I stopped working as a MSP — it doesn’t pay enough for me to actually pay taxes as well as live in Italy. Not working on that anymore I no longer have the need to actually have a database of my customers’ computers. On the other side, I’m having quite a few concerns regarding my choice of MongoDB as a data backend.

So MongoDB seems to be great for rapid prototyping: it allows you to work without having to define a schema you might not know beforehand, and it still allows you decent performance by declaring indexes — which seems to be the one feature that most NoSQL-era databases seem to ignore, if not shun, altogether. But even that seems to lose its traction in front of things like Amazon’s SimpleDB, that I’m currently using for tinderbox log analysis with quite a bit of success.

This is not much because, once you actually finish with the design, the database is solid as a rock, but rather because using MongoDB actually involves quite a bit of work in administration that, honestly, I find hard to justify even for a pet project of mine. First of all there are the dependencies, which include v8 (but that seems to be the case for execjs as well); then there is the fact that MongoDB was not really designed to work in an Unix environment.

This seems to be another constant of NoSQL approaches: not only they laugh at the whole SQL idea (which served us for scores of years, and seems to still serve us fine for the most part), but also at other Unix conventions, such as the FHS or syslog, or simply the format of command-line parameters.

More importantly to me, though, is the bad attitude that the upstream developers have, toward proper versioning and compatibility. Why do I say this? Well, you probably remember my rant about their Ruby bindings and the way they mixed two extensions’ in the same git repository. That’s just the tip of the iceberg.

My monster above actually used Mongoid for the Rails support – Durran did a terrific job with the project! – but I’ve been keeping on version 2, rather than 3, because the new version is Ruby 1.9 only (and JRuby in 1.9-mode, but that’s beside the point). I was looking forward for 3 anyway because it’s based not on the original gems, but on new code written by Durran, who has much more clue about Ruby. While waiting, though, I lost interest in MongoDB due to the way the original gems became even worse than they were before — and all of this is not caused by Durran, but it’s despite his huge effort to make MongoDB work nice in Ruby.

What happened? Gentoo currently provides the bson gem, and the version-synced mongo gem, up to version 1.6.2 — after that, they broke their API — and not by calling the new version 2.0, or 1.7… it’s 1.6.3, followed by 1.6.4! This wouldn’t be that bad if upstream actually fixed this quickly by either releasing a new version with the API that were removed, so we could just ignore the two broken release, or at least release a 1.7 so that the new software could use that, and we could slot it, while keeping the old software on version 1.6.2 (“revoking” the other two).

But upstream seem to have no intention to continue that way; so right now Gentoo looks like lagging behind on three gems (bson, mongo and mongoid) due to this problem. Why mongoid? Well, Durran couldn’t make a new release of 2 working with the new bson gem because they removed something he was using. And since he’s pouring all his work into version 3, he’s not interested in trying to fix a bigger screwup from upstream — so the version of mongoid that is not in Gentoo, and likely will never be, is just changing the gem’s dependencies to require bson 1.6.2 and nothing later.

I’m sorry but if this is the way MongoDB is developed, I have no intention to put my own data to risk with it. Tomorrow I’ll probably export all the data from my current MongoDB database into some XML file, so I can access it and then I’ll just stop the app and remove the MongoDB instance altogether.

On test verbosity (A Ruby Rant, but not only)

Testing of Ruby packages is nowadays mostly a recurring joke to the point I didn’t feel I could add much more to my last post — but there is something I can write about, which is not limited to Ruby at all!

Let’s take a step back and see what’s going on. In Ruby you have multiple test harnesses available, although the most commonly used are Test::Unit (which was part of Ruby 1.8 and is available as a gem in 1.9) and RSpec (version 1 before, version 2 now). The two of them have two very different defaults for test output: the former uses, by default, a quiet “progress” style output (one dot per test passed), the latter uses a verbose “documentation” style. But both of them can use the other style as well.

Now, we used to just run tests through rake, using the Rakefile coming with the gem, or the tarball, or the git snapshot, to run tests and build documentation, since both harnesses have very tight integrations with it — but since the introduction of Bundler, this starts to get more troublesome than just writing the test commands out in the ebuild explicitly.

Today I went to bump a package (origin, required for the new Mongoid), and since I didn’t wire in the tests last time (I was waiting for a last minute fix), I decided to do so this time. It should be of note that while, when using rake to run the tests, you’re almost entirely left to the upstream’s preference on what style to use for the tests, it’s tremendously easy to override that in the ebuild itself.

The obvious question is then “which of the two styles should ebuild use for packaged gems?” — The answer is a bit complex, which is why this post came up. Please also note that this does not only apply to Ruby packages, it should be considered for all packages where the tests can be silent or verbose.

On a default usage, you only care about whether the tests fail or pass — if they fail, they’ll give you as much detail as you need, most of the times at least, so the quiet (progress) style is a very good choice: it’s less output, which means smaller logs, less context, and it’s a win-win. For the tinderbox, though, I found out that having more verbose output is useful: to make sure that other tests are not being skipped, to make sure that the build does not get stuck, and if it does, where it got stuck.

Reconciling these two requirements is actually easier than one might think at first because it’s an already solved problem: both perl.eclass and, lately, cmake-utils.eclass support one variable, TEST_VERBOSE which can be set to make the tests output more details while running. So right now origin is the first Ruby ebuild supporting that variable, but talking about this with Hans we came to the conclusion we need a wrapper to take care of that.

So what I’m working on right now is a few changes to the Ruby eclasses so that we support TEST_VERBOSE and NOCOLOR (while the new analysis actually strips out most of the escape codes it’s easier if it doesn’t have to do that!) with something such as ruby-ng-rspec call (which can also check whether rspec is properly in the dependencies!). Furthermore I plan on changing ruby-fakegem.eclass so that it also allows to switch between the current default test function (which uses rake), one that uses RSpec (adding the dependency!), and one that uses Test::Unit. Possibly, after this, we might want to do the same thing with the API documentation building, although that might be trickier — in general though having the same style of API docs installed, instead of using each project’s own, might be a good idea.

So if your package runs tests, please try supporting TEST_VERBOSE, and if it can optionally use colors, make sure it supports NOCOLOR as well, thanks!