What I’d like from my blog

My blog is, at this point, a vital part of my routine. I use my blog to write about my personal projects, I write about the non-restricted parts of my jobs, and I write about the work that goes into Gentoo Linux and other projects I follow.

I have over 2100 posts over time, especially thanks to the recent import of my original blog on Gentoo infrastructure. I don’t really know if it’s a lot, but sometimes Typo seems to miss something about it. Unfortunately I’m also running an older version of Typo, because I haven’t switched that virtual server to Ruby 1.9 yet as one of my customers is running a version of Radiant that is not going to work otherwise.

Said customer also bitched so hard, and screamed not to keep the site on my server, but as it happens the new webmasters that are supposed to pick up the website, and should have been cheaper and faster than me… have been working since June and still delivered nothing. Hopefully they’ll be done soon and I can kick said customer from the server.

Anyway, at this point there are a few things that I’d like to get out of my blogging platform in the future, which might require me to fork Typo and create my own version. That version is likely going to be stripped down, as there are many features added here that I really don’t care about. The short URLs are one example: I might just export them, as I think I used them at some point, but then I would handle them through mod_rewrite rather than on the Rails side.

So let’s see what I don’t like about the current Typo I’m using:

  • The database access is more than a bit messed up; it probably has to do with the fact that upstream only cares about MySQL, while I want to run it on PostgreSQL, and this causes more than a couple of problems. Have you noticed that sometimes my posts end up password-protected? Well, what happens is that the settings for the single posts are serialized in YAML and de-serialized, but sometimes something bad happens and the YAML becomes invalid, causing the password-protection to kick in. I know there is an ActiveRecord extension that allows for key-value pairs to be stored in PostgreSQL-specific column types instead of having to (de)serialize them all the time, but again, this wouldn’t be something upstream would use.
  • Alternatively I’ve been toying with the idea of using MongoDB as a backend. Even with the issues that I have pointed out before, I think it might work well for a blog, especially since then the comments would be tied to the post itself, rather than living in the current connected tables.
  • There is a problem with the tags handling, again something upstream doesn’t seem to care about – at some point I remember reading they were mostly interested in making every single word in a post a tag to cross-connect posts with the same word; it’s one of the reasons why I’m not sure if I want to update it. If I change the title of one of the tags to make it more descriptive, then I edit a post that has that tag, it creates one more tag for each word in that title, instead of preserving the older tags. I really should clean up the tags I got right now.
  • I would also like that when I get to the “new post” page it would create it already and then get me back to editing it — this is important to me because sometimes if I have to restart Chromium, or suspend the laptop, something goes very wrong and it creates multiple drafts for the same post. And cleaning them up is a long task.
  • A better implementation of notification for new posts, and integration with Flattr, would be also very good. While IFTTT makes it easy to post the new entries to Twitter and LinkedIn, its lack of integration for Flattr is a major pain, and the fact that right now, to use auto-submit, I have to duplicate part of the content in the HTML of the pages, is also a problem. So being able to create a “Flattr thing” the moment when I actually post something would be a major plus for me.
  • Since I’m actually quite the paranoid, another thing I would like to have would be either two-factor authentication with Google Authenticator on a cellphone, or (actually, in addition to) certificate-based authentication for the admin interface. Having a safe way to make sure that I’m the only one logging in would make me remove some of the administrative interface rules on ModSecurity, which would in turn let me write posts from public WiFi networks sidestepping the problem I posted about the other day.
  • Scheduled posting. This used to be supported, and although it’s been completely broken for years at this point, it was very useful to me a long time ago since I would just write a bunch of posts and schedule them to be posted once a day. I suppose this should now be changed so that the planned posts are only actually posted when a process is called to check which of them have now “elapsed”… but again this is something that I’d like to have, and you readers would probably enjoy, as it would probably make for more and better content overall.
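The scheduled-posting idea in the last point could look roughly like the sketch below. It is only an illustration under assumptions: the `Post` struct and the `publish_elapsed` helper are hypothetical names, not Typo’s actual model or API, and the posts live in memory rather than in a database. The point is simply that nothing is published until a separate process (say, a cron job) asks which scheduled posts have elapsed.

```ruby
# Hypothetical in-memory stand-in for a blog post; not Typo's real model.
Post = Struct.new(:title, :publish_at, :published)

# Mark as published every scheduled post whose time has elapsed,
# and return the posts that were published by this call.
# Meant to be invoked periodically by an external process (e.g. cron).
def publish_elapsed(posts, now = Time.now)
  due = posts.select { |post| !post.published && post.publish_at <= now }
  due.each { |post| post.published = true }
  due
end

posts = [
  Post.new("Monday rant", Time.utc(2012, 9, 3, 8), false),
  Post.new("Tuesday rant", Time.utc(2012, 9, 4, 8), false),
]

# Pretend the periodic job runs on Monday at noon (UTC).
published = publish_elapsed(posts, Time.utc(2012, 9, 3, 12))
puts published.map(&:title).inspect
```

Running the same call again an hour later would publish nothing new, which is what makes it safe to trigger from a timer.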

I definitely do not want to go with WordPress, I just wish I had the time to write my own Typo fork, and make it more usable for what I do, rather than hoping that the upstream development for Typo does not go in a direction I don’t like at all. Maybe somebody else has the same requirements and would like to join me in this project; if so, send me an email… maybe it’ll finally be the time I decide to start on the fork itself.

MongoDB, after a Meetup

Yesterday night I had my first glimpse of what can be called, stretching it, a social life. I was invited by one of the 10gen people to go to a MongoDB meetup/party held in honour of the release of MongoDB 2.2; I came to know about this after I ranted on Twitter about how the new point release for 2.0 is still using the outdated Boost filesystem version 2.

So first of all thanks to 10gen for the drinks and chips; the event was fun and it was nice to meet people from around here, although I was surprised by the seeming scarcity of Linux users… too bad that Davide couldn’t be there because he was working…

You probably remember I was quite a bit disappointed in MongoDB when I tried it out, not for its own design but for some nuances that decide whether it is a good fit in a Unix environment, one of which (the syslog support) is finally solved in the new release, 2.2. But most importantly because of the way the Ruby driver has been managed.

Now I didn’t expect much of a technical meeting since it was intended as a party, but the main answer I have been given for the trouble (well, they asked, when I said I was sceptical) was that “the drivers are being worked on just recently”. I have to say that if it wasn’t for Matthew, I would have felt a bit disappointed in the handling… but since he at least engaged in a technical discussion, if only a short one, I actually got a possible reason for the situation. The “just recently” meant that the drivers have been brought to feature parity… and it seems like somebody interpreted that as “drop features that are not available elsewhere” without considering the API compatibility issue.

Also kudos for him taking time to look whether the Boost 1.50 compatibility was fixed in the new release or not (seems like it is, so that’s good for Gentoo).

The feeling I got is that my doubts on MongoDB are not really just me: I heard many people saying that they like the idea, and they are using it for side projects but… they don’t feel ready to prefer it over traditional solutions at the moment for their main systems. Some are having trouble with the way it handles small amounts of data (too big an overhead), others with the way it handles complex queries (spidermonkey is too slow, v8 is too unstable). So I guess I’m not more sceptical than the average.

My personal complaint right now? I doubt I’ll have time to try the one experiment I’d like to see: converting Typo to use MongoDB. With my usual rate of writing, this blog now has over 1740 articles, with enough comments already in – and I don’t get hit by spam so much thanks to my ruleset – but it still is sluggish from time to time. Given the kind of data that a blog is composed of, I think there is something to gain by going with an object tree model instead of flat tables… but I don’t have the time to work on this any time soon. Maybe when I have time to play.
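To make the “object tree instead of flat tables” idea a bit more concrete, here is a minimal sketch using plain Ruby hashes. Nothing in it talks to MongoDB or reflects Typo’s real schema; the field names and the `embed_comments` helper are assumptions made up for the illustration. It only shows the reshaping step: comments that would sit in their own table, pointing back at the article, get folded into the article document.

```ruby
# Relational shape: comments live in their own table and reference the article.
articles = [{ id: 1, title: "MongoDB, after a Meetup" }]
comments = [
  { id: 10, article_id: 1, author: "reader-a", body: "Nice writeup" },
  { id: 11, article_id: 1, author: "reader-b", body: "What about v8?" },
]

# Document shape: fold each article's comments into the article itself,
# dropping the keys that only existed to join the two tables.
def embed_comments(articles, comments)
  by_article = comments.group_by { |comment| comment[:article_id] }
  articles.map do |article|
    embedded = by_article.fetch(article[:id], []).map do |comment|
      comment.reject { |key, _value| key == :id || key == :article_id }
    end
    article.merge(comments: embedded)
  end
end

documents = embed_comments(articles, comments)
puts documents.inspect
```

With this shape, rendering a post and its comments is a single document fetch rather than a join, which is the gain hoped for above; whether it would actually cure the sluggishness is untested.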

All in all, I still think that if there is an alternative (when the use cases make sense of course) to your average RDBMS, MongoDB at the moment is your best bet, which might say something about what the remaining bets are.

Bye bye, Mongo

It was not even six months ago that I wrote about my monster: the web app that used Mongo and sent PDFs to be printed to my home server over the IPv6 Internet. But still, the monster is now dead; I’m now considering whether I want to unleash the sources to the world or whether I should just bury it altogether.

What’s going on? Well, on one side I stopped working as an MSP, as it doesn’t pay enough for me to actually pay taxes as well as live in Italy. Not working on that anymore, I no longer have the need to actually have a database of my customers’ computers. On the other side, I’m having quite a few concerns regarding my choice of MongoDB as a data backend.

So MongoDB seems to be great for rapid prototyping: it allows you to work without having to define a schema you might not know beforehand, and it still allows you decent performance by declaring indexes — which seems to be the one feature that most NoSQL-era databases seem to ignore, if not shun, altogether. But even that seems to lose its traction in front of things like Amazon’s SimpleDB, that I’m currently using for tinderbox log analysis with quite a bit of success.

This is not so much because, once you actually finish with the design, the database is solid as a rock, but rather because using MongoDB actually involves quite a bit of work in administration that, honestly, I find hard to justify even for a pet project of mine. First of all there are the dependencies, which include v8 (but that seems to be the case for execjs as well); then there is the fact that MongoDB was not really designed to work in a Unix environment.

This seems to be another constant of NoSQL approaches: not only do they laugh at the whole SQL idea (which served us for scores of years, and seems to still serve us fine for the most part), but also at other Unix conventions, such as the FHS or syslog, or simply the format of command-line parameters.

More important to me, though, is the bad attitude that the upstream developers have toward proper versioning and compatibility. Why do I say this? Well, you probably remember my rant about their Ruby bindings and the way they mixed two extensions in the same git repository. That’s just the tip of the iceberg.

My monster above actually used Mongoid for the Rails support – Durran did a terrific job with the project! – but I’ve been staying on version 2, rather than 3, because the new version is Ruby 1.9 only (and JRuby in 1.9-mode, but that’s beside the point). I was looking forward to 3 anyway because it’s based not on the original gems, but on new code written by Durran, who has much more of a clue about Ruby. While waiting, though, I lost interest in MongoDB due to the way the original gems became even worse than they were before; and all of this is not caused by Durran, but happens despite his huge effort to make MongoDB work nicely in Ruby.

What happened? Gentoo currently provides the bson gem, and the version-synced mongo gem, up to version 1.6.2. After that, they broke their API, and not by calling the new version 2.0, or 1.7… it’s 1.6.3, followed by 1.6.4! This wouldn’t be that bad if upstream actually fixed this quickly, either by releasing a new version that restores the APIs that were removed, so we could just ignore the two broken releases, or at least by releasing a 1.7 so that the new software could use that, and we could slot it, while keeping the old software on version 1.6.2 (“revoking” the other two).

But upstream seems to have no intention of going that way; so right now Gentoo looks like it is lagging behind on three gems (bson, mongo and mongoid) due to this problem. Why mongoid? Well, Durran couldn’t make a new release of 2 working with the new bson gem because they removed something he was using. And since he’s pouring all his work into version 3, he’s not interested in trying to fix a bigger screwup from upstream, so the version of mongoid that is not in Gentoo, and likely will never be, is just changing the gem’s dependencies to require bson 1.6.2 and nothing later.
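The reason an API break hidden in 1.6.3 hurts so much is that the usual “pessimistic” dependency constraint treats it as a harmless bug-fix release. The sketch below uses RubyGems’ own `Gem::Version` and `Gem::Requirement` classes to show that; the two constraints are assumptions chosen for illustration, not the actual requirement strings found in any mongoid release.

```ruby
require "rubygems"

last_good = Gem::Version.new("1.6.2")
broken    = %w[1.6.3 1.6.4].map { |version| Gem::Version.new(version) }

# A typical constraint: any 1.6.x release is assumed to be API-compatible.
pessimistic = Gem::Requirement.new("~> 1.6.0")

# The workaround described above: same series, but nothing after 1.6.2.
pinned = Gem::Requirement.new("~> 1.6.0", "<= 1.6.2")

puts pessimistic.satisfied_by?(last_good)                 # 1.6.2 is accepted
puts broken.all? { |v| pessimistic.satisfied_by?(v) }     # so are 1.6.3 and 1.6.4
puts pinned.satisfied_by?(last_good)                      # 1.6.2 still accepted
puts broken.none? { |v| pinned.satisfied_by?(v) }         # broken releases rejected
```

Had the break been released as 1.7, the first constraint alone would have kept old software safe, which is exactly what slotting in Gentoo would have relied on.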

I’m sorry but if this is the way MongoDB is developed, I have no intention of putting my own data at risk with it. Tomorrow I’ll probably export all the data from my current MongoDB database into some XML file, so I can access it, and then I’ll just stop the app and remove the MongoDB instance altogether.

Ruby pains, May 2012 edition

While I’m still trying to figure out how to get the logs analysed for the tinderbox, I’ve been spending some time to work on Gentoo’s Ruby packaging again, which is something that happens from time to time as you know.

In this case the spark is the fact that I want to make sure that my systems work with Ruby 1.9. Mostly, this is because the blog engine I’m using (Typo) is no longer supported on Ruby 1.8, and while I did spend some time to get it to work, I’m not interested in keeping it that way forever.

I started by converting my box database so that it would run on Ruby 1.9. This was also particularly important because Mongoid 3 is also not going to support Ruby 1.8. This was helped by the fact that finally bson-1.6 and mongo-1.6 are working correctly with Ruby 1.9 (the previous minor, 1.5, was failing tests). Next step of course will be to get them working on JRuby.

Unfortunately, while now my application is working fine with Ruby 1.9, Typo is still a no-go… reason? It still relies on Rails 3.0, which is not supported on 1.9 in Gentoo, mostly due to its dependencies. For instance it still wants i18n-0.5, which doesn’t work on 1.9, and it tries to get ruby-debug in (which is handled in different gems altogether for Ruby 1.9, don’t ask). The end result is that I’ve still not migrated my blog to the server running 1.9, and I’m not sure when and if that will happen, at this point… but things seem to fall into place, at least a bit.

Hopefully, before end of the summer, Ruby 1.9 will be the default Ruby interpreter for Gentoo, and next year we’ll probably move off Ruby 1.8 altogether. At some later point, I’d also like to try using JRuby for Rails, since that seems to have its own advantages — my only main problem is that I have to use JDBC to reach PostgreSQL, as the pg gem does not work (and that’s upsetting as that is what my symbol collision analysis script is using).

So, these are my Ruby 1.9 pains for now, I hope to have better news in a while.

Small talk about my experience with MongoDB

I’m interrupting the series of Ruby rants (somewhat) to talk about something that is slightly related but not too much. I’ve already written about my plan of writing a system to index and manage the boxes that I manage at various customers’ places. This system is half written for me to have something neater than GIT and HTML to manage the data, and half to get up-to-date with modern Rails development.

One of the decisions I made was to try for once a NoSQL approach. I got many questions about why I did it that way, and the answer was for me pretty simple actually: I didn’t want to spend time designing the relationships first and writing the code later. The nice part about using MongoDB was that I’m able to add and remove attributes when I like, and still query for them without fetching huge amounts of data to process Ruby-side.
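As a rough picture of what “add and remove attributes when I like, and still query for them” means, here is an in-memory sketch with plain Ruby hashes. It does not use MongoDB or any of its drivers; the records and the `where` helper are hypothetical, and in the real thing the filtering would happen inside the database rather than Ruby-side, which is precisely the attraction.

```ruby
# Each box is a free-form document; attributes differ from record to record.
boxes = [
  { hostname: "office-pc-1", customer: "ACME",    ram_gb: 4 },
  { hostname: "laptop-7",    customer: "ACME",    battery_cycles: 312 },
  { hostname: "server-2",    customer: "Initech", ram_gb: 16, raid: true },
]

# Return the documents matching every key/value pair in the criteria.
# Documents that lack a queried attribute simply do not match.
def where(documents, criteria)
  documents.select do |document|
    criteria.all? { |key, value| document[key] == value }
  end
end

puts where(boxes, customer: "ACME").map { |box| box[:hostname] }.inspect
puts where(boxes, raid: true).map { |box| box[:hostname] }.inspect
```

Note that the `raid` attribute was never declared anywhere: it exists on one record only, and the query still works without a schema change.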

Honestly, after seeing the kind of queries that Rails concocts to get me data that is normalised, but requires multiple many-to-many relationships to be resolved, I’m quite sure that it can’t be worse with MongoDB than it is with PostgreSQL, for this kind of data.

Of course it’s not all positive sides with MongoDB; beside the obnoxious requirement of a JavaScript engine (and there aren’t many), which is linked to the use of v8 (which is not ABI-stable, and thus each update is a rebuild and restart), I had some bad experience yesterday, but not something extreme, luckily. On the one server I use MongoDB on, I recently configured the locale settings in the environment — but I forgot to re-execute locale-gen so the locale was broken; as it turns out, Boost throws an exception in that case, and MongoDB does not try to manage it, aborting instead.

I guess the main blame here is on the init script that does not report an execution failure: the service is noted as started, and then crashed, which is technically correct, but not what you expect the init script to tell you. I guess if I have more time I should try to get more Unix-style daemon support to mongod so that it can be integrated better with our start-stop-daemon rather than left with the hacky script that it’s using now.

Add to that the missing support for the syslog protocol and you can probably figure out that the thing that worries me the most about MongoDB is the usual sense of “let’s ignore every previous convention” which seems to come with NoSQL projects. Luckily at least this project feels technically, rather than politically, driven, which means I should be able to get them to at least partially implement those features that would make it a “good citizen” in a Unix environment.

Sigh, I really wish I had a bit more time to hack at it, since I know the few rough spots I found should be easily polished with a week or so of work; unfortunately I guess that’ll have to wait. Maybe after my coming US trip I’ll be able to take a couple of weeks to dedicate to myself rather than customers.

Patching up a monster of Frankensteinian proportions

I’ve spent the first week of the year on vacation with some friends. The second week of the year has been mixed between going on with the jobs I should have gotten working already, fighting a bad case of cold, and getting insulted by a customer of mine for actually having gotten real vacation time for once in two years. More to the point: said customer doesn’t actually pay me overtime, or actually at all for the support.

Tonight I wanted to relax and think about my own needs. Not personal needs, alas, but at least needs for my work to become easier. Since I haven’t made any progress at all regarding RT I decided to look into a different need of mine: cataloguing customers’ computers.

I originally simply kept a file listing the computers I set up for customers — then I started getting more customers, and sometimes getting a computer back after many months since last time. And I started forgetting which computer was which. Nowadays I have 79 computers on my “database” (which is just a git repository with a bunch of HTML files as well as lshw dumps), without counting those that have been dismissed.

To recognise the computers, I started printing labels with a QR Code on them, which contains the URL of the computer’s HTML file on my website (password-protected). My original method required me to feed a multi-label A4 sheet into my laser printer and print one, two or three labels out on that… but it turned out to be a waste of time and of money in sheets, given that most of the time I ended up wasting half of it, as the printer refused to print aligned more than half the time. I’ve since bought a Dymo label printer, which is why you’ll find their drivers in Portage maintained by yours truly — the nice thing about Dymo’s label printers is that their drivers are fully GPL-2, while as far as I can tell both Zebra and Brother have binary blobs, that make them unsuitable for use on amd64-based systems.

As you can tell, there are a few things that I did in Gentoo that relate to this little “database” of mine: the lshw fixes to try getting it back into SysRescueCD (it’s still not there — and I lost the password for my account on their forums), the Dymo drivers noted above, and dev-ruby/barby which is a quite interesting library that allows you to generate almost any kind of barcode. And now it’s time of MongoDB Ruby libraries as I’m trying to write an actual web application to manage the “database” and make it a real database.

Today’s achievement is big: I finally got Rails (3.1) to play nice with MongoDB. Not using MongoMapper, though: as I already wrote, I would prefer not to have much to do with its author. But thanks to Mauro I got pointed at Mongoid, which is a much better developed alternative.

Okay, sure, there are quite a few kinks to iron out in the packaging of Mongoid – for instance the fact that the gem packages a Rakefile that relies on a (missing) Gemfile, or the fact that two out of three rspec targets in said Rakefile fail, one of which by crashing the interpreter – but at least their unit-tests work, and the code works as intended when loaded up. Which is more than I can say about MongoMapper.

Oh and it doesn’t seem to require extra code to be added just to work correctly with Passenger.

The only problem I have now is fixing up one side issue: how do I print the labels once I load this into my webserver? I could download the PDF I use to print the label and then print that… but it’s a bit of a time-waster. Of course both the server and Yamato (where the label printer is connected) are IPv6-enabled and… well, the IPP protocol used by CUPS is fine to be used over the internet, as it can use SSL encryption. Which yes, means that I’ll be setting up a web application… that calls home to print a label, how crazy is that?

My only issue with this is that I’d rather not install cups on the webserver (especially since there is currently no way to just build the client side of it, which would be the only part of it I would need on the server; yeah I know, it’s funky), so I can’t just call lpr mylabel.pdf… and as far as I can tell, the only way to access IPP from Ruby is one of the many CUPS library bindings available as gems, which are all 0.0 versions, and do not inspire confidence in the least. Since IPP is based off HTTP, I would have expected more implementations of it, to be honest.

Possibly, it should be possible to extend some HTTP Ruby library to send IPP requests as well; for what I’m concerned, I’d just need the “Print-Job” method to be implemented, which would allow me to send the PDF file to be printed with the default options. I guess I’ll resolve that bit once I’m done with the rest of my application, though.
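For reference, this is roughly what a bare-bones Print-Job request could look like when built by hand, following the binary encoding in the IPP specifications (RFC 8010 and RFC 8011): a fixed header, the three mandatory operation attributes, an end-of-attributes tag, and then the document itself, all sent as an HTTP POST with the `application/ipp` content type. It is a sketch under assumptions, not a tested client: the helper names, the printer URI, host and path are all placeholders, it sends no optional attributes, and it does not parse the printer’s response.

```ruby
require "net/http"

# Delimiter and value tags from the IPP encoding (RFC 8010).
OPERATION_ATTRIBUTES_TAG = 0x01
END_OF_ATTRIBUTES_TAG    = 0x03
TAG_URI                  = 0x45
TAG_CHARSET              = 0x47
TAG_NATURAL_LANGUAGE     = 0x48
PRINT_JOB                = 0x0002 # operation-id of Print-Job

# Encode one attribute: value tag, name length, name, value length, value.
def ipp_attribute(tag, name, value)
  [tag, name.bytesize].pack("Cn") + name.b + [value.bytesize].pack("n") + value.b
end

# Build the request body: header, operation attributes, then the document.
def build_print_job(printer_uri, document, request_id = 1)
  body = [1, 1, PRINT_JOB, request_id].pack("CCnN") # IPP version 1.1
  body << [OPERATION_ATTRIBUTES_TAG].pack("C")
  body << ipp_attribute(TAG_CHARSET, "attributes-charset", "utf-8")
  body << ipp_attribute(TAG_NATURAL_LANGUAGE, "attributes-natural-language", "en")
  body << ipp_attribute(TAG_URI, "printer-uri", printer_uri)
  body << [END_OF_ATTRIBUTES_TAG].pack("C")
  body << document.b
end

# POST the request over HTTPS. Host, port and path are placeholders;
# this helper is defined but deliberately not called in the sketch.
def send_print_job(host, port, path, body)
  Net::HTTP.start(host, port, use_ssl: true) do |http|
    http.post(path, body, "Content-Type" => "application/ipp")
  end
end

# Hypothetical printer URI and a fake document, just to show the encoding.
request = build_print_job("ipps://printer.example/printers/label", "%PDF-1.4 fake")
puts request.bytesize
```

Everything beyond this (job names, media options, status codes in the reply) would need the rest of the specification, which is probably why the existing gems wrap the CUPS library instead.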

Gems make it a battle between the developer and the packager

It is definitely not a coincidence that whenever I have to dive into Gentoo Ruby packaging I end up writing a long series of articles for my blog that should have the tag “Rant” attached, until I end up deciding that it’s not worth it and I should rather do something else.

The problem is that, as I said many times before (and I guess the Debian Ruby team agrees as well), the whole design of RubyGems makes it very difficult to package them properly, and at the same time provides the developers with enough concepts to make the packaging even more tricky than it would be merely due to the format.

As the title says, for one reason or another, RubyGems’s main accomplishment is simply to pit extensions’ developers and distributions’ packagers against each other, with the former group insisting on doing things “fun”, and the latter on doing things “right”. I guess most of the members of the former group also never tried managing a long term deployment of their application outside of things like Heroku (that are paid to take care of that).

And before somebody tells me I’m being mean by painting the developers puny with their concept of fun, it’s not my fault if in the space of an hour after tweeting a shorter version of the first paragraph of this post, two people told me that “development is fun”… I’m afraid for most people that’s what matters, it being fun, not reliable or solid…

At any rate… even though as we speak nobody expressed interest (via flattr) on packaging of the Ruby MongoDB driver that I posted about yesterday, I started looking into it (mostly because I’m doing another computer recovery for a customer and thus I had some free time in my hands while I waited for antivirus to complete, dd_rescue to copy data over, and so on so forth).

I was able to get some basic gems for bson and mongo working, which were part of the hydra repository I noted, but the problems started when I looked into plucky which is the “thin layer” used by the actual ORM. It is not surprising that this gem also is “neutered” to the point of being useless for Gentoo packaging requirements, but there are more issues. First of all it required one more totally new gem to be packaged – log_buddy which also required some fixes – that is not listed in the RubyGems website (which is proper, if you consider that the tests are not executable from the gem file), but most importantly, it relied on the matchy gem.

This is something I already had to deal with, as it was in another long list of dependencies last year or the one before (I honestly forgot). This gem is interesting: while the package is dev-ruby/matchy, it was only available as a person-specific gem in Gemcutter: jnunemaker-matchy and mcmire-matchy; the former is the original (0.4.0), while the latter is a fork that fixed a few issues, among which there was the main problem: jnunemaker-matchy is available neither as a tarball nor as a git tag.

For the package that originally required matchy for us (dev-ruby/crack), mcmire’s fork worked quite well, and indeed it was just a matter of telling it to use the other gem for it to work. That’s not the case for plucky: even though jnunemaker didn’t release any version of matchy in two years, it only works with his version of matchy. Which meant packaging that one as well, for now.

Did I tell you that mcmire’s version works with Ruby 1.9, while jnunemaker’s doesn’t? No? Well, I’m telling you now. Just so you know, almost in 2012, this is a big deal.

And no, there is not a 0.4.0 yet. Two years after release. The code stagnated since then.

Oh and plucky’s tests will fail depending on how Ruby decides to sort a Hash’s keys array. Array comparison in Ruby is (obviously) ordered.
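A tiny illustration of that last failure mode, not taken from plucky’s test suite: two hashes with the same content compare equal, but the arrays of their keys do not unless the order happens to match. On Ruby 1.9 and later the order is the insertion order; on 1.8 it was unspecified, so a test written this way passes or fails depending on the interpreter.

```ruby
first  = { name: "box", owner: "me" }
second = { owner: "me", name: "box" }

# Hash equality ignores ordering; Array equality does not.
hashes_equal = first == second
keys_equal   = first.keys == second.keys

# Sorting both sides makes the assertion independent of key order.
sorted_equal = first.keys.sort == second.keys.sort

puts hashes_equal
puts keys_equal
puts sorted_equal
```

The robust way to write such a test is to compare sorted arrays, as in the last comparison, or to compare the hashes directly.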

Then you look at the actual mongo_mapper gem that was the leaf of the whole tree… and you find out that running the tests without bundler fixing the dependencies is actually impossible (due to the three versions of i18n that we have to allow side-installation of). And the Gemfile, while never declaring dependencies on the official Mongo driver (it gets it through plucky), looks for bson_ext (the compiled C extension, which in Gentoo was not going to exist, since it’s actually installed by the same bson package; I’ll have to create a fake gemspec for it just so it can be satisfied).
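A “fake gemspec” in this sense is nothing more than a specification with a name and a version and no files, so that dependency resolution finds something called bson_ext. The sketch below builds one with RubyGems’ `Gem::Specification`; the version number, summary and author are placeholder assumptions, and this is not how the Gentoo Ruby eclasses actually generate such a file.

```ruby
require "rubygems"

# Placeholder specification: it ships no code, it only satisfies the name.
spec = Gem::Specification.new do |s|
  s.name    = "bson_ext"
  s.version = "1.5.2" # assumption: kept in sync with the installed bson gem
  s.summary = "Placeholder; the C extension is installed by the bson package"
  s.authors = ["placeholder"]
  s.files   = []
end

# to_ruby renders the specification as the Ruby source of a .gemspec file.
puts spec.to_ruby
```

The output of `to_ruby` is what would be written into the specifications directory next to the real bson one.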

And this actually brings us to a different problem as well: even though plucky has been updated (to version 0.4.3) in November, it still requires series 1.3 of the Mongo driver. Version 1.4.0 was released in September, and we’re at version 1.5.2.

And I didn’t name SystemTimer gem, which is declared a requirement during development (but not by the gem of course, since you’re not supposed to run tests there) only for Ruby 1.8 (actually only for mri18, what about Ruby EE?) which lacks an indication of a homepage in the RubyGems website….

I love Ruby. I hate its development.

Gems using hydra development model

As I said on twitter I think I have a part of me that’s pretty much a masochist, as I’ve had the nice idea of trying out MongoDB for a personal experiment (a webapp to manage clients’ hardware configuration registration, stuff for my usual “day job”), so I started looking into packaging Ruby (and Rails) support for it.

Luckily Rails 3 started abstracting the ORM as well, allowing an almost drop-in replacement of ActiveRecord with a MongoDB adapter instead… there is a nice guide that actually tells you almost all that is needed in the basic case (of course it could be more automated, but that’s beside the point here). The core of that guide is the mongo_mapper gem, which depends on a thin layer that is further built on top of the driver, depending in turn on bson, which provides a pure Ruby interface as well as a JRuby version… there is a separate gem for the C-based extension… if your head is spinning or you are wondering if I became a linkspammer, I can’t blame you.

The end result is that at the very least I got to package

  • bson
  • mongo
  • plucky
  • mongo_mapper
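Read bottom-up, the chain described above also gives the order in which the packages have to be written, since each ebuild needs its dependencies in place first. A small sketch of that ordering follows; the dependency table is transcribed from the description above, and the `packaging_order` helper is a hypothetical name for a plain depth-first walk.

```ruby
# Direct dependencies, as described in the text above.
DEPENDENCIES = {
  "mongo_mapper" => ["plucky"],
  "plucky"       => ["mongo"],
  "mongo"        => ["bson"],
  "bson"         => [],
}

# Depth-first walk: emit every dependency before the gem that needs it.
def packaging_order(name, dependencies, seen = [])
  return seen if seen.include?(name)
  dependencies.fetch(name).each do |dependency|
    packaging_order(dependency, dependencies, seen)
  end
  seen << name
end

order = packaging_order("mongo_mapper", DEPENDENCIES)
puts order.inspect
```

For a four-gem chain this is overkill, but the same walk scales to messier dependency webs.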

Not too shabby, but I got into way worse situations before, such as the dependency web of Bones and its extensions. What I didn’t expect was the way these packages are developed rather than packaged.

The gems, as usual, are shallow, and contain no tests, Rakefile, or any other useful files that we need to package them properly as ebuilds. So I had to rely on the GitHub repositories; thankfully we do that mostly transparently in Ruby eclasses (as it’s way too common for us to have to rely on snapshots). Unfortunately, while for plucky and mongo_mapper the situation was clear (the homepage of the gems points to GitHub, and the two have separate repositories), the situation gets complex for the mongo driver and related gems.

First, let’s ignore the issue with bson and bson_ext (the C-based extension)… that is stuff we can generally deal with in Gentoo itself (the C extension will be built for all the targets supporting it; JRuby will get its own extension, and all in the same package). The problem is that, after digging through the MongoDB website – finding the list of repositories is definitely not easy – the repository for both mongo and bson (and bson_ext!) is the same… and not in the way that Rails 2.3 was all in the same repository (with each gem having its own subdirectory), but merging the content of the three gems in the same structure. Saying that it’s messy is not enough.

There is no doubt in my mind that I’ll handle that just fine at some point in the future, but it really makes me wonder how it is possible for them to consider this a good design and development practice. It’s a mystery. For now I don’t think I’ll spend much more time on this issue as I have other tasks for my job to take care of, which moves this to the back-burner…

And on that topic I’d like to try again an experiment to gauge the interest in packaging these gems; my blog is Flattr enabled – even though lately AdSense on this and Autotools Mythbuster is getting me more money than Flattr – and this post is as well. If you’ve got an account (or feel like opening one), and you’re interested in seeing ebuilds for the Mongo Ruby driver in Gentoo, give a flattr to this post. If I see that count increasing in the next two days I’ll use the Christmas weekend to work on it.