Okay, here I am ranting once again about Ruby. And yes, I do rant a lot about Ruby even though I like the language a lot; my beef is usually (and in this case as well) about the various implementations.
For one of my job tasks right now I have to parse a CSV file and produce a series of graphs; while any language would work just as well for producing this kind of graph, I wanted to use Ruby because that’s my usual language of choice, and it also provides all the interfaces and libraries I needed, or so I thought. Reality smacked me down hard.
First problem: the speed. I already know that Ruby processing isn’t exactly fast; luckily there’s JRuby to solve that, usually. With the exception of the startup overhead, JRuby can process data much faster, and thus would be the candidate of choice for producing a less rough cut of the data, so my first choice was to use JRuby to read and condition the input. I had, though, the bad idea of wanting to try fastercsv to read the data, which as the name suggests should be faster (given it’s going to handle over 20MB of raw data, it seemed like a good idea)… one ebuild (using ruby-ng of course) later, I discovered that the test suite of the library fails under JRuby (I guess I should report it, but I’m not sure to whom).
Okay, second choice: Ruby 1.9, which is faster than 1.8 in many ways, and includes fastercsv bundled in the standard library. At that point, I thought I could also draw the graphs directly in Ruby, with gruff, which works quite nicely. Unfortunately gruff requires rmagick, which is not ported to Ruby 1.9, at least in Gentoo (the ebuild is probably too complex to write supporting both with the current infrastructure; maybe porting to ruby-ng works, but I’ll have to find more time to try that as well).
Finally, I’m left with Ruby 1.8, which could do what I want but which I’m sure will be the slowest option out there, and I’m not sure I want to try it. I really hope that Hans and Alex can find time to help me validate the new ebuilds, so that we can actually move to using them; maybe that will give enough support for Ruby 1.9 and for JRuby, and we could finally get a single implementation with all the features people are going to need (speed and library support).
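For reference, this is roughly what reading the data looks like with the `csv` library from the standard library on Ruby 1.9 or later. Take it as a minimal sketch only: the column names, the sample rows and the `column_values` helper are made up for illustration, since the real file's layout isn't shown here.

```ruby
require 'csv'

# Hypothetical sample data; the real input has a different layout.
SAMPLE = <<DATA
timestamp,value
1,10.5
2,11.5
3,8.0
DATA

# Parse CSV text that has a header row, convert numeric fields,
# and return the values of one named column as an array.
def column_values(csv_text, column)
  table = CSV.parse(csv_text, :headers => true, :converters => :numeric)
  table.map { |row| row[column] }
end

values = column_values(SAMPLE, 'value')
average = values.inject(:+) / values.size
puts "rows=#{values.size} average=#{average}"
```

For a file in the 20MB range, `CSV.foreach(path, :headers => true)` reads one row at a time instead of loading the whole file into memory, which is probably the better fit.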
I don’t see what you’re trying to say. A test fails under JRuby and Gentoo has some problems with Ruby. How does this live up to “something is fundamentally wrong with ruby”?
For drawing the graphs you could take a look at one of the Google Charts ruby wrappers around.
I know exactly how you feel. For my job, I’ve recently worked on a script that uses FasterCSV but takes an hour and a half to run. I wanted to use JRuby or 1.9 but the script is part of a Rails project with very many dependencies. Even if I could convince it to only run with compatible dependencies, it wouldn’t be worth the time and effort. I’ll have to put aside some time for a proper migration. I also tried using threads (I have a quad core) and wondered why it actually took longer, shortly before remembering that 1.8 can only run on a single core. Bugger.
@Betelgeuse I’d rather not depend on Google for this kind of thing, honestly; it really goes against my way of thinking :(
@James Ruby 1.9 also runs on a single core _most_ of the time; it can only run threads in parallel as long as there are I/O-bound threads, or threads running code from extensions (and thus not interpreted). Only JRuby is properly multithreaded.
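A rough way to see the I/O-bound case for yourself is the sketch below. It assumes `sleep` is a fair stand-in for a thread waiting on I/O, and `run_waits` is a made-up helper, not something from any library.

```ruby
require 'benchmark'

# Run `count` simulated I/O waits of `seconds` each, either one after
# another or in separate threads, and return the elapsed wall-clock time.
def run_waits(count, seconds, threaded)
  Benchmark.realtime do
    if threaded
      threads = (1..count).map { Thread.new { sleep(seconds) } }
      threads.each { |t| t.join }
    else
      count.times { sleep(seconds) }
    end
  end
end

serial   = run_waits(4, 0.2, false)
threaded = run_waits(4, 0.2, true)
puts format('serial: %.2fs, threaded: %.2fs', serial, threaded)
```

The waiting threads overlap, so the threaded run finishes in about the time of a single wait. Swap the `sleep` for a pure-Ruby computation and the gain should mostly disappear on 1.9, which is exactly the problem.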
In my experience, nothing really beats the speed of Ada for processing large input quickly. A compiled language designed from the beginning with correctness in mind. The only typesafe language understood by GCC.