The Mono problem

In the past I have said that I find C# a very nice language, and I still maintain this position: the C# language, taken alone, is probably one of my favourites. This does not mean that I intend to rewrite my ruby-elf suite in C#, but I like it, and if I had to write a GUI application for the GNOME desktop that did not rely on any particular library, I’d probably choose C# and Mono.

Why do I say “not relying on any particular library”? Well, today I wanted to try writing some C# bindings for PulseAudio, to use them in a project that was proposed to me (and that I no longer think is really feasible), so I went to read the documentation from the Mono project. It took me a while to digest it; even though I had some experience writing bindings for Ruby with my “Rust”, and a long time ago I worked on implementing Python extensions to a program, the C# way of writing bindings escaped me for quite a while.

In all the interpreted languages I know, to write bindings you start from the language the interpreter is written in (usually C) and then define an interface for the language that calls into “native” functions, which in turn can call the library you want to bind. This is how Ruby bindings are written, and how Rust works: it tells the Ruby interpreter that there is a class with a certain name, and it defines the class’s methods through C functions that are called back; then it takes care of marshalling and translating parameters around.

The C++ bindings work in a slightly different way: since C++ can be described, from one point of view, as a superset of C, and the design of the language allows very high type compatibility between the two, including the calling conventions, you usually write C++ class interfaces that wrap calls to C functions. It’s a completely different paradigm from Ruby and Python, but it works because there is really no interpreter or library barrier between C and C++ after all.
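To make the C++ style concrete, here is a minimal sketch of a class interface wrapped around plain C functions. The class name CString is made up for illustration, and the C standard library stands in for the library being bound; the point is only that the C declarations come straight from the headers and no marshalling layer is needed.

```cpp
#include <cstdlib>   // std::malloc, std::free: C functions, declared by the C header
#include <cstring>   // std::strlen, std::memcpy
#include <new>       // std::bad_alloc
#include <string>

// Hypothetical wrapper (not from any real library): owns a C string
// allocated with malloc() and releases it with free().
class CString {
public:
    explicit CString(const char* text)
        : length_(std::strlen(text)),
          data_(static_cast<char*>(std::malloc(length_ + 1))) {
        if (data_ == nullptr) throw std::bad_alloc();
        std::memcpy(data_, text, length_ + 1);  // copy the trailing NUL too
    }
    ~CString() { std::free(data_); }

    // The raw pointer has a single owner, so copying is disabled.
    CString(const CString&) = delete;
    CString& operator=(const CString&) = delete;

    std::size_t length() const { return length_; }
    std::string str() const { return std::string(data_, length_); }

private:
    std::size_t length_;  // declared first so it is initialized before data_
    char* data_;
};
```

The compiler checks every call against the declarations in the C headers, which is exactly the safety net that goes away once the interface has to be re-described by hand.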

Considering how Mono is implemented, I sincerely expected writing bindings to be a mix of these two methods; it seems instead that it’s almost entirely a C++-style bindings interface. But with a nasty twist: in C++ you get direct access to the interface of the library you’re writing bindings for (its headers) through direct inclusion and possibly an extern "C" block; with C# you don’t have that at all, as far as I can see.

This means that you have to describe all the interfaces inside the C# code, and then write the marshalling code that translates the parameters from C# objects to C types. The way the functions are loaded is similar to the standard dynamic linker interface (dlopen()), with all the problems connected to that: C++ interfaces are almost impossible to load, and you have to get the parameters just right; if you don’t, it’s a catastrophe waiting to happen. And this is even trickier than loading libraries with pure dynamic linking.
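For those who never had to fight with it, this is roughly what the dlopen() style looks like from C++. It is a sketch under a few assumptions: a POSIX system, strlen chosen only as a symbol that is certain to be loaded, and a made-up helper name strlen_via_dlsym. Note that the function pointer type is written by hand; nothing compares it with the real declaration.

```cpp
#include <dlfcn.h>    // dlopen, dlsym, dlclose, dlerror (POSIX)
#include <cstddef>
#include <stdexcept>

// The signature we *claim* the symbol has. Neither the compiler nor the
// dynamic linker verifies it; a wrong guess is undefined behaviour.
using strlen_fn = std::size_t (*)(const char*);

// Hypothetical helper: looks up strlen by name at run time and calls it.
std::size_t strlen_via_dlsym(const char* text) {
    // A null file name asks for a handle to the running program itself,
    // which also covers the libraries it was started with (libc included).
    void* handle = dlopen(nullptr, RTLD_LAZY);
    if (handle == nullptr) {
        const char* msg = dlerror();
        throw std::runtime_error(msg != nullptr ? msg : "dlopen failed");
    }

    void* symbol = dlsym(handle, "strlen");
    if (symbol == nullptr) {
        dlclose(handle);
        throw std::runtime_error("symbol strlen not found");
    }

    // The only "type check" is this cast, which trusts us blindly.
    auto fn = reinterpret_cast<strlen_fn>(symbol);
    std::size_t result = fn(text);
    dlclose(handle);
    return result;
}
```

On older glibc versions this needs to be linked with -ldl. If the alias had declared the return type as a 32-bit integer instead of size_t, the code would still compile and load; it would just be wrong on 64-bit systems.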

The most obvious problem, for those who have had to deal with dlopen() idiosyncrasies, is that C# has fixed-size basic integer types. This is good, but not all software uses fixed-size parameters: off_t, size_t and long are all types that change their size depending on the operating system and the hardware architecture. off_t is not even the same size on a single system, because it depends on whether large file support is enabled or not, at least on glibc-based systems (most BSDs should have it fixed-size, but that’s about it). Since the C# code is generic and is supposed to be built just once, it’s not easy to identify the right interface for the system. You cannot just wrap it in #ifdef as you would with C++ bindings.
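A small sketch of what the native side sees. The function name abi_report is made up, and off_t assumes a POSIX system; the string it returns differs between builds, which is the whole point. On a typical 64-bit Linux long is 8 bytes, on 32-bit it is 4, and building with -D_FILE_OFFSET_BITS=64 on 32-bit glibc changes the size of off_t.

```cpp
#include <sys/types.h>  // off_t (POSIX)
#include <cstddef>      // std::size_t
#include <cstdint>      // std::int32_t
#include <string>

// Hypothetical helper: reports the sizes, in bytes, that this particular
// build sees. Only int32_t is the same everywhere; the rest is decided
// by the platform and the compiler flags.
std::string abi_report() {
    return "int32_t=" + std::to_string(sizeof(std::int32_t)) +
           " long=" + std::to_string(sizeof(long)) +
           " size_t=" + std::to_string(sizeof(std::size_t)) +
           " off_t=" + std::to_string(sizeof(off_t));
}
```

A native binding gets these answers from the compiler for free at build time; a C# declaration has to commit to int or long in the source, once, for every platform.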

But this is not the only problem; the one I noticed first, again caused by the lack of access to the headers through #include, is that constants might not be constant. Since I wanted to write bindings for PulseAudio, I started with the “simple” interface, and ran into the problem right away with the pa_stream_direction_t enumeration. While I could create my own C# enumeration for it, I have no guarantee that Lennart will not decide to change the values. That would change the ABI of the package, but for both native implementations and Ruby-style bindings a rebuild is enough; there is usually no need to change the sources when this kind of change is made. For the C# bindings, you’d have to adapt the bindings every time.
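To illustrate the difference, here is a sketch with a stand-in enumeration. Everything prefixed with demo_ or DEMO_ is made up, and so is kCopiedPlayback; it is shaped like pa_stream_direction_t but is not the PulseAudio header. The native binding refers to the constant by name, while the hand-copied value plays the role of what a C# binding has to hard-code.

```cpp
// Stand-in for what the library header would declare (hypothetical names).
extern "C" {
typedef enum demo_stream_direction {
    DEMO_STREAM_NODIRECTION,
    DEMO_STREAM_PLAYBACK,
    DEMO_STREAM_RECORD
} demo_stream_direction_t;
}

// Native-style binding: uses the name, so a rebuild against a changed
// header picks up the new value without touching the sources.
int playback_from_header() { return DEMO_STREAM_PLAYBACK; }

// What a binding without header access has to do: copy the number.
constexpr int kCopiedPlayback = 1;

// With the header in reach, the copy can at least be checked at build
// time. Without the header there is nothing to compare against.
static_assert(kCopiedPlayback == DEMO_STREAM_PLAYBACK,
              "hand-copied constant no longer matches the header");
```

If the library author reordered the enumerators, playback_from_header() would follow after a rebuild and the static_assert would stop the build; a copied constant in C# would silently keep the old value.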

This is probably why there aren’t many C# bindings for libraries that don’t use GObject (for those that do, you get the gapi tool, which takes care of most of the work for you), and why the Banshee project preferred to reimplement TagLib in C# rather than bind it (indeed, binding TagLib is far from an easy thing, as I can testify first hand). I’m afraid this is the “Achilles’ heel” of C# and Mono. While this makes it less likely to produce the “java crap effect” that I wrote about almost four years ago (jeez, has so much time passed?), it does reduce the ability of Mono to blend in with the rest of a modern Linux system.

The effort required to maintain proper bindings for C projects in C# is even higher than the effort to reimplement the same code, and that is really a big problem for blending in; the only thing this approach works well for is portability, especially portability to Microsoft platforms. This is all fine and dandy if you need your software to bend that way, and I have to say I do know a couple of cases where that might be one of the important factors, but it comes at a pretty high price. On the other hand, Ruby, Python, Vala and Java seem to have better chances for integration. All in all, I’m sincerely unimpressed. I like the language; I just don’t like the runtime or the way it’s going.

9 thoughts on “The Mono problem”

  1. I still wonder why you don’t give up with mono and play just with vala if you want just the, arguably ugly, syntax and not the, arguably quirky, runtime. Still I find both mono/C# and jvm/java pretty much the same…

  2. A few points: 1) Mono didn’t choose this binding interface. It was invented by Microsoft. Mono uses it because they want source compatibility with MS.Net even for P/Invoke-using code. Mono has an alternative binding system, too, which works like the binding systems for most dynamic libraries. It’s mostly used in low-level library code, though. (mono_add_internal_call) 2) SWIG can generate C# bindings. I have no idea of their quality, though.

  3. Sebastian, yes indeed it’s Microsoft’s, but I sincerely expected, or at least hoped, that it had an alternative method (as you say it has, but I haven’t seen it documented). It’s quite obvious that the problems I listed here are not much of a concern for Microsoft, because of the way software works in Microsoft’s own land: there are at best two architectures (i386 and “x64”) and most of the types have the same size on both anyway; they tend to have more stable ABIs too, especially because you don’t export everything by default. As for SWIG, I don’t really want to see it any day soon; I tried it for Ruby bindings but it has basically no support for producing a _proper_ object-oriented interface, which makes it pretty useless for mostly object-oriented languages like Ruby and C#.

  4. A lot of the native code MS wanted to run/script from .net was COM, and .net has good support for auto-generating COM bindings. I wonder whether that glue generating thing can work with non-COM interfaces expressed using IDL. Or maybe I don’t know what I’m talking about.

  5. Nice catch about COM; I guess the point is, like GObject, that they provide enough introspection to be able to deal with them in a programmatic way. Petteri, last I knew Smoke was _very_ tight-fitted around Qt; I considered hacking that for writing the Ruby bindings for TagLib but soon gave up. Of course that was a couple of years ago with KDE 3.

  6. There is no easy solution to getting types from the system, short of pre-processing and parsing the data structures to figure out what exactly you meant by “off_t”. But if you *really* want that, you can build that yourself. It just does not belong in the language.

  7. I agree it does not belong in the language, which is exactly why I was expecting something “external” to handle that from the outside. And I understand pretty well the complexity of finding out from inside C# what the C off_t type means. On the other hand, this is probably a huge obstacle to Mono adoption, and finding a solution, whatever that is, might be a good way to win more developers…

  8. Someone else already mentioned SWIG, so I just wanted to agree. SWIG works great for both C and C++ bindings, for .NET, Mono, Java, etc. Definitely worth learning.
