The Mono problem

In the past I have said that I find C# a very nice language, and I still maintain this position: the C# language, taken alone, is probably one of my favourites. This does not mean that I intend to rewrite my ruby-elf suite in C#, but I like it, and if I had to write a GUI application for the GNOME desktop, not relying on any particular library, I’d probably choose C# and Mono.

Why do I say “not relying on any particular library”? Well, today I wanted to try writing some C# bindings for PulseAudio, to make use of it for a project that was proposed to me (and that I now don’t think is really feasible), and I went to read the documentation from the Mono project. It took me a while to digest it: even though I have some experience with writing bindings for Ruby through my “Rust”, and a long time ago I worked on implementing Python extensions for a program, the C# method of binding software really escaped me for quite a while.

In all the interpreted languages I know, to write bindings you start from the language the interpreter is written in (usually C) and define an interface in that language which calls into “native” functions, which in turn can call the library you want to bind. This is how Ruby bindings are written, and how Rust works: it tells the Ruby interpreter that there is a class with a given name, defines its methods through C functions that are called back, and then takes care of marshalling and translating parameters back and forth.

C++ bindings work in a slightly different way: since C++ can be described, from one point of view, as a superset of C, and the design of the language allows very high type compatibility between the two, including the calling conventions, you usually write C++ class interfaces that wrap around C function calls. It’s a completely different paradigm from Ruby and Python, but it works because there is really no interpreter or library barrier between C and C++ after all.

Considering how Mono is implemented, I sincerely expected binding writing to be a mix of these two methods; it seems instead that it’s almost entirely a C++-style binding interface. But with a nasty twist: in C++ you get direct access to the interface of the library you’re writing bindings for (its headers) through direct inclusion and, where needed, an extern "C" block; with C# you don’t have that at all, as far as I can see.

This means that you have to describe all the interfaces inside the C# code, and then write the marshalling code that translates the parameters from C# objects to C types. The way the functions are loaded is similar to the standard dynamic linker interface (dlopen()), with all the problems connected to that: C++ interfaces are almost impossible to load, and you have to get the parameters just right; if you don’t, it’s a catastrophe waiting to happen. And this is even trickier than linking libraries with pure dynamic linking.
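To give an idea, here is a minimal sketch of what such a binding looks like for PulseAudio’s “simple” API, using the P/Invoke mechanism (the DllImport attribute). The native signatures are my own transcription of pulse/simple.h; note how every declaration is written by hand, with nothing at build time verifying it against the actual header:

using System;
using System.Runtime.InteropServices;

static class PulseSimpleNative
{
    // pa_sample_spec re-declared by hand; the layout has to match the
    // C struct field by field, with no compile-time check that it does
    [StructLayout(LayoutKind.Sequential)]
    public struct SampleSpec
    {
        public int format;    // pa_sample_format_t
        public uint rate;     // uint32_t
        public byte channels; // uint8_t
    }

    // pa_simple* pa_simple_new(const char*, const char*, pa_stream_direction_t,
    //     const char*, const char*, const pa_sample_spec*,
    //     const pa_channel_map*, const pa_buffer_attr*, int*);
    [DllImport("libpulse-simple.so.0")]
    public static extern IntPtr pa_simple_new(
        string server, string name, int dir, string dev, string streamName,
        ref SampleSpec ss, IntPtr map, IntPtr attr, out int error);

    // int pa_simple_write(pa_simple*, const void*, size_t, int*);
    [DllImport("libpulse-simple.so.0")]
    public static extern int pa_simple_write(
        IntPtr s, byte[] data, UIntPtr bytes, out int error);

    // void pa_simple_free(pa_simple*);
    [DllImport("libpulse-simple.so.0")]
    public static extern void pa_simple_free(IntPtr s);
}

The library named in the DllImport attribute is loaded at runtime, dlopen()-style; if any of these declarations drifts out of sync with the installed library, you find out through a crash or corrupted parameters, not a compile error.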

The most obvious problem, for those who have had to deal with dlopen() idiosyncrasies, is that C# has fixed-size basic integer types. This is good in itself, but not all software uses fixed-size parameters: off_t, size_t and long are all types that change their size depending on the operating system and the hardware architecture; off_t is not even always the same size on the same system, because it depends on whether large file support is enabled, at least on glibc-based systems (most BSDs should have it fixed-size, but that’s about it). Since the C# code is generic and is supposed to be built just once, it’s not easy to identify the right interface for the system. You cannot just #ifdef it around like you would with C++ bindings.
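To make this concrete with a hypothetical example (libfoo and foo_seek are made up for illustration): size_t can at least be mapped to UIntPtr, which is pointer-sized like size_t itself on the ABIs Mono runs on, as I did in the sketch above; but for a function taking an off_t there are two equally plausible declarations, and only one of them matches any given build of the native library:

using System;
using System.Runtime.InteropServices;

static class OffTProblem
{
    // matches off_t on a 32-bit glibc system built WITHOUT large file support
    [DllImport("libfoo")]
    static extern int foo_seek(IntPtr handle, int offset);

    // matches off_t WITH large file support, or on a 64-bit system;
    // the C prototype looks identical in both cases, the ABI does not
    [DllImport("libfoo", EntryPoint = "foo_seek")]
    static extern int foo_seek64(IntPtr handle, long offset);
}

Pick the wrong one and the parameters get marshalled with the wrong sizes at call time, with no compile-time or load-time check to catch it.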

But this is not the only problem; the one I noticed first, again caused by the lack of access to the #include headers, is that constants might not be constant. Since I wanted to write bindings for PulseAudio, I started with the “simple” interface, and I hit the problem right away with the pa_stream_direction_t enumeration. While I could create my own C# enumeration for it, I have no guarantee that Lennart won’t decide to change the values; and while that would change the ABI of the package, for both native implementations and Ruby-style bindings a rebuild is enough, with usually no need to change the sources when this kind of change is made. For the C# bindings, you’d have to adapt the sources every time.
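The C# mirror of that enumeration would look something like this; the values are hand-copied from pulse/def.h as they stand today, which is exactly the fragility I’m describing:

// hand-copied from pulse/def.h; if upstream renumbers these, this
// declaration silently goes stale, while a C or C++ binding would pick
// the new values up from the header at the next rebuild
enum PaStreamDirection
{
    NoDirection = 0, // PA_STREAM_NODIRECTION
    Playback    = 1, // PA_STREAM_PLAYBACK
    Record      = 2, // PA_STREAM_RECORD
    Upload      = 3  // PA_STREAM_UPLOAD
}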

This is probably why there aren’t many C# bindings for libraries that don’t use GObject (for those, you have the gapi tool, which takes care of most of the work for you), and why the Banshee project preferred to reimplement TagLib in C# rather than bind it (indeed, binding TagLib is far from an easy thing, as I can testify first-hand). I’m afraid this is the Achilles’ heel of C# and Mono. While this makes it less likely to produce the “java crap effect” I wrote about almost four years ago by now (jeez, has so much time passed?), it does reduce the ability of Mono to blend in with the rest of modern Linux systems.

The effort required to maintain proper bindings for C projects in C# is even higher than that of reimplementing the same code, and that is really a big problem for blending in; the one thing this works well for is portability, especially portability to Microsoft platforms. This is all fine and dandy if you need your software to bend that way, and I have to say I do know a couple of cases where that might be one of the important factors, but it comes at a pretty high toll. On the other hand, Ruby, Python, Vala and Java seem to have better chances for integration. All in all, I’m sincerely unimpressed: I like the language, I just don’t like the runtime or the way it’s going.

The destiny of Ruby-Elf

After my last, quick post, I decided to look more into what I could do to save the year of work I put into Ruby-Elf. After fighting a bit to understand how the new String class from Ruby 1.9 works, I got the testsuite to pass on Ruby 1.9, and cowstats to report the correct data as well.

Unfortunately, this broke Ruby 1.8 support, as far as I can see because IO#readpartial does not work that well on 1.8, even for on-disk files; the same happens with JRuby.

After getting Ruby 1.9 to work, the obvious next task was to make cowstats work with multiple threads. The final idea is to have a -j parameter akin to make’s, but for now I only wanted to create one thread per file to scan. In theory, given native threading, all the threads would execute at once, scheduled by the system scheduler, saturating the 8 cores, with 800% CPU time as the theoretical maximum.

Unfortunately reality soon kicked in, and the ruby19 process limited itself to 100%, which means a single core out of eight, which also means no parallel scan. A quick glance through the sources shows that while YARV (the Ruby 1.9 VM) lists three possible methods to achieve multithreading, only one, the second, is currently implemented. The first method is the old one, green threading, which basically means simulated threads: the code never executes in parallel, but uses an event-loop-like construct to switch execution between different “threads”. The second method makes use of a giant lock, in this case called the Giant VM Lock (GVL), the equivalent of Python’s GIL (Global Interpreter Lock): the threads are scheduled by the operating system, which allows for fairer scheduling among execution threads, but still only one thread per VM can execute at any given time. The third method is the one I was hoping for, and allows multiple threads to execute at the same time, on different cores, within the same virtual machine; instead of a single lock on the whole VM, the locks are spread around the code to protect just the resources each thread needs.

I also checked this out on JRuby, to compare; unfortunately the JRuby in Portage cannot handle the code as I changed it to work with Ruby 1.9, so I have been unable to actually benchmark a working run of cowstats with it; but I could see that JRuby’s CPU usage spiked at 250%, which means it is at least able to execute threads quite independently, and proves that Ruby itself can be parallelised up to that point just fine.

So what is the fuss about Ruby 1.9’s new native threading support, if multiple threads cannot be executed in parallel? Well, it still allows a single process to spawn multiple VMs and execute parallel threads on them, isolated from one another. Which happens to be useful for Ruby on Rails web applications. If you think about it, the extra complexity added to deal with binary files is also there to address some interesting problems that come up in environments where multiple encodings are often used, which is: web applications. Similarly, the JRuby approach, which is very fast once the JVM is loaded, works fine for applications that start up once and then keep processing for a long time, which again fits web applications and little else.

I’m afraid that what we’re going to see in the near and not-so-near future is Ruby losing its general-purpose appeal and focusing more and more on the web application side of the fence. Which is sad, since I really cannot think of anything else I’d like to rewrite my tools in, besides, maybe, C# (if it could be compiled to ELF; I should try Vala for that). I feel like my favourite general-purpose language is slipping away, and that I should stop worrying and working on it.

The end of Ruby-Elf?

One thing that has always bothered me about Ruby-Elf and its tools (cowstats, the linking collision script and the rest) is that they don’t really make good use of the 8-way system I have as my main workstation. This is not really good, considering it also means that the cowstats run after each emerge on my main system blocks on a single core, and I don’t even want to try it on the tinderbox as a whole. It also means I cannot replace scanelf with a similar script in Ruby (neither is parallelised, but the C-based scanelf is obviously faster).

To address this problem, I considered moving to JRuby as the interpreter: it already uses native threading with the 1.8 syntax, and it would have been a decently good way to get cowstats multithreaded; the problem is that its startup time is considerable, which wasn’t very good to begin with. So I decided to bite the bullet and try Ruby 1.9 to see how it performed.

Besides some slight syntax changes, I started having problems with Ruby 1.9 and Ruby-Elf right away. The testsuite is still written using Test::Unit, because RSpec didn’t suit my needs well at all, and for that reason I prepared an ebuild (masked for now; remember I hate overlays unless very necessary, and I’ll hopefully go deeper into that issue in the next few weeks) for the improved test-unit extension (not using gem, as usual). It should work with Ruby 1.8 too, although I found some test failures in the test framework itself with both 1.9 and 1.8.

The next problem was with the readbytes.rb interface I am using, which is gone in 1.9 on the grounds that the interface is already implemented in the IO class. Unfortunately the only interface that comes close is IO#readpartial, which is not actually the same thing and has quite a different meaning in my opinion; but let’s not split hairs over that, it could be fixed quite easily.

What became a big problem are the changes to the String class, which is now encoding-aware and expects each element in it to be a character. While this is tremendously good, since String now works more like a string than like a character array (as it is in the underlying C language), there is no parallel ByteArray class for handling sequences of bytes, nor a binary file interface in IO. This is a huge deal, because the whole of Ruby-Elf does little more than binary file parsing.

Now, to be honest, I didn’t spend too much time running through the changelogs of Ruby 1.9 to identify all the changes; but since the changes seem to be quite huge by themselves, and I could get simpler binary file handling even in PHP than what I’ve seen in the two hours I spent trying to force Ruby 1.9 into submission, I’m afraid Ruby-Elf is going to stagnate, and I’ll end up looking at a different language to implement the whole thing in.

Luca suggested (as usual) that I look at Python, but I don’t really like Python much myself; while the forced indentation may help make the code more readable, take a look at Brian’s (ferringb) code and you will never again say that Perl is the only SSL-based language (sorry Brian, but you know that I find your code quite encrypted). I’m sincerely considering the idea of moving to C#, given that the whole Mono runtime adds less startup overhead than Java/JRuby would.

Or I could go for the alternative route and just try to write it in Objective C.

For A Parallel World. Case Study n.6: parallel install versus install hooks

Service note: I’m starting to fear for one of my drives; as soon as my local shop restocks the Seagate 7200.10 drives I’ll go get two more to replace the 250GB ones, and put them under thorough tests.

I’ve already written in my series about some issues related to parallel install. Today I wish to show a different type of parallel install failure, which I found while looking at the logs of my current tinderbox run.

Before starting, though, I wish to explain one thing that might not be tremendously obvious to people not used to working with build systems. While parallel build failures are most of the time related to non-automake-based build systems, which fail to properly express dependencies, or in which the authors mistook one construct for another, parallel install failures are almost always related to automake. This is due to the fact that almost all custom-tailored build systems don’t allow parallel install in the first place: for most of them, the install target is one single serial rule, which always works fine even when using multiple parallel jobs, but obviously slows down modern multicore systems. Since automake supports parallel install targets, which makes installing packages quite a bit faster, it also adds the complexity that can cause parallel install failures.

So let’s see what the failure I’m talking about is; the package involved is gmime, with the Mono bindings enabled; Gentoo bug #248657, upstream bug #567549 (thanks to Jeffrey Stedfast, who quickly solved it!). The log of the failure is the following:

Making install in mono
make[1]: Entering directory `/var/tmp/portage/dev-libs/gmime-2.2.23/work/gmime-2.2.23/mono'
make[2]: Entering directory `/var/tmp/portage/dev-libs/gmime-2.2.23/work/gmime-2.2.23/mono'
make[2]: Nothing to be done for `install-exec-am'.
test -z "/usr/share/gapi-2.0" || /bin/mkdir -p "/var/tmp/portage/dev-libs/gmime-2.2.23/image//usr/share/gapi-2.0"
test -z "/usr/lib/pkgconfig" || /bin/mkdir -p "/var/tmp/portage/dev-libs/gmime-2.2.23/image//usr/lib/pkgconfig"
/usr/bin/gacutil /i gmime-sharp.dll /f /package gmime-sharp /root /var/tmp/portage/dev-libs/gmime-2.2.23/image//usr/lib
 /usr/bin/install -c -m 644 'gmime-sharp.pc' '/var/tmp/portage/dev-libs/gmime-2.2.23/image//usr/lib/pkgconfig/gmime-sharp.pc'
 /usr/bin/install -c -m 644 'gmime-api.xml' '/var/tmp/portage/dev-libs/gmime-2.2.23/image//usr/share/gapi-2.0/gmime-api.xml'
Failure adding assembly gmime-sharp.dll to the cache: Strong name cannot be verified for delay-signed assembly
make[2]: *** [install-data-local] Error 1
make[2]: Leaving directory `/var/tmp/portage/dev-libs/gmime-2.2.23/work/gmime-2.2.23/mono'
make[1]: *** [install-am] Error 2
make[1]: Leaving directory `/var/tmp/portage/dev-libs/gmime-2.2.23/work/gmime-2.2.23/mono'

To make it much more readable, the command and the error line in the output are the following:

/usr/bin/gacutil /i gmime-sharp.dll /f /package gmime-sharp /root /var/tmp/portage/dev-libs/gmime-2.2.23/image//usr/lib
Failure adding assembly gmime-sharp.dll to the cache: Strong name cannot be verified for delay-signed assembly

So the problem comes from the gacutil program, which in turn comes from Mono, and which seems to be working on the just-installed file. But was it installed? If you check the complete log above, there is no install(1) call for the gmime-sharp.dll file that gacutil complains about, and indeed that is the problem. Just like I experienced earlier, Mono-related error messages need to be interpreted to be meaningful. In this case, the actual error should be a “File not found” over /var/tmp/portage/dev-libs/gmime-2.2.23/image//usr/lib/mono/gmime-sharp/gmime-sharp.dll.

The rule that causes this is, as make reports, install-data-local, so let’s check that in the mono/Makefile.am file:

install-data-local:
        @if test -n '$(TARGET)'; then \
          if test -n '$(DESTDIR)'; then \
            echo "$(GACUTIL) /i $(ASSEMBLY) /f /package $(PACKAGE_SHARP) /root $(DESTDIR)$(prefix)/lib"; \
            $(GACUTIL) /i $(ASSEMBLY) /f /package $(PACKAGE_SHARP) /root $(DESTDIR)$(prefix)/lib || exit 1; \
          else \
            echo "$(GACUTIL) /i $(ASSEMBLY) /f /package $(PACKAGE_SHARP) /gacdir $(prefix)/lib"; \
            $(GACUTIL) /i $(ASSEMBLY) /f /package $(PACKAGE_SHARP) /gacdir $(prefix)/lib || exit 1; \
          fi; \
        fi

So it’s some special code that is executed to register the Mono/.NET assembly with the rest of the system. It does not look broken at first glance, and indeed this is a very subtle build failure, because it does not look wrong at all unless you already know automake well enough, although the build logs help a lot in finding it out.

The gmime-sharp.dll file is created as part of the DATA class of files in automake, but install-data-local does not depend on it directly, so its execution order is not guaranteed by automake at all. On the other hand, the install-data-hook rule is called after install-data has completed, and thus after the DATA files have actually been installed. So the solution is simply to replace -local with -hook. And there you go.

Next…

Working with .NET, OpenOffice, and Mono

So I did hint at a job I was commissioned that has to do with Mono. It’s actually quite a simple thing that, by itself, has no relationship with Mono, to be honest.

Basically, this friend of mine is self-employed, and sometimes has to prepare a quote for customers, to tell them how much a given service would cost. I originally wrote such a piece of software for him a few years ago, using Java, producing an HTML page that could be printed for the customers, and archiving the data in a Firebird database. Unfortunately he lost the database, and decided not to go on with that approach, since he preferred having the page printed out to file away in a different way.

After a year without my software he decided that yes, it is a good idea to have a database for this, and asked me to rework the software. Since he now runs Vista, I’m tempted to use something different this time, something that also has a more suitable look and feel, which is one thing he complained about before, since the Java application didn’t look native. Of course the choice here is .NET; and as for how to handle the printed result, OpenOffice seems to have everything I need: I just need to generate a document with the right fields filled in, then ask OpenOffice to convert it to PDF for filing or mailing, and so on and so forth. The good thing is that he’s already using OpenOffice.

Now I just need to find out how to interface .NET (and Mono on Linux, for development) with OpenOffice, and see whether I can actually drive it the way I need to. The alternative would be to write macros directly in OpenOffice, but I guess it’d be a bit of a mess for me to write code in that language, while C# I can work with quite nicely nowadays.
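From a first look at the OpenOffice SDK, the CLI bindings to the UNO API seem to be the way to go. The following is only a rough sketch of what the PDF conversion could look like, pieced together from the SDK’s C# examples; I haven’t run it yet, so treat the names and paths as assumptions to verify rather than a working recipe:

using unoidl.com.sun.star.beans;
using unoidl.com.sun.star.frame;
using unoidl.com.sun.star.lang;
using unoidl.com.sun.star.uno;

class OdtToPdf
{
    static void Main()
    {
        // bootstrap() starts, or connects to, a soffice process
        XComponentContext ctx = uno.util.Bootstrap.bootstrap();
        XMultiComponentFactory factory = ctx.getServiceManager();
        XComponentLoader desktop = (XComponentLoader)
            factory.createInstanceWithContext("com.sun.star.frame.Desktop", ctx);

        // load the template document whose fields are to be filled in
        // (the path is just a placeholder)
        XComponent doc = desktop.loadComponentFromURL(
            "file:///tmp/quote.odt", "_blank", 0, new PropertyValue[0]);

        // ... fill in the user fields here ...

        // export through the PDF filter
        PropertyValue filter = new PropertyValue();
        filter.Name = "FilterName";
        filter.Value = new uno.Any("writer_pdf_Export");
        ((XStorable) doc).storeToURL(
            "file:///tmp/quote.pdf", new PropertyValue[] { filter });
    }
}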

I guess that if I can make it generic enough, it’d be a nice thing to have around as Free Software, at least as a sample.

No more breaks

This week I’ve been doing lots and lots of work on lscube, which I wanted to blog about, but I’m afraid I won’t have much time to do that right now. I wanted to take a break this weekend, to relax, rest, and play a bit. Nothing went like I wanted, and I actually think I won’t try to take a break like this anytime soon.

Yesterday almost went like I wanted: I worked a bit on my ruby-bombe project, removed the pending notice on some specs, and actually implemented some functions, although it’s still unable to read. I set up the SysRescueUSB key, but then I had to finish a support request on a Vista laptop. Could be worse, but Vista by itself is a problem for me: I cannot connect Vista laptops to my router directly, they kill it! I have to connect them through one of our laptops (my mother’s or mine) or through Yamato. I have no idea why. Okay, no problem. I finished the cleanup, and the laptop was ready to go.

Today I was woken up very early, since I was to be home by myself, which also called for a good morning of relaxing and playing with the PlayStation… yeah, sure. First my brother-in-law dropped by with my nephew to pick up some of his tools that were left here; then I finally cleared out some tasks for a job I have to do, which called for testing, after lunch.

But while I finished setting up the stuff for this job, I also decided to set up another side job I was commissioned, which I’ll try to blog about at another time, since it’s really interesting to me and I might actually release it as Free Software afterwards. For this job, though, I need Windows, since it’ll have to run on Vista. I have an XP license (not OEM, thus not tied to a box), but it was set up to work on the laptop, since that was the most powerful box I had at the time, and Enterprise was not VT-X capable, so I couldn’t run it there virtually; now the laptop is no longer the most powerful machine at home, and I don’t care much about playing on XP (I only have two games running there, and they should work fine on Wine nowadays), so I wanted to move it out, reclaim the free space on the laptop, and install it on VirtualBox. Unfortunately, as soon as VirtualBox starts, Xorg crashes, and it does not wake up the video card when it starts back up.

Two hours of fiddling later, I got to find out that:

  • gnome-settings-daemon, updated today, does not always start properly; no clue why yet;
  • SDL applications were killing my Xorg; I noticed this with tuxtype before, and the same happened with VirtualBox;
  • the reason why Xorg couldn’t wake up the video card was the framebuffer; since, when SSH is not working, I don’t even use the framebuffer but rather the laptop (I love serial consoles), so I can have cut and paste, I just disabled it and I live happier now;
  • the Xorg crash was in the VidMode calls;
  • the Xorg crash was already reported as FreeDesktop bug #17431;
  • the Xorg crash was caused by SDL bundling its own X11 libraries!

So once again, not following to the letter the policy that tells us to always use system libraries wasted more of my time than it should have. I guess that’s life for you.

On the other hand, I need Mono for the commissioned work I talked about above, but Mono in Portage is in a pretty sorry state at the moment, since compnerd hasn’t bumped anything in quite some time. Today I opened a few bugs around, for Gtk# for instance, and I hope I’ll be able to bump a few things in my overlay and force-feed them to Portage in a week or two.

On a different note, I’m starting to have some hardware cravings; in particular, I currently have no free USB ports, no Ethernet cables, and just one USB-to-serial adapter left, which is very suboptimal. I guess as soon as my job pays me I’ll be getting a self-powered USB hub, so that at least I can replicate a few extra ports. I have a crimping tool for Ethernet cables, but it does not do Cat5e cable, and I wanted to buy Cat5e if I was to buy more new cables…

For A Parallel World. Case Study n.2: misknowing your make rules

Here comes another case study about parallel make failures and fixes. This time I’m going to write about a much less common, and more difficult to understand, type of failure. I have spotted and fixed this failure in gtk# (yes I have it installed).

Let’s see the failure to begin with:

Creating policy.2.4.glib-sharp.dll
Creating policy.2.4.glib-sharp.dll
Creating policy.2.4.glib-sharp.dll
ALINK: error A1019: Metadata failure creating assembly -- System.IO.FileNotFoundException: Could not find file "/var/tmp/portage/dev-dotnet/gtk-sharp-2.10.2/work/gtk-sharp-2.10.2/glib/policy.2.4.glib-sharp.dll".
File name: "/var/tmp/portage/dev-dotnet/gtk-sharp-2.10.2/work/gtk-sharp-2.10.2/glib/policy.2.4.glib-sharp.dll"
  at System.IO.FileStream..ctor (System.String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, Boolean anonymous, FileOptions options) [0x00000] 
  at System.IO.FileStream..ctor (System.String path, FileMode mode, FileAccess access, FileShare share) [0x00000] 
  at (wrapper remoting-invoke-with-check) System.IO.FileStream:.ctor (string,System.IO.FileMode,System.IO.FileAccess,System.IO.FileShare)
  at System.IO.File.OpenRead (System.String path) [0x00000] 
  at Mono.Security.StrongName.Sign (System.String fileName) [0x00000] 
  at System.Reflection.Emit.AssemblyBuilder.Save (System.String assemblyFileName, PortableExecutableKinds portableExecutableKind, ImageFileMachine imageFileMachine) [0x00000] 
  at System.Reflection.Emit.AssemblyBuilder.Save (System.String assemblyFileName) [0x00000] 
  at Mono.AssemblyLinker.AssemblyLinker.DoIt () [0x00000] 
ALINK: error A1019: Metadata failure creating assembly -- System.IO.IOException: Sharing violation on path /var/tmp/portage/dev-dotnet/gtk-sharp-2.10.2/work/gtk-sharp-2.10.2/glib/policy.2.4.glib-sharp.dll
  at System.IO.FileStream..ctor (System.String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, Boolean anonymous, FileOptions options) [0x00000] 
  at System.IO.FileStream..ctor (System.String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, Boolean isAsync, Boolean anonymous) [0x00000] 
  at System.IO.FileStream..ctor (System.String path, FileMode mode, FileAccess access) [0x00000] 
  at (wrapper remoting-invoke-with-check) System.IO.FileStream:.ctor (string,System.IO.FileMode,System.IO.FileAccess)
  at System.Reflection.Emit.ModuleBuilder.Save () [0x00000] 
  at System.Reflection.Emit.AssemblyBuilder.Save (System.String assemblyFileName, PortableExecutableKinds portableExecutableKind, ImageFileMachine imageFileMachine) [0x00000] 
  at System.Reflection.Emit.AssemblyBuilder.Save (System.String assemblyFileName) [0x00000] 
  at Mono.AssemblyLinker.AssemblyLinker.DoIt () [0x00000] 
Creating policy.2.6.glib-sharp.dll
Creating policy.2.6.glib-sharp.dll
Creating policy.2.6.glib-sharp.dll
ALINK: error A1019: Metadata failure creating assembly -- System.IO.IOException: Sharing violation on path /var/tmp/portage/dev-dotnet/gtk-sharp-2.10.2/work/gtk-sharp-2.10.2/glib/policy.2.6.glib-sharp.dll
  at System.IO.FileStream..ctor (System.String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, Boolean anonymous, FileOptions options) [0x00000] 
  at System.IO.FileStream..ctor (System.String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, Boolean isAsync, Boolean anonymous) [0x00000] 
  at System.IO.FileStream..ctor (System.String path, FileMode mode, FileAccess access) [0x00000] 
  at (wrapper remoting-invoke-with-check) System.IO.FileStream:.ctor (string,System.IO.FileMode,System.IO.FileAccess)
  at System.Reflection.Emit.ModuleBuilder.Save () [0x00000] 
  at System.Reflection.Emit.AssemblyBuilder.Save (System.String assemblyFileName, PortableExecutableKinds portableExecutableKind, ImageFileMachine imageFileMachine) [0x00000] 
  at System.Reflection.Emit.AssemblyBuilder.Save (System.String assemblyFileName) [0x00000] 
  at Mono.AssemblyLinker.AssemblyLinker.DoIt () [0x00000] 
Creating policy.2.8.glib-sharp.dll
Creating policy.2.8.glib-sharp.dll
Creating policy.2.8.glib-sharp.dll
ALINK: error A1019: Metadata failure creating assembly -- System.IO.IOException: Sharing violation on path /var/tmp/portage/dev-dotnet/gtk-sharp-2.10.2/work/gtk-sharp-2.10.2/glib/policy.2.8.glib-sharp.dll
  at System.IO.FileStream..ctor (System.String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, Boolean anonymous, FileOptions options) [0x00000] 
  at System.IO.FileStream..ctor (System.String path, FileMode mode, FileAccess access, FileShare share) [0x00000] 
  at (wrapper remoting-invoke-with-check) System.IO.FileStream:.ctor (string,System.IO.FileMode,System.IO.FileAccess,System.IO.FileShare)
  at System.IO.File.OpenWrite (System.String path) [0x00000] 
  at Mono.Security.StrongName.Sign (System.String fileName) [0x00000] 
  at System.Reflection.Emit.AssemblyBuilder.Save (System.String assemblyFileName, PortableExecutableKinds portableExecutableKind, ImageFileMachine imageFileMachine) [0x00000] 
  at System.Reflection.Emit.AssemblyBuilder.Save (System.String assemblyFileName) [0x00000] 
  at Mono.AssemblyLinker.AssemblyLinker.DoIt () [0x00000] 
make[3]: *** [policy.2.4.glib-sharp.dll] Error 1
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

Okay, so there are some failures when calling “alink”; in particular, it reports “sharing violations”. I suppose the wording of the error message is derived from the original .NET, as “sharing violation” is what Windows reports when two applications try to write to the same file at once, or one tries to write to a file that is locked by someone else.
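As a side note, the error is easy to reproduce from C# itself; this little sketch of mine (not related to the build system) opens the same file for writing twice without allowing sharing, which is essentially what the three parallel al invocations end up doing:

using System;
using System.IO;

class SharingViolation
{
    static void Main()
    {
        // the first writer takes the file and refuses to share it
        using (var first = new FileStream("policy.dll", FileMode.Create,
                                          FileAccess.Write, FileShare.None))
        {
            try
            {
                // the second writer is refused; on Mono the resulting
                // IOException carries the same "Sharing violation on
                // path" message seen in the log above
                new FileStream("policy.dll", FileMode.Create,
                               FileAccess.Write, FileShare.None);
            }
            catch (IOException e)
            {
                Console.WriteLine(e.Message);
            }
        }
    }
}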

But I want to put some emphasis on something in particular:

Creating policy.2.4.glib-sharp.dll
Creating policy.2.4.glib-sharp.dll
Creating policy.2.4.glib-sharp.dll
[...]
Creating policy.2.6.glib-sharp.dll
Creating policy.2.6.glib-sharp.dll
Creating policy.2.6.glib-sharp.dll
[...]
Creating policy.2.8.glib-sharp.dll
Creating policy.2.8.glib-sharp.dll
Creating policy.2.8.glib-sharp.dll
[...]
make[3]: *** [policy.2.4.glib-sharp.dll] Error 1
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

As you can see, each policy is reportedly created three times. If you, like me, know what to look for in a parallel make failure, you’ll also notice that there are three different policies being created there. This is quite important and interesting, as it already suggests to an experienced eye what the problem is, but let’s go on step by step.

Once again, we know the software is built with automake, so you wouldn’t expect parallel make failures, at least not from the default rules. But C#/Mono is not one of the languages that automake supports out of the box, which means there are almost surely custom rules involved.

As they are using custom rules, the problem involves knowledge of GNU make rather than automake (or any other make, but let’s assume GNU make for now; it’s the most common in Free Software, after all, for good or bad).

Let’s look for the “Creating” line in the Makefile.am file:

$(POLICY_ASSEMBLIES): $(top_builddir)/policy.config gtk-sharp.snk
        @for i in $(POLICY_VERSIONS); do \
          echo "Creating policy.$$i.$(ASSEMBLY)"; \
          sed -e "s/@ASSEMBLY_NAME@/$(ASSEMBLY_NAME)/" -e "s/@POLICY@/$$i/" $(top_builddir)/policy.config > policy.$$i.config; \
          $(AL) -link:policy.$$i.config -out:policy.$$i.$(ASSEMBLY) -keyfile:gtk-sharp.snk; \
        done

If you’ve had to deal with a similar failure before (as I have), you already know what you’re going to find in that rule: I’m referring to the for loop. It’s a common mistake, for people who don’t know make well enough, to create a rule like this. They expect that declaring multiple targets in a rule means, for make, “build all of these with a single command”, while it actually means “for each of these files, use this command to generate it”.

The result is that, as you’re going to need three different files, make will launch that code three times in parallel. Which will not only waste a huge amount of time, but will also fail, as the three invocations might try to access the same resource at once (as is happening here).

The solution for this kind of problem is not really obvious, as it often requires rewriting the rules entirely. My usual way of thinking about the problem is that whoever wrote the rule didn’t know make well enough and made a mistake, and that it’s easier to just rewrite the rule.

Let’s decompose the rule then: ignoring the for loop and the echo line, what we have is these two commands:

sed -e "s/@ASSEMBLY_NAME@/$(ASSEMBLY_NAME)/" -e "s/@POLICY@/$$i/" $(top_builddir)/policy.config > policy.$$i.config
$(AL) -link:policy.$$i.config -out:policy.$$i.$(ASSEMBLY) -keyfile:gtk-sharp.snk

Each of these two commands creates a different file: one is intermediate, the policy configuration; the other is the final one. This again shows a lack of understanding of how make is supposed to work, again a very common one, so I’m not blaming the developer here; make is a strange language. So there are two dependent steps involved: the final requested result is the policy file, but to generate that you need the policy configuration.

Let’s start with the policy configuration then. The actual generation command is a simple sed call that takes the generic configuration and sets the assembly name and policy version in it. The problem here is obviously to replace the use of $$i (the variable used in the for loop) with the actual policy version. Just so we’re clear, the policy version is the 2.4, 2.6 or 2.8 string we have seen before. Luckily this is a pretty common task for a tool like make, and there is a construct that comes to our help: pattern rules.

The name of the generated file is always in the format policy.$VERSION.config, and we need to know the $VERSION part to use it in sed. Nothing is more suited for this than a pattern rule: let’s replace the variable section of the filename with the magic symbol %, and make will take care of expanding that as needed; it will also provide us with a special variable in the rule, $*, that takes the value of the expansion. The rule then becomes this:

policy.%.config: $(top_builddir)/policy.config
    sed -e "s/@ASSEMBLY_NAME@/$(ASSEMBLY_NAME)/" -e "s/@POLICY@/$*/" $(top_builddir)/policy.config > $@

And here we’ve created our policy configuration files in a parallel-build-friendly way: since none of them depends on the others, the three sed commands can easily be executed in parallel.

Now it’s time to create the actual policy assembly; here we’re going to make use of a static pattern rule, making the best use of the fact that it lets you declare dependencies based on the pattern.

Instead of a simple two-part rule, this is going to be a three-part rule: the first part defines the list of targets this rule may apply to, which is the same as before ($(POLICY_ASSEMBLIES)); the second and third parts are the usual ones, defining the target pattern and its dependencies.

While the original rule depended directly on the generic policy config, this one will only depend on the actual final config, as the rule we just wrote for the configuration files will take care of it. So the final rule to generate the wanted assembly will be:

$(POLICY_ASSEMBLIES) : policy.%.$(ASSEMBLY): policy.%.config gtk-sharp.snk
    $(AL) -link:policy.$*.config -out:$@ -keyfile:gtk-sharp.snk

At this point, the same change just has to be applied to all the Makefile.am files involved in the package, like I did in the patch I submitted, and the package becomes totally parallel-build-friendly.

There is another nice bonus to this: you’re trading one complex, difficult-to-read, broken rule for two one-liner rules, which makes the code much more readable, and easier to understand when you’re looking for a mistake.

Between Mono and Java

Some time ago I expressed my feelings about C#; to sum them up, I think it’s a nice language, by itself. It’s near enough to C to be understandable by most developers who have ever worked with it or with C++, and it’s much saner than C++ in my opinion.

But I haven’t said much about Mono, even though I’ve been running GNOME for a while now and of course I’ve been using F-spot and, as Tante suggested, gnome-do.

I’ve been thinking about writing something about this since he also posted about Mono, but I think today is the right day for it, as there has been some interesting news in Java land.

While I do see that Mono has improved hugely since I last tried it (for Beagle), I still have some reservations about Mono/.NET when compared with Java.

The reason for this is not that I think Mono cannot improve, or that Java is technically superior; it’s more that I’m glad Sun is finally dealing with the Java crap. OpenJDK was a very good step forward, as it opened most of the important parts of the source code to others. But it also became more interesting in the last few days.

First, Sun accepted the FreeBSD port of their JDK into OpenJDK (which is a very good thing for the Gentoo/FreeBSD project!), and then a Darwin port was merged in OpenJDK too. Lovely: Sun is taking the right steps to come out of the crap.

In particular, the porters project is something I would have liked to get involved in, if it wasn’t for last year’s health disaster.

In general, I think Java now has much better chances of becoming the true high-level multiplatform language and environment, over C# and Mono. This is because the main implementation is open, rather than having one (or more) open implementations trying to track down the first and main implementation.

But I’d be seriously interested in a C# compiler that didn’t need the Mono runtime, kinda like Vala.