What’s wrong with Gentoo, anyway?

Yesterday I snapped and declared my intent to resign from Gentoo, stop the tinderbox, and give up using Gentoo altogether. Why did that happen? Well, it’s a huge mix of problems, all joined together by one common factor: no matter how much work I pour into getting Gentoo working like it should, more problems are generated by sloppy work from at least one or two developers.

I’m not referring to the misunderstandings about QA rules, which happen and are naturally caused by the fact that we’re humans and not beings of pure logic (luckily! how boring it would be otherwise, to always behave in the most logical way!). Those can upset me, but they are still, after all, no big deal. What I’m referring to is the situation where one or two developers can screw up the whole tree without anybody being (reasonably) able to do a thing about it. We’ve had two (different) examples in the past few months, and while both have undeniably bothered QA, users, and developers alike, no action has been taken in either case.

We thus have developer A, who decided that it’s a good idea to force all users to have Python 3 installed on their systems, because upstream released it (even though upstream still considers it experimental, something to toy with), and who kept on ignoring calls from both users and developers to drop that (luckily, the arch teams are not mindless drones, and wouldn’t let this slide to stable as he intended in the first place). The same developer also hasn’t been able to properly address one slight problem with the new wrapper, months after unleashing it on unstable users (unstable does not mean unusable).

Then we have developer B who feels like the tree’s saviour, the only person who can make Gentoo bleeding edge again… while most, if not all, of the rest of the developer pool is working on getting Gentoo more stable and more maintainable. So, among the things he went on to do, there was a poorly performed Samba bump (“suboptimal” was the term he used; I ended up having to fix the init scripts myself because they weren’t stopping/restarting properly, as the ebuild and the init scripts had gone out of sync regarding paths), some strangely incomplete PostgreSQL changes, and a number of minor problems with the packages.

Of the two, I was at first more upset by the former, but in the long run the latter is the one who drove me mad. Let’s not dig too much into his stance on --as-needed (“cosmetics”; yeah, because being able to come out of a jpeg bump with fewer than 100 packages to rebuild, rather than the whole world, is just cosmetics), or the fact that he’s ignored most of the QA issues with the packages he touched. Instead, look at his behaviour with a package of mine (alas, I made the mistake of letting this one slip with just a warning; I should have taken the chance to actually refer it to devrel…): vbindiff.

The package is something I added a while ago because from time to time it comes in useful. I’m in metadata.xml; I’m definitely not an unresponsive maintainer. Yet, while my last bump was in June 2008, the version in tree was not the latest one up to last September (2009). Why? A quick glance at the homepage shows that the beta4 release was mostly fixing a Win32 bug, and introducing a way to enable debug mode. So what happens? Our mighty developer decides to go on and bump the package; without asking me; with nobody asking him; without a mail, a nod or anything. I literally noticed this as emerge tried to upgrade a package I know I maintain. You’d expect the debug support to be present in the ebuild then, and indeed you’d find a debug USE flag if you checked now, but that’s something I added myself afterwards, as the damage of pointlessly bumping something was already done.

Now, why did that happen? Well, he admitted he just went through the dev-* categories, without considering the maintainers declared in metadata, and blindly bumped ebuilds when the latest version available on the site was higher than the one in tree. Case in point: he had to open the vbindiff site, and thus the release notes regarding Win32 and --enable-debug would have been clearly visible, if he had cared to read even part of them. Whoever has tried doing serious ebuild business should know that most of the time even the upstream-provided release notes are not something to go by… Interestingly enough, his bleeding-edge hunger didn’t make him ask for a new stable, and we currently have a very old one.

So there we have your developer B, the super-hero, the last good hope of the bleeding edge, who bumps packages without consulting the guy who maintains them (and is around almost 24/7) and without even caring to use them at all. Why did I let it slip? Probably because I was mostly focused on trying to stop developer A at the time. I did issue a reprimand reminding him not to touch someone else’s packages, and to learn to use package.mask for things like Samba. I was hoping he would listen. Oh boy, was I ever so wrong.

Speaking about Samba again for a second, did I tell you yet that the split into multiple packages was done, straight to ~arch, without any plan to follow up and convert dependencies? Wonder why the whole thing is now stalemated again. Maybe the arch teams aren’t too keen on having the same kind of dependency breakage in stable as there was/is in unstable right now.

First-hand information about our developer B has him aligned with a zealot’s point of view regarding the Mono project; you’d then guess that dotnet stuff would be the last thing he’d be touching. Instead he bumped it, without asking anyone, ignoring the fact that I stated at FOSDEM that I was going to look into that as soon as I had time, the fact that I had stated multiple times before that I was already working on un-splitting the gtk-sharp packages, and the fact that I got in contact with the Mono developers (again at FOSDEM) to try following upstream more closely. Oh, and the one thing that pissed me off about that bump? Besides the fact that tomboy now refuses to work? Remember this patch? It was dropped, without even mailing me to ask whether I had, or could make, a version for the latest release. It was dropped in unstable (or, as it should be called if this kind of stuff is allowed to continue, unusable).

And the cherry on top? As I said, this developer touched Samba, PostgreSQL, now Mono… there are three aliases for these things (samba, pgsql-bugs and dotnet), to which the bugs are assigned… he’s on none of them! And before somebody tries to argue that, I’m pretty confident he’s not following the aliases on Bugzilla (plus, given he also argued that the problem was with leaving security-vulnerable stuff in the tree – which by the way means having working, complete, safe ebuilds to be able to mark stable, and he doesn’t seem to be able to come up with any of those – the most important security bugs don’t get sent to watchers). How is he supposed to see the bugs coming? Oh, but by wrangling the bugs himself! Yeah, after all developers don’t file bugs themselves, assigning them straight to the maintainers as per procedure, do they? (Fun fact: Bugzilla queries report at most 5K bugs, so that list is a very limited result compared to what I was hoping to get.) Nor do other developers ever wrangle bugs, that would be silly; and there is no Arch Tester to speak of, right?

You can now see most of the picture, and why I’m mostly upset with developer B. What made me snap yesterday were remarks insisting that I was just “whining” and “not doing enough” as bugs kept piling up. What the heck? I have constantly had over 1000 bugs (over 1300 today) for the past year or so; I know very well that bugs keep piling up! And I’ve been doing all I can outside of my work hours (while I have to thank some people, including Paul, David, Simon, Andrew and Bela, for their contributions, I’m not paid to do Gentoo work; and while I do get to use it, and thus contribute back to it, for some of the jobs I take, it’s definitely not the same as working on Gentoo), including the whole RubyNG porting and improvement, trying to make sure we can actually get to a point where unmasking Ruby 1.9 will not break any user whatsoever. Am I really doing too little? “Not enough”?

Okay so the proper way to handle this, with the current procedures, would be to take this up to the Developers’ Relations so that they could act on it; QA can only ask infra to restrict commit access if we’re expecting a grave and dangerous breaking of the tree, or misuse of commit rights. So why didn’t I bring this up to devrel? Well, the main reason is that devrel nowadays, as far as I can tell, is exactly three people: Petteri, Denis and Jorge, and of the three the only one who’s for preventive suspension of commit rights is Denis (this has been proven with the case about developer A above); one out of three does not really sound much of a chance for this to improve the situation. And if – again as happened with developer A – DevRel then decided that the right action would be to issue a reprimand, that would amount to scolding the developer and asking to work more with others… well, it wouldn’t change a thing.

The whole QA system has to change! We’ve got to write down guidelines, rules, and laws, and be conservative in applying them. You shouldn’t go around breaching them and then appealing when QA finds you out of line, you should talk with QA if you feel the rule is misapplied to your case in any way.

So here you go, in a nutshell, why my preservation instinct right now is telling me to flee. I’m not sure yet if I’ll outright flee or just give it time for the situation to be addressed and then decide. The reason is: I still like the Gentoo system, and since I rely on it for my work I cannot leave it alone; if I were to move to anything else I would have to spend (waste?) even more time to fix the same issues anyway, and I’d much rather get Gentoo working right. But I cannot do this alone, and I especially cannot do this if I have support from neither developers nor users. So please voice your concern.

If you feel like Gentoo needs better QA, if you feel like we shouldn’t be translating unstable to unusable, then please ask for it. I’m not saying that we should become stale like Debian stable, but if it takes a few months to get something straight, then it should take its time and not be forced through (that’s what the Ruby team has been doing all this time to work with Ruby 1.9 and Ruby EE and other implementations as well!). If you use Twitter, identi.ca, Digg, Reddit, Slashdot, whatever, get this post running. Maybe I’m subverting the process, but to quote BBC’s NewsQuiz, “Trial by media is the most efficient form of justice” (this was in reference to the British MP expenses scandal last year), and right now my only concern is effectiveness.

Tip of the day: if Samba 3.4 fails to work…

I fought with this today… if you are running Gentoo ~arch you probably noticed that the current Samba support is “definitely suboptimal” (to use the words of the current maintainer), and indeed it failed to work for me once again (third time; the first was a broken init script; the second was missing USE deps, so I was quite upset). If you find yourself unable to log in to Samba, you need to consider two possible problems.

First problem: the Samba password system seems to have either changed or moved so you have to re-apply the password to your user (and re-add the user as well!). To do so you have to use the smbpasswd command. Unfortunately this will fail to work when the system has IPv6 configured. And here comes the next problem.

Samba upstream likely has trouble dealing with IPv6; indeed it comes down to the smbpasswd command trying to connect to 127.0.0.1 (IPv4) while the smbd daemon is only listening on :: (IPv6), so it’ll fail to connect and won’t let you set your password. To fix this, you have to change the /etc/samba/smb.conf file, and make sure that the old IPv4 addresses are listened to explicitly. If you have static IPs this is pretty simple, but if you don’t, the situation is a little more complex and you’ll be forced to restart samba each time the network interface changes IP, I’m afraid (I haven’t been able to test that yet).

[global]
interfaces = 127.0.0.1 wlan0 br0
bind interfaces only = yes

As you can see we’re asking for some explicit interfaces (and the localhost address) to be used for listening; since samba uses the IPv4 localhost address for the admin utilities, you list it explicitly to make sure it listens on that. For some reason I cannot understand, when this is done explicitly samba knows to open separate sockets for both IPv4 and IPv6; otherwise it’ll open one for IPv6 only.
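A quick way to check whether the change took effect is to probe both loopback addresses. The following is only a minimal sketch in Python: the function names are made up for illustration, and port 445 is assumed as the port smbd listens on (adjust it if your setup uses a different one).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and missing IPv6 support.
        return False


def check_loopback(port: int = 445) -> dict:
    """Probe the IPv4 and IPv6 loopback addresses on the given port."""
    return {addr: can_connect(addr, port) for addr in ("127.0.0.1", "::1")}
```

Calling check_loopback() after restarting samba should report the 127.0.0.1 entry as True if the daemon is now reachable over IPv4; if it is still False, smbpasswd will most likely keep failing to connect.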

I’m not even going to fight with upstream about this, I’m tired and I’m tracking down a bug in Gtk#; a nasty one that crashes the app when using custom cell renderers, and I already fixed iSCSI Target for kernel 2.6.32 (as well as version-bumped it).

Mono might not be perfect…

… but I still like it as a technology, that is after working with it for a few months already, and now having understood a few of its quirks.

First of all, it’s not true at all that simply using Mono leads to perfect compatibility between Windows, Linux and OSX. Not without having to write some extra code. While Microsoft did design some abstractions, like System.Environment.OSVersion.Platform (a PlatformID value), which lets you find out which OS we’re actually running on, the design is obviously oriented to Windows by default. Not that that’s surprising.

For instance, .NET has its own way to store configuration settings, but this is very Windows-centric, since, as far as I can see, the configuration file is to be stored together with the executable (this, then, wouldn’t work with properly installed Linux applications, since the executable will be in a path the user has no write access to).

Another problem is that while there are interfaces to identify some of the standard locations for files (like documents, pictures, the home directory and stuff like that), and Mono obviously handles them gracefully under Unix by following the XDG Base Directory Specification, which is quite a nice touch, it does not have the perfect granularity yet; for instance it does not provide a transparent way to handle the path for cache files (not too temporary, but not persistent either).
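To make the cache-path gap concrete, here is a rough sketch of the kind of lookup involved, written in Python rather than C# to keep it short. The cache_dir helper is hypothetical and not part of any library discussed here; the Unix branch follows the XDG rule ($XDG_CACHE_HOME, falling back to ~/.cache), while the Windows and OS X locations are the usual conventions and should be read as assumptions.

```python
import os
import sys
from pathlib import Path


def cache_dir(app: str, env=None, platform=None) -> Path:
    """Pick a per-user cache directory for `app` (hypothetical helper)."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    home = Path(env.get("HOME") or Path.home())

    if platform.startswith("win"):
        # Assumed convention: %LOCALAPPDATA%\<app>\Cache
        base = Path(env.get("LOCALAPPDATA") or home / "AppData" / "Local")
        return base / app / "Cache"
    if platform == "darwin":
        # Assumed convention: ~/Library/Caches/<app>
        return home / "Library" / "Caches" / app

    # XDG Base Directory rule: use $XDG_CACHE_HOME when it is set to an
    # absolute path, otherwise fall back to $HOME/.cache.
    xdg = env.get("XDG_CACHE_HOME")
    base = Path(xdg) if xdg and os.path.isabs(xdg) else home / ".cache"
    return base / app
```

Passing env and platform explicitly is only there to make the function easy to try out; real code would normally rely on the defaults.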

For these things, and probably a few more issues I’m finding, I’m currently writing a library, that I called, with very little imagination, Portability, which provides some generic interfaces to this kind of stuff; my intent is, if it makes sense at the end, to release it under either LGPL-3 or BSD license, and have it as a standalone project.

There are also a few more issues, like the sgen tool that should automatically generate static serialisation code to save and restore classes on files; unfortunately for some reason it doesn’t seem to work at all here: it aborts on an IDictionary object, but the class I’m asking it to serialise does not have any Dictionary in it; sigh. But that doesn’t matter since at least the dynamic, slow, reflection-based serialisation works, mostly.

In the end, I find it much nicer to work with C# than with Java (which would have been the alternative for this project, for instance).

Mono and Gentoo: integration needs more work

I’ve already written quite a bit about the fact that I’m mostly a Mono enthusiast and that I think there is work to be done to integrate Mono-based builds into autotools but I haven’t spent enough words about the integration of Mono in Gentoo as a distribution.

Indeed, the Mono team right now seems to consist solely of Peter Alfredsen (loki_val), which is of course a sub-optimal situation, since in theory nothing should really end up being done by a single person; in practice that’s more than common in Gentoo, and it is one of our worst problems. And it’s not just a matter of not having the time to deal with everything, but also that you cannot brainstorm to separate the bad ideas from the good ones, and to polish them so that they can be used by more than a few people.

In particular, there are quite a few things in the way Gentoo handles Mono that I’d love to see improved, but that I doubt would be considered unless I and someone else joined the team to discuss them. Now, some things are definitely subjective – for instance I don’t like having upstream packages split into multiple ebuilds, especially now that we have USE-based dependencies, while it seems to be something that Peter loves to do – but others are definitely areas that need some work.

The first problem relates to where we install Mono files: for some reason, in Gentoo we’re currently installing the Mono libraries under /usr/lib64 on AMD64 multilib systems; this is probably due to the fact that they usually also install 64-bit libraries and thus their libdir is supposed to be suffixed with 64. Unfortunately this goes against what upstream uses, since Novell uses /usr/lib for it all; indeed, all the .NET libraries, the .dll and .exe files, are arch-independent, or actually platform-independent, for the most part (see this post for more details about how it is possible for .NET libraries to be arch-dependent). We’re stuck patching lots of libraries, just like Fedora, because of that path change.

Another problem appears when you factor in ABI and .NET libraries: while Ruby, Perl and Python don’t really have an ABI (at least between programs and native libraries), and Java has an ABI but no ABI definition (thus requiring a lot of manual work from the Java team), .NET policies come very near to ELF files and versioning, for which we already have revdep-rebuild. Unfortunately we have no similar tool for .NET (and it wouldn’t always work fine, given that undefined symbols in .NET are not fatal and can easily be handled; for instance the software I’m developing only uses Outlook if it can load all its libraries).

Also, that I know of at least, there is no script to verify the properness of the runtime dependencies of Mono software, which is quite a bit of a problem when you end up packaging it yourself. I’m pretty sure somewhere there is a tool to check dependencies akin to the Dependency Walker but I really don’t know about it (if somebody has a name, that would probably be appreciated!).

All in all, there aren’t really big problems with Mono in Gentoo; they really look like no problems at all when you consider what we have yet to fix with Ruby, but they can still be a bit of a bother. And fixing them needs more people.

Sometimes automake is no good… Mono projects

You know I’m an autotools-lover, and that I think they can easily be used for almost all projects. Well, sometimes they cannot, and one of these cases is Mono projects. Indeed, you may remember that I’m working more closely with Mono lately for my job, and that has made me notice that there really aren’t many great ways to build Mono projects.

Most of the libraries out there are written with Visual Studio in mind, and that’s not excessively bad if they provide solution files for Visual Studio 2008, since MonoDevelop’s mdtool can deal with them just fine (unfortunately it doesn’t deal with VS2k5 solutions which is what stopped me from adding FlickrNet in source form to Portage, limiting myself to the binary version).

There are a few custom Makefiles out there, and some mostly work, but there are also projects like f-spot that provide an autotools-based build system… but with a lot of custom code in it that, in my view, makes the whole thing difficult to manage: indeed what comes out is a mix of automake and manual make rules that doesn’t really float my boat.

It’s not that I just don’t like the way the Makefiles are written, but factoring in the fact that automake does not support C#/Mono natively, you get to the point where:

  • the support for dependencies is just not there;
  • automake is designed to support languages where each source file is translated into an object file, and then linked; C# does not work that way, since all the source files are given at once when building a single assembly;
  • the support for the various flags variables is just pointless given the way the compiler works.

I guess there are mostly two reasons why autotools are still used by C#/Mono based projects, the first is that it integrates well when you also have some native extensions, like F-Spot has, and the latter is that it provides at least some level of boilerplate code.

I guess one interesting project would be to replace Makefile.am with some kind of Makefile.mb (Mono-Build or something along those lines for a name) so that they could generate some different Makefile.in files, without all the pointless code coming from automake and not used by C#/Mono builds, but still interface-compatible with it so that commands like make, make clean and make dist work as intended.
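Just to give the idea a shape, here is a very rough sketch of what such a generator could emit, written in Python. Everything in it is hypothetical: the render_makefile function, its arguments, and the output layout are invented for illustration, no Makefile.mb format exists, make dist is left out entirely, and the output relies on GNU make for the ?= assignment. The only real pieces are the mcs options (-target:, -out:, -r:).

```python
def render_makefile(assembly, sources, references=(), target="exe"):
    """Render a tiny GNU make fragment that builds one Mono assembly."""
    command = ["$(MCS)", f"-target:{target}", "-out:$@"]
    command += [f"-r:{ref}" for ref in references]
    command.append("$(SOURCES)")
    lines = [
        "MCS ?= mcs",
        "SOURCES = " + " ".join(sources),
        "",
        f"all: {assembly}",
        "",
        "# Every source file goes into a single compiler invocation.",
        f"{assembly}: $(SOURCES)",
        "\t" + " ".join(command),
        "",
        "clean:",
        f"\trm -f {assembly}",
        "",
        ".PHONY: all clean",
    ]
    return "\n".join(lines) + "\n"
```

For example, print(render_makefile("demo.exe", ["Main.cs", "Util.cs"], ["System.Xml"])) produces a fragment where make and make clean behave as expected, with a single rule per assembly instead of one object file per source.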

Why I’m using Mono

In the past week I’ve written a bit about some technical problems with Mono, very far from the political, ethical, and theoretical problems that Richard Stallman talked about, and that most people seem to have a problem with. I have been asked why I am using Mono at all, given that I’m a Linux developer, so I’ll try to summarise the answer here.

The main issue is that I need to work: bills, hardware and all that stuff need to be paid for, and trust me, at least here pure Free Software development does not pay enough; the part that brings in more money for the same effort is custom proprietaryware; some of it is not even tremendously unethical, because it consists of customizations that stop directly at my customer, rather than being distributed further down the line. This kind of software needs to work on proprietary operating systems as well, and that includes both Windows and OS X; Mono is a good choice for this, in my opinion.

Now of course you could be reading this wrong and say that I’m giving reasons for Mono not to be used for free software; on the other hand, it happens that sometimes I have to be writing free software that works on multiple operating systems, and that also is simpler to do with Mono. Case in point, I’ve started looking into writing a package that can be used to automatically back up the games’ savedata from PSP (PlayStation Portable) for me and a few friends of mine mostly (but of course would be released as free software); since they use Windows (and OS X), I’ll be writing it in Mono again.

Now, the view that Windows and OS X are important for software is somewhat debatable; I know lots of people complained about KDE wanting to port to Windows; I for sure complained that KDE went with Windows portability as a major priority (which is why CMake was selected, after all), but of course a pragmatic view shows that, yeah, sometimes it’s better to keep in mind that lots of users will use free software on those operating systems as well. This actually works fine, because a friend of mine who’s now using a lot of free software under Windows, including Pidgin and Firefox of course, will have far fewer problems migrating to Linux one day (and one day he will, I’m sure) than those who are still stuck with Microsoft’s own messenger and Internet Explorer.

Also, I’m sure somebody will say that Qt 4 and KDE support Windows as well, so what’s the point of using Mono? Well… while GCC is improving its Windows/PE support with each version (and I know at least one person who has been working a lot on that subject), there are quite a few important features that it’s still lacking; so if you look around, most of the free software that works under Windows, including KDE, tends to be compiled with the Microsoft Visual C++ compiler. Which means you rely heavily on another huge piece of proprietaryware (and indeed, a pretty bad one at that).

I know it’s probably heavily a matter of taste, but I prefer working with Mono to having to deal with either the GCC problems or the Visual C++ compiler.

The end of the mono-debugger saga

So after starting to inspect and finding the problem last night, I finally have a tentative patch that makes mdb work fine.

Indeed, I simply implemented some extended debuglink file support into the Bfd managed wrapper, which finds sections and symbols in the debuglinked file whenever they are not found in the original file. This solves my problem, although it might not be complete yet, since I have written it in 20 minutes. I’ve attached the version for trunk on my bug report and I’ll add my backport to 2.4.2 to my overlay today. After a bit of testing, I hope to get it in main tree too.
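For reference, this is roughly how a debuglinked file gets located. The sketch below is in Python and follows the search order documented in the GDB manual for .gnu_debuglink files (same directory, a .debug subdirectory, then the global debug directory); it is not a transcription of my patch, the function name is made up, and /usr/lib/debug as the global debug directory is an assumption.

```python
import posixpath


def debuglink_candidates(binary, link_name, debug_root="/usr/lib/debug"):
    """List the paths where a .gnu_debuglink file is conventionally sought."""
    directory = posixpath.dirname(posixpath.abspath(binary))
    return [
        # 1. Next to the binary itself.
        posixpath.join(directory, link_name),
        # 2. In a .debug subdirectory next to the binary.
        posixpath.join(directory, ".debug", link_name),
        # 3. Under the global debug directory, mirroring the binary's path.
        posixpath.join(debug_root, directory.lstrip("/"), link_name),
    ]
```

The first existing candidate would then be opened, and sections or symbols missing from the original file looked up there instead.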

Speaking of testing, the mono-debugger ebuild had a test restriction, with no bug referenced; I’m quite sure that the tests that do fail are the ones that should have told us that mono-debugger wouldn’t have worked on the default Gentoo install at all. I’ll probably have to add some logic to warn the user about split-debug setups (please note that our default of stripping files of debug information does not strip the symbol table of libpthread.so, otherwise gdb wouldn’t work at all either; that lets mdb work fine, so it’s only a problem with split-debug).

After the debugger finally started to work, I also found another problem: mono itself does not seem to load libraries requested by DllImport through the standard dlopen() interface, but it looks for them in particular directories; which don’t include all the possible directories at all. This became a problem because the current default version of libedit in Gentoo does not have a soname, and it caused mono to find a libedit.so that was not a library at all (but rather an ldscript). But that’s a problem for another day, and my solution is just to use a newer libedit version that works fine.
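Telling a real shared object from an ldscript only takes a look at the first few bytes of the file. Here is a minimal sketch in Python; the function names and the keyword list are my own choices for illustration (GROUP, INPUT and OUTPUT_FORMAT are genuine linker script commands, but the list is not exhaustive), and this is not what mono does internally.

```python
ELF_MAGIC = b"\x7fELF"
LDSCRIPT_KEYWORDS = ("GROUP", "INPUT", "OUTPUT_FORMAT")


def classify_header(head: bytes) -> str:
    """Classify the leading bytes of a file as 'elf', 'ldscript' or 'unknown'."""
    if head.startswith(ELF_MAGIC):
        return "elf"
    # Linker scripts are plain text; look for a few well-known commands.
    text = head.decode("ascii", errors="replace")
    if any(keyword in text for keyword in LDSCRIPT_KEYWORDS):
        return "ldscript"
    return "unknown"


def classify_library(path: str) -> str:
    """Read the start of `path` and classify it."""
    with open(path, "rb") as handle:
        return classify_header(handle.read(256))
```

A loader doing its own directory search could use a check like this to skip a libedit.so that turns out to be a text file and keep looking for the real library.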

Now I’ll go back to my tinderbox, and in the next few days you’ll probably see a few more posts about different topics than Mono… even though I have a few patches to post there as well.

The debugged debugger — part 2

So after my last night’s post I finally found the problem.

Actually, my mixing in the new system libbfd sidetracked me for about an hour, because the same symptoms were caused by an API change that I didn’t maintain correctly; after that I was able to use both system and internal libbfd with the same exact results.

I started adding printing checkpoints both in the C# Bfd wrapper and in the C glue code that calls into libbfd; it’s not really an easy thing, because, well, libbfd is probably one of the most over-engineered libraries I have ever seen. It really provides a lot of information for a lot of different executable and binary formats, but to do that it tremendously increases the complexity; indeed that’s one of the reasons why gold is much faster than standard ld, and why I preferred to write my own Ruby-Elf rather than binding the Bfd interface and building up from that (which could have been more complete under a few circumstances).

At any rate, I was lucky to have enough knowledge about ELF files to identify the issue in the end; most people who haven’t worked with ELF would have given up along the way. In the end I cut to the chase by noticing that it was trying to load the symbol table (.symtab, which includes internal local symbols, that is, symbols marked static and thus not exported), and found none. Since it wouldn’t be able to find any symbol, you’d be surprised if it were to actually match the nptl_version variable I talked about yesterday.

Going down that line, it turned out that, although Mono splits debug symbols into a different file (.mdb), mdb does not support the feature that allows doing the same with ELF files: our splitdebug. I actually was wondering from the start whether that was the problem, but then I ruled it out because Fedora also uses the same feature, and there mono-debugger starts fine. I have now replaced “works fine” with “starts fine”, as you’ll see in a moment.

So if mdb does not support split debug files, how on earth can it work on Fedora? Well, the symbol it’s trying (and failing) to identify here is nptl_version from libpthread.so… a quick check on the laptop told me that Fedora does not strip .symtab from libpthread.so! I was actually afraid that Fedora wasn’t stripping .symtab at all, but then I started using the /usr/bin/mono object as a reference, and there you cannot find the .symtab section at all: Fedora has a special case for libpthread.
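The check itself is normally a one-liner with readelf -S, but for the curious, this is a small sketch in Python of what looking for .symtab amounts to: read the ELF header, walk the section headers, and resolve each name through the section name string table. The function names are invented for illustration, and the sketch assumes a well-formed file without extended section numbering.

```python
import struct


def section_names(data: bytes) -> list:
    """Return the section names of an ELF image held in memory."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = data[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if data[5] == 1 else ">"     # EI_DATA: 1 = little endian
    if is64:
        (shoff,) = struct.unpack_from(end + "Q", data, 0x28)
        shentsize, shnum, shstrndx = struct.unpack_from(end + "HHH", data, 0x3A)
        loc_fmt, loc_pos = end + "QQ", 24  # sh_offset, sh_size
    else:
        (shoff,) = struct.unpack_from(end + "I", data, 0x20)
        shentsize, shnum, shstrndx = struct.unpack_from(end + "HHH", data, 0x2E)
        loc_fmt, loc_pos = end + "II", 16

    def header(index):
        """Return (sh_name, sh_offset, sh_size) of one section header."""
        base = shoff + index * shentsize
        (name,) = struct.unpack_from(end + "I", data, base)
        offset, size = struct.unpack_from(loc_fmt, data, base + loc_pos)
        return name, offset, size

    if shnum == 0:
        return []
    _, str_off, str_size = header(shstrndx)
    strtab = data[str_off:str_off + str_size]
    names = []
    for index in range(shnum):
        name, _, _ = header(index)
        stop = strtab.index(b"\0", name)   # names are NUL-terminated
        names.append(strtab[name:stop].decode("ascii", "replace"))
    return names


def has_symtab(data: bytes) -> bool:
    """True if the ELF image still carries a .symtab section."""
    return ".symtab" in section_names(data)
```

Feeding it the bytes of libpthread.so and of /usr/bin/mono should show the difference described above: the former keeps .symtab on Fedora, the latter does not.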

Now, the quick solution would be of course to just not strip libpthread.so of its .symtab either, so that mdb could start properly; the problem with that solution is that you wouldn’t be able to get backtrace or anything else out of the unmanaged code because it wouldn’t be loading that at all. On distributions that use split debug (Gentoo if requested, Fedora, and I have no idea what else), mono-debugger would start, if libpthread.so has .symtab, but it won’t work with any object that has .symtab on the debug file; which is our case. So I’ll try to find time to actually fix it in mono-debugger; because it is a bug in mono-debugger, or maybe a missing feature, not a problem with “roll your own optimization flags” as Miguel wanted it to be.

Maybe this will convince them that maybe they should try to give credit to other distributions as well? Who knows, I hope so because I see that at least for what concerns building and packaging, mono-debugger has a huge space for improvement, and I’d like to help out with that, if they allow me.

Post scriptum: I was also able to make mono-debugger use the system libedit, the result is less spectacular than using system libbfd, but it’s still nice:

flame@yamato mono-debugger-2.4.2 % qsize mono-debugger
dev-util/mono-debugger-2.4.2: 25 files, 21 non-files, 2021.133 KB
flame@yamato mono-debugger-2.4.2 % qsize mono-debugger
dev-util/mono-debugger-2.4.2: 25 files, 21 non-files, 1561.300 KB

Now if only I could get it to work …

The neverending fun of debugging a debugger

In the previous post of mine I’ve noted that I found some issues with the Mono-implemented software monosim. Luckily upstream understood the problem and he’s working on it. In the mean time I’ve had my share of fun because mono-debugger (mdb) does not seem to work properly for me. Since I also need Mono for a job task I’m working on, I’ve decided to work on fixing the issue.

So considering my knowledge of Mono is above the average user, but still not that high, I decided to ask on #mono (on gimpnet). With all due respect, the developers could really try to be friendly, especially with a fellow Free Software enthusiast that is just looking for help to fix the issue himself:

 thread_db is a libc feature I think to do debugging
 Chances are, you are no an "interesting" Linux distro
 One of those with "Roll your own optimization flags" that tend to break libc
 miguel_ miguel
 miguel, yes using gentoo but libc and debugging with gdb are fine...
 I knew it ;-)
 Yup, most stuff will appear to work
 But it breaks things in subtle ways
 and I can debug the problem libc side if needed, I just need to understand what's happening mono-side
 You need to complain to the GDB maintainers on your distro
 All the source code is available, grep for the error message
 Perhaps libthread_db is not availabel on your system
 it is available, already ruled the simple part out :)
 and yes, I have been looking at the code, but I'm not really that an expert on the mono side so I'm having an hard time to follow exactly what is trying to do

As you can see, even if Miguel started already with the snarky comments, I tried keeping it pretty lightweight; after all, Lennart does have his cheap shots at Gentoo, but I find him a pretty decent guy after all…

Somebody else, instead, was able to piss me off in a single phrase:

 i thought the point with gentoo was that if you watch make output scrolling, you can call yourself a dev ;)

Now, maybe if Mr Shields were to actually not piss other developers off without reason, he wouldn’t be badmouthed so much for his blogs. And I’m not one of those badmouthing him, the Mono project or anything else related to that up to now. I actually already stated that I like the language, and find the idea pretty useful, if with a few technical limitations.

Now, let’s get back to the actual problem: the not-very-descriptive error message that I get from the mono debugger (that thread_db, the debug library provided by glibc, couldn’t be initialised) is due to the fact that glibc first tries to check whether the NPTL thread library is loaded, and to do that it tries to reach the (static!) variable nptl_version. Since it’s a static variable, nm(1) won’t be able to see it, although I can’t seem to find it with pfunct either. To be precise, glibc also checks that the version matches, but the problem is that the variable is not found in the first place.
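To illustrate why a static variable is awkward to reach from outside, here is a minimal sketch in Python with ctypes. It is not how thread_db or mono-debugger look the variable up (they go through the debugger’s own symbol reader); it only shows that a dlsym()-style lookup resolves exported symbols and nothing else. The helper name is_exported is mine, and I’m assuming a Linux system where opening the running process gives access to the C library.

```python
import ctypes

def is_exported(library: ctypes.CDLL, symbol: str) -> bool:
    """Return True if the dynamic linker can resolve `symbol` in `library`."""
    try:
        getattr(library, symbol)  # performs a dlsym()-style lookup
    except AttributeError:
        return False
    return True

# Passing None opens the running process itself, which on Linux includes
# the C library and whatever else is already loaded.
process = ctypes.CDLL(None)

for name in ("printf", "nptl_version"):
    print(name, "exported:", is_exported(process, name))
```

I would expect the lookup for nptl_version to fail, since a static variable never makes it into the exported symbol table, but I have not checked that against every glibc version.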

Debugging this is pretty difficult: the mono-debugger code does not throw an exception carrying the particular reason why thread_db couldn’t be initialised; it simply states the obvious. From there, you have to backtrace manually through the code (manually at first because mono-debugger ignored all the user-provided CFLAGS, including my -ggdb to get debug information!), and the call sequence is C# → C (mono-develop) → C (thread_db) → C (mono-develop) → C# → C (internal libbfd). Indeed it jumps around between similarly-named functions and other fun stuff that really drove me crazy at first.

Right now I’ve cut the chase short at finding out that libbfd was unable to find the libpthread.so library. The reason for that is still unknown to me, but to reduce the amount of code that is actually being used, I’ve decided to remove the internal libbfd version in favour of the system one. While the ABI is not stable (and thus you would end up rebuilding anything using libbfd at any binutils bump), the API doesn’t usually change tremendously, and there usually is enough time to fix it up if needed; indeed, from the internal copy to the system copy, the only API breakage is one struct’s member name, which I fixed with a bit of autotools mojo. The patches are not yet available but I’ll be submitting them soon; the difference with and without the included libbfd is quite nice:

flame@yamato mono-debugger-2.4.2 % qsize mono-debugger
dev-util/mono-debugger-2.4.2: 25 files, 21 non-files, 4944.144 KB
flame@yamato mono-debugger-2.4.2 % qsize mono-debugger
dev-util/mono-debugger-2.4.2: 25 files, 21 non-files, 2020.972 KB
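
To put a number on that, here is a small sketch in Python that parses the two qsize lines above and works out the saving. The helper name size_kb is mine and not part of any tool; the two input strings are copied from the output above.

```python
import re

def size_kb(qsize_line: str) -> float:
    """Extract the trailing '<number> KB' figure from a qsize output line."""
    match = re.search(r"([\d.]+) KB\s*$", qsize_line)
    if match is None:
        raise ValueError(f"no size found in: {qsize_line!r}")
    return float(match.group(1))

# First run: with the bundled libbfd; second run: with the system libbfd.
bundled = size_kb("dev-util/mono-debugger-2.4.2: 25 files, 21 non-files, 4944.144 KB")
system = size_kb("dev-util/mono-debugger-2.4.2: 25 files, 21 non-files, 2020.972 KB")

saved = bundled - system
print(f"saved {saved:.3f} KB ({saved / bundled:.1%} of the original size)")
```

That comes to a bit under 3 MB, or roughly 59% of the installed size.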

In the package there is also an internal copy of libedit; I guess that’s because it’s not often found in distributions, but we have it, and on Gentoo/FreeBSD it’s also part of the system, so…

Now, there’s no doubt that this hasn’t yet led me to what the problem is, and it’s quite likely that the problem is Gentoo-specific, since mono-debugger seems to work fine both on my Fedora and on other systems. But is it the right move for the Mono team to diss a (major, I’ll have to say) developer of a distribution that isn’t considering removing Mono from its repository?

In the land of smartcards

Even though I did post that I wanted to get onto hardware signatures, I ended up getting a USB smartcard reader for a job that requires me to deal with some kind of smartcards; I cannot go much further on the matter right now though, so I’ll skip over most of the notes here.

Now, since I got the reader, but not yet most of the specifics I need to actually go on with the job, I’ve been playing with actually getting the reader to work with my system. Interestingly enough, as usual, the first problem is very Gentoo-specific: the init script does not work properly, and I’m now working on fixing that up.

But then the problem is actually finding a smartcard to test with; in my haste I forgot to get at least one or two smartcards to play with when I ordered the device, and now it’d be stupidly expensive to order them. Of course I’ll get around to it this time and get myself the Italian electronic ID card (CIE), but even that does not come cheap (€25, and a full morning wasted), and I cannot just do that right now.

So I went looking for what I had at home with a smartcard chip; after discarding my old, expired MasterCard (even though I thought about it before, I was warned against trying that), I decided to try with a GSM SIM card, which I had lying around (I had to get a new one to switch my current phone plan to a business subscriber plan; before I was using a consumer pre-paid plan).

Now, although I was able to test that the reader detects and initialises the card correctly (although it is not in the pcsc-tools database!), I wanted to see if it was actually possible to access it fully; luckily the page of a Gentoo user sent me to some software, written by an Italian programmer, that should do just that: monosim which, as you’d expect, is written in C# and Mono, which is good given I’m currently doing the same for another customer of mine.

Unfortunately, it seems like the Mono problem comes up again: upstream never considered the fact that the libpcsclite.so ABI changes between different architectures, even on the same operating system. Not that I find such an architecture-dependent ABI a good idea in general, since I always try to stick with properly-sized parameters (thanks stdint.h), but it happens, and we should get ready to actually resolve the problems when they appear.
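
To make the ABI point concrete, here is a minimal sketch in Python with ctypes, rather than the C# that monosim actually uses. As far as I know, pcsc-lite on Linux declares its DWORD type as unsigned long, which is an assumption worth checking against the installed wintypes.h. The type names below are ctypes’ own, and nothing here touches libpcsclite itself.

```python
import ctypes

# 'unsigned long' follows the platform's data model: 4 bytes on 32-bit x86,
# 8 bytes on 64-bit Linux (LP64).
ulong_size = ctypes.sizeof(ctypes.c_ulong)

# Fixed-width types (the stdint.h approach) are the same size everywhere.
uint32_size = ctypes.sizeof(ctypes.c_uint32)

print(f"unsigned long: {ulong_size} bytes, uint32_t: {uint32_size} bytes")

# A binding that hardcodes a 32-bit integer for an 'unsigned long' parameter
# only matches the C side on platforms where the two sizes agree.
binding_matches_abi = (ulong_size == uint32_size)
print("hardcoded 32-bit binding matches this platform:", binding_matches_abi)
```

A binding written and tested on a 32-bit system would pass this check there and fail it on a 64-bit one, which is the kind of breakage described above.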

Now, I really don’t even want to get started with all the mess that RMS has uncovered lately; just like I did a few years back, I replace the idealistic problems from Stallman with technical limitations, see for instance my post about “the java crap” (which – by the way – hasn’t finished being a problem, outlasting the idealistic problems).

And I’m still waiting for Berkeley DB to finish its testsuite, after more than twelve (12!) hours, on an eight-core system, with parallel processes (I get five TCL processes hogging as many cores at almost any time). I don’t even want to think how long it would take on a single-core system. Once that’s done, I can take the system down for some extraordinary maintenance.