This Time Self-Hosted

The neverending fun of debugging a debugger

In my previous post I noted that I had found some issues with the Mono-implemented software monosim. Luckily upstream understood the problem and is working on it. In the meantime I've had my share of fun, because mono-debugger (mdb) does not seem to work properly for me. Since I also need Mono for a job task I'm working on, I've decided to work on fixing the issue.

So, considering that my knowledge of Mono is above that of the average user, but still not that high, I decided to ask on (on gimpnet). With all due respect, the developers could really try to be friendlier, especially with a fellow Free Software enthusiast who is just looking for help to fix the issue himself:

<miguel> thread_db is a libc feature I think to do debugging
 Chances are, you are no an "interesting" Linux distro
 One of those with "Roll your own optimization flags" that tend to break libc
<Flameeyes> miguel, yes using gentoo but libc and debugging with gdb are fine...
<miguel> I knew it ;-)
 Yup, most stuff will appear to work
 But it breaks things in subtle ways
<Flameeyes> and I can debug the problem libc side if needed, I just need to understand what's happening mono-side
<miguel> You need to complain to the GDB maintainers on your distro
 All the source code is available, grep for the error message
<miguel> Perhaps libthread_db is not availabel on your system
<Flameeyes> it is available, already ruled the simple part out :)
 and yes, I have been looking at the code, but I'm not really that an expert on the mono side so I'm having an hard time to follow exactly what is trying to do

As you can see, even though Miguel had already started with the snarky comments, I tried to keep it pretty lightweight; after all, Lennart does have his cheap shots at Gentoo, but I find him a pretty decent guy all the same…

Somebody else, instead, was able to piss me off in a single phrase:

<directhex> i thought the point with gentoo was that if you watch make output scrolling, you can call yourself a dev ;)

Now, maybe if Mr Shields were to actually not piss other developers off without reason, he wouldn't be badmouthed so much for his blogs. And I'm not one of those badmouthing him, the Mono project or anything else related to that, at least up to now. I've actually already stated that I like the language, and find the idea pretty useful, albeit with a few technical limitations.

Now, let's get back to the problem itself. The not-very-descriptive error message that I get from the Mono debugger (that thread_db, the debug library provided by glibc, couldn't be initialised) is due to the fact that glibc first checks whether the NPTL thread library is loaded, and to do that it tries to reach the (static!) variable nptl_version. Since it's a static variable, it is not exported in the dynamic symbol table, so nm(1) won't see it on a stripped library, although I can't seem to find it with pfunct either. To be precise, glibc also checks that the version matches, but the problem here is that the variable is not found in the first place.
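To make the two-step check a bit more concrete, here is a minimal sketch in C of what is described above: first look up nptl_version, then compare the version string. This is not glibc's code. The `sym_entry` table, the `check_result` codes and the `check_nptl` function are all hypothetical names made up for illustration, and the symbol table handed to the real thread_db by a debugger is of course not a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the symbol table a debugger exposes to thread_db. */
struct sym_entry {
    const char *name;   /* symbol name, e.g. "nptl_version" */
    const char *value;  /* string stored at that symbol */
};

/* Result codes for this sketch only; they are not glibc's error codes. */
enum check_result { CHECK_OK, CHECK_NO_LIBTHREAD, CHECK_BAD_VERSION };

/* Linear search standing in for the debugger-side symbol lookup. */
static const struct sym_entry *
lookup_symbol(const struct sym_entry *table, size_t count, const char *name)
{
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    return NULL;
}

/* Step 1: find nptl_version. Step 2: compare it with the expected version. */
enum check_result
check_nptl(const struct sym_entry *table, size_t count, const char *expected)
{
    const struct sym_entry *sym = lookup_symbol(table, count, "nptl_version");

    if (sym == NULL)
        return CHECK_NO_LIBTHREAD;   /* the failure I am hitting */
    if (strcmp(sym->value, expected) != 0)
        return CHECK_BAD_VERSION;    /* found, but the wrong version */
    return CHECK_OK;
}
```

With a table that lacks the nptl_version entry, `check_nptl` stops at the first step and never gets to compare versions, which matches the behaviour I'm seeing: the initialisation fails before the version is even looked at.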

Debugging this is pretty difficult: the mono-debugger code does not throw an exception stating the particular reason why thread_db couldn't be initialised, but simply states the obvious. From there, you have to backtrace manually through the code (manually at first, because mono-debugger ignored all the user-provided CFLAGS, including my -ggdb to get debug information!), and the call sequence is C# → C (mono-develop) → C (thread_db) → C (mono-develop) → C# → C (internal libbfd). Indeed, it jumps around between similarly-named functions and other fun stuff that really drove me crazy at first.

Right now I've cut the chase short at learning that libbfd was unable to find the libpthread.so library. The reason for that is still unknown to me, but to reduce the amount of code that is actually being used, I've decided to remove the internal libbfd version in favour of the system one. While the ABI is not stable (and thus you would end up rebuilding anything using libbfd at every binutils bump), the API doesn't usually change tremendously, and there is usually enough time to fix it up if needed. Indeed, from the internal copy to the system copy, the only API breakage is one struct's member name, which I fixed with a bit of autotools mojo. The patches are not yet available but I'll be submitting them soon; the difference with and without the included libbfd is quite nice:

flame@yamato mono-debugger-2.4.2 % qsize mono-debugger
dev-util/mono-debugger-2.4.2: 25 files, 21 non-files, 4944.144 KB
flame@yamato mono-debugger-2.4.2 % qsize mono-debugger
dev-util/mono-debugger-2.4.2: 25 files, 21 non-files, 2020.972 KB
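As for the renamed struct member mentioned above, one common way to absorb that kind of breakage is to let configure detect which name the installed header uses (for instance with Autoconf's AC_CHECK_MEMBER) and hide the difference behind a macro. The following is only a sketch of that general pattern, not the actual patch: the `section` struct, both member names and the `HAVE_SECTION_NEW_MEMBER` define are hypothetical and are not taken from binutils.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a libbfd struct whose member was renamed between
 * the bundled copy and the system copy. Neither member name is the real one.
 */
#ifdef HAVE_SECTION_NEW_MEMBER   /* hypothetical; would be set by configure */
struct section { size_t new_size; };
#define SECTION_SIZE(sec) ((sec)->new_size)
#else
struct section { size_t old_size; };
#define SECTION_SIZE(sec) ((sec)->old_size)
#endif

/* Callers go through the macro, so they build against either header. */
size_t section_size(const struct section *sec)
{
    assert(sec != NULL);
    return SECTION_SIZE(sec);
}
```

The point of the design is that only one spot in the code knows about the rename; everything else uses `SECTION_SIZE` and keeps compiling whether the bundled or the system header is picked up.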

In the package there is also an internal copy of libedit; I guess because it’s not often found in distributions, but we have it, and on Gentoo/FreeBSD it’s also part of the system, so…

Now, no doubt this hasn't yet led me to what the problem is, and it's quite likely that the problem is Gentoo-specific, since it seems to be working fine both on my Fedora and on other systems. But is it the right move for the Mono team to diss a (major, I'll have to say) developer of a distribution that isn't considering removing Mono from its repository?

Comments 5
  1. Flameeyes, you have to admit it, you have the power of attracting bad words to yourself :).

  2. For those interested, it seems like the bfd interface in mono-debugger ignores debuginfo files, which are used by splitdebug I’m using. I’ll elaborate on that tomorrow.

  3. I really do like reading your blog, always find something useful and interesting and unique and the neverending depth of how software (especially free software) works and is built never ceases to amaze me. Let me assure you that people like me are very thankful for the work you're doing.

  4. this DeIcaza/Mono saga is the perfect example of myths and urban-legends around Gentoo and its staff; it's also a perfect example of how binary distros never fix problems: they prefer to apply workarounds over workarounds in an endless loop. wonderful job Diego! now I understand why I was not able to debug banshee due to libthread_db not found!
