The debugged debugger — part 2

So after my last night’s post I finally found the problem.

Actually, my mixing in the new system libbfd sidetracked me for about an hour, because the same symptoms were caused by an API change that I didn’t maintain correctly; after that I was able to use both system and internal libbfd with the same exact results.

I started adding printing checkpoints within both in the C# Bfd wrapper and in the C glue code that called into libbfd; it’s not really an easy thing, because, well, libbfd is probably one of the most over-engineered libraries I have ever seen. It really provides a lot information for a lot of different executable and binary formats, but to do that it increases tremendously the complexity; indeed that’s one of the reasons why gold is much faster than standard ld and why I preferred to write my own Ruby-Elf rather than binding the Bfd interface and build up from that (which could have been more complete under a few circumstances).

At any rate, I was lucky to have enough knowledge about ELF files to identify the issue at the end, most people who wouldn’t have seen ELF would have given up along the way. At the end I cut down the chase to noticing that it was trying to load the symbol table (.symtab, which includes internal local symbols — symbols marked static and thus not exported), and found none. Since it wouldn’t be able to find any symbol you’d be surprised if it were to actually match the nptl_version variable I talked about yesterday.

Going down on that line, it turned out that, albeit Mono splits debug symbols in a different file (.mdb), mdb does not support the feature that allows to do that with ELF files: our splitdebug. I actually was wondering if that was the problem from the start, but then I ruled it out because Fedora also uses the same feature, and there mono-debugger starts fine. I now replaced “work fine” with “starts fine” as you’ll see in a moment.

So if mdb does not support split debug files, how on earth can it work on Fedora? Well, the symbol it’s trying (and failing) to identify here is nptl_version from a quick check on the laptop told me that Fedora does not strip .symtab from! I was actually afraid that Fedora weren’t stripping .symtab at all, but then I started using the /usr/bin/mono object as a reference, and there you cannot find the .symtab section at all: Fedora has a special case for libpthread.

Now, the quick solution would be of course to just not strip of its .symtab either, so that mdb could start properly; the problem with that solution is that you wouldn’t be able to get backtrace or anything else out of the unmanaged code because it wouldn’t be loading that at all. On distributions that use split debug (Gentoo if requested, Fedora, and I have no idea what else), mono-debugger would start, if has .symtab, but it won’t work with any object that has .symtab on the debug file; which is our case. So I’ll try to find time to actually fix it in mono-debugger; because it is a bug in mono-debugger, or maybe a missing feature, not a problem with “roll your own optimization flags” as Miguel wanted it to be.

Maybe this will convince them that maybe they should try to give credit to other distributions as well? Who knows, I hope so because I see that at least for what concerns building and packaging, mono-debugger has a huge space for improvement, and I’d like to help out with that, if they allow me.

Post scriptum: I was also able to make mono-debugger use the system libedit, the result is less spectacular than using system libbfd, but it’s still nice:

flame@yamato mono-debugger-2.4.2 % qsize mono-debugger
dev-util/mono-debugger-2.4.2: 25 files, 21 non-files, 2021.133 KB
flame@yamato mono-debugger-2.4.2 % qsize mono-debugger
dev-util/mono-debugger-2.4.2: 25 files, 21 non-files, 1561.300 KB

Now if only I could get it to work …