This Time Self-Hosted

The debugged debugger — part 2

So, after last night’s post, I finally found the problem.

Actually, mixing in the new system libbfd sidetracked me for about an hour, because the same symptoms were caused by an API change that I had not handled correctly; after that I was able to use both the system and the internal libbfd with exactly the same results.

I started adding print checkpoints both in the C# Bfd wrapper and in the C glue code that calls into libbfd. It’s not really an easy thing because, well, libbfd is probably one of the most over-engineered libraries I have ever seen. It really provides a lot of information for a lot of different executable and binary formats, but to do that it increases the complexity tremendously; indeed, that’s one of the reasons why gold is much faster than standard ld, and why I preferred to write my own Ruby-Elf rather than bind the Bfd interface and build up from that (which could have been more complete under a few circumstances).

At any rate, I was lucky to have enough knowledge about ELF files to identify the issue in the end; most people who have never looked inside an ELF file would have given up along the way. To cut to the chase, I noticed that it was trying to load the symbol table (.symtab, which includes internal local symbols, that is, symbols marked static and thus not exported) and found none. Since it couldn’t find any symbols at all, it would have been surprising if it had actually matched the nptl_version variable I talked about yesterday.

Following that lead, it turned out that, although Mono keeps its own debug symbols in a separate file (.mdb), mdb does not support the equivalent feature for ELF files: our splitdebug. I had actually wondered from the start whether that was the problem, but I ruled it out because Fedora also uses the same feature, and there mono-debugger starts fine. I have now replaced “works fine” with “starts fine”, as you’ll see in a moment.

So if mdb does not support split debug files, how on earth can it work on Fedora? Well, the symbol it’s trying (and failing) to identify here is nptl_version from libpthread, and a quick check on the laptop told me that Fedora does not strip .symtab from it! I was actually afraid that Fedora wasn’t stripping .symtab at all, but then I used the /usr/bin/mono object as a reference, and there you cannot find the .symtab section at all: Fedora has a special case for libpthread.
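A check like this one, whether an object still carries a .symtab section, can be approximated in a few lines. What follows is only a minimal sketch in Python, not mono-debugger's or libbfd's code: the function names elf_section_names and has_symtab are made up for illustration, and it reads the section header table straight from the ELF header with no validation beyond the magic number.

```python
import struct


def elf_section_names(data: bytes) -> list[str]:
    """Return the section names of an ELF image held in memory."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                      # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"     # EI_DATA: 1 = little, 2 = big
    if is_64:
        ehdr_fmt = endian + "HHIQQQIHHHHHH"
        shdr_fmt = endian + "IIQQQQIIQQ"
    else:
        ehdr_fmt = endian + "HHIIIIIHHHHHH"
        shdr_fmt = endian + "10I"

    # Fields after the 16-byte e_ident; we only need the section table info.
    ehdr = struct.unpack_from(ehdr_fmt, data, 16)
    shoff, shentsize, shnum, shstrndx = ehdr[5], ehdr[10], ehdr[11], ehdr[12]

    headers = [
        struct.unpack_from(shdr_fmt, data, shoff + i * shentsize)
        for i in range(shnum)
    ]
    if not headers:
        return []

    # The section-name string table is itself a section (index e_shstrndx);
    # fields 4 and 5 of a section header are sh_offset and sh_size.
    str_off, str_size = headers[shstrndx][4], headers[shstrndx][5]
    strtab = data[str_off:str_off + str_size]

    names = []
    for header in headers:
        start = header[0]                     # sh_name: offset into strtab
        end = strtab.index(b"\0", start)
        names.append(strtab[start:end].decode("ascii", "replace"))
    return names


def has_symtab(data: bytes) -> bool:
    """True if the image still has a full (non-dynamic) symbol table."""
    return ".symtab" in elf_section_names(data)
```

Reading a file with open(path, "rb").read() and passing the bytes to has_symtab would show whether that object still has its full symbol table; a stripped object typically keeps only .dynsym.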

Now, the quick solution would of course be to just not strip libpthread of its .symtab either, so that mdb could start properly; the problem with that solution is that you wouldn’t be able to get a backtrace or anything else out of the unmanaged code, because it wouldn’t be loading those symbols at all. On distributions that use split debug (Gentoo if requested, Fedora, and I have no idea what else), mono-debugger would start if libpthread has .symtab, but it won’t work with any object that has .symtab in the debug file, which is our case. So I’ll try to find time to actually fix it in mono-debugger, because it is a bug in mono-debugger, or maybe a missing feature, not a problem with “roll your own optimization flags” as Miguel wanted it to be.
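For reference, the lookup that a split-debug-aware debugger has to perform is small. The sketch below follows the convention documented for GDB's separate debug files (a .gnu_debuglink section holding a file name and a CRC32, then a fixed list of places to try); it is not mono-debugger code, and the function names parse_debuglink and debuglink_candidates, as well as the /usr/lib/debug default, are assumptions for illustration.

```python
import posixpath
import struct


def parse_debuglink(section: bytes, little_endian: bool = True) -> tuple[str, int]:
    """Split the raw contents of .gnu_debuglink into (file name, CRC32).

    Layout: NUL-terminated file name, zero padding up to the next 4-byte
    boundary, then a 4-byte CRC32 of the debug file.
    """
    name_end = section.index(b"\0")
    name = section[:name_end].decode("utf-8")
    crc_offset = (name_end + 1 + 3) & ~3      # round up to a multiple of 4
    fmt = "<I" if little_endian else ">I"
    (crc,) = struct.unpack_from(fmt, section, crc_offset)
    return name, crc


def debuglink_candidates(binary_path: str, link_name: str,
                         debug_root: str = "/usr/lib/debug") -> list[str]:
    """List the paths to try, in order, for a binary given by absolute path."""
    directory = posixpath.dirname(binary_path)
    return [
        # 1. next to the binary itself
        posixpath.join(directory, link_name),
        # 2. in a .debug subdirectory next to the binary
        posixpath.join(directory, ".debug", link_name),
        # 3. under the global debug root, mirroring the binary's directory
        posixpath.join(debug_root, directory.lstrip("/"), link_name),
    ]
```

A caller would open each candidate in turn and accept the first one whose CRC32 matches the value stored in the section; only then would it load .symtab and the debug sections from that file instead of from the stripped object.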

Maybe this will convince them that they should try to give credit to other distributions as well? Who knows. I hope so, because as far as building and packaging are concerned, mono-debugger has a lot of room for improvement, and I’d like to help out with that, if they allow me.

Post scriptum: I was also able to make mono-debugger use the system libedit; the result is less spectacular than using the system libbfd, but it’s still nice:

flame@yamato mono-debugger-2.4.2 % qsize mono-debugger
dev-util/mono-debugger-2.4.2: 25 files, 21 non-files, 2021.133 KB
flame@yamato mono-debugger-2.4.2 % qsize mono-debugger
dev-util/mono-debugger-2.4.2: 25 files, 21 non-files, 1561.300 KB

Now if only I could get it to work …

Comments 6
  1. Been a non-dev skim-reader for a while, but this post makes me want to say: This kind of effort to get the job done the right way makes me proud.

  2. mono has a “funny” build system anyway. I’m not talking about so called managed part, as I can’t really tell anything about that, cause I don’t really know the language. It’s both autotools part and some of the C code that bothers me. In Gentoo, one thing that anybody on x86 probably noticed is the executable stack QA. It does not appear on amd64, cause it has the proper fix – ‘progbits’ bit in an assembly file. Same would work for the file used on x86, but when I asked a question about it, the answer was IMHO bit off-topic and I think incorrect. Also, there’s a bit in their, that looks bit strange and won’t work at all shall they move to libtool 2 (and I’m not talking about the thing, that could be “fixed” by LT_OUTPUT), not that they will, seeing as they adopted that dolt thingy. They seem also unwilling to try to see if something can be done about their current need for ’-fno-strict-aliasing’ – as far as I tried, outlook on fixing that looks pretty good, though I did stumble on a few code blocks, somebody knowing mono internals a bit better could have done it.

  3. Have I already said that they should clone you? 🙂 I think so. In any case, you should come and teach at the Politecnico di Milano; I would be very grateful if the taxes I pay went to fund things like a course of yours ;-) Ever thought about going into academia??

  4. Nicola, unfortunately I have a bit of a conflict with the academic world... I didn’t even manage to sit a single exam here in Venice; I ran away looong before that.
