Remote debugging with GDB; part 2: GDB

When I first wrote part 1 of this tutorial I was working on a project for a customer of mine; time passed without me writing the second part, and it stayed that way for over a year. Talk about long-term commitment. On the other hand, since this is going to be used almost literally as documentation for the same customer (a different project, though), I’m resuming the work by writing this.

On the bright side, at least one of the Gentoo changes I described as desirable in the previous post is now in place: we have a dev-util/gdbserver ebuild in the tree (maintained by me, written because of the current project, so it can be said to be contributed by my current customer). On the device side, you only need to install that package to be ready for debugging.

On the build-machine side, you need the cross-debugger for the target architecture and system; you can install it with crossdev just like the cross-linker and cross-compiler, but with a twist: GDB 7.0 and later need to be built with expat support (it seems they now use XML in the protocol spoken between the client/master and the server/slave). So, for instance, you run this command: USE=expat crossdev -t $CHOST --ex-gdb.
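To make that concrete, a sketch of the two steps follows (the CHOST value here is just an example tuple, not a recommendation):

```console
% USE=expat crossdev -t armv5tel-softfloat-linux-gnueabi --ex-gdb
% armv5tel-softfloat-linux-gnueabi-gdb --version
```

If the second command prints a GDB 7.x banner, the cross-debugger is ready to use.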

If you’re not using Gentoo on either side, then you’re going to have to deal with both builds by hand… tough luck. On the other hand, the gdbserver executable is very small: in the latest version it’s just a bit over 100KiB, so building it takes very little time and it takes up very little space even on embedded Linux systems.

On the remote system, there are three modes of operation: starting a single program, attaching to a running process, and a multi-execution daemon; the latter is my favourite mode of operation, as you just start gdbserver like a standard Unix daemon, waiting for incoming requests and handling them one by one for as long as it’s needed (yes, I guess it would make sense to have an init script for gdbserver when it’s used on not-too-embedded systems). In all three cases, it needs a communication medium to work; on modern, less embedded, systems that medium is almost always a TCP/IP connection (just give it an ip:port address), but the software supports using serial ports for the task as well.

The gdbserver executable on my system is just 72KiB once built and stripped, so it’s very quick to upload to the target, even if it’s an embedded Linux system.

To start the daemon-like instance, just use the --multi option, giving it the address to listen on (here, TCP port 2345):

arm # gdbserver --multi :2345

Now you can connect to the server through the cross-debugger built earlier (replace the address with your target’s own):

% arm-linux-gnu-gdb
(gdb) target extended-remote 192.168.0.10:2345

This drops us inside the server, or rather inside the target board, ready to debug. So let’s build and upload this crashy program:

% cat crash.c
#include <string.h>

int main() {
        char *str = NULL;
        return strcmp(str, "foo");
}
% arm-linux-gnu-gcc -ggdb crash.c -o crash
% arm-linux-gnu-strip crash -o crash_s
% wput crash_s ...

Here you can note one very important point: we’re building the program with the -ggdb option to produce debug information, so that GDB can tell which variable is which and which address corresponds to which function; this is very important to get meaningful backtraces, and is even more important if you plan on using further features including, for instance, break- and watch-points. You could technically even use -g3 to embed further details, such as macro definitions, into the debug data, which is particularly useful if you plan on having multiple firmwares around, but that doesn’t always work correctly, so let’s leave it at that for now.

But even though we need the debug information in the compiled executable object, we don’t need the (bigger) binary to be present on our target system; the reason is that the debug information consists only of extra sections, and does not add or remove any code or data from the parts that execute at runtime. This is why -ggdb does not make your system slower, and why stripping an executable of its debug information is enough to make it slimmer. At the same time, the extra sections added by -ggdb and removed by strip are never mapped from disk into memory when the binary is executed, so there is no extra memory usage caused by having non-stripped binaries. The only difference is in the size of the file on disk, which might still matter if you’re copying or parsing it for some reason, and a bigger file is more likely to end up fragmented into multiple pieces.
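A quick way to see this for yourself, assuming the crash binary built above, is to ask readelf: the .debug_* sections appear in the section list, but no LOAD segment covers them, so the loader never maps them:

```console
% readelf -S crash | grep '\.debug'
% readelf -l crash
```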

Anyway, since the debug information, as just stated, is neither needed nor used at runtime, there is no reason why it has to stay on the target: the gdbserver program just executes whatever the master GDB instance asks it to, and has no need for a clue about the executable file or its debugging information. So you can just make a stripped copy of the file and upload that to the target; you can then run it normally, or through gdbserver.

After uploading the file you need to set up GDB to run the correct executable, and to load the symbols from the local copy of it:

(gdb) symbol-file crash
(gdb) set remote exec-file /uploaded/to/crash_s

and you’re completely set. You can now use the standard GDB commands (break, run, handle, x, kill, …) as if you were running the debugger on the target itself! The gdbserver program will carry out the actions, and the master GDB instance will direct it as per your instructions.
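Putting it all together, a complete session sketch looks like this (the address, port and paths are examples, matching the earlier steps):

```console
arm # gdbserver --multi :2345

% arm-linux-gnu-gdb
(gdb) target extended-remote 192.168.0.10:2345
(gdb) symbol-file crash
(gdb) set remote exec-file /uploaded/to/crash_s
(gdb) break main
(gdb) run
```

From here on, stepping, inspecting variables and reading backtraces work exactly as in a local session.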

Have fun remote debugging your code!

Splitting packages: when to bump?

One of the reasons why I think that splitting packages carries much more overhead than it’s worth is version and revision bumping, and in particular deciding whether it has to happen or not. This is not limited to patching; most of the time it affects new version releases too.

The problem with patching occurs when you have to apply a patch that spans different split packages: if it changes the internal interface between components, you now have to bump the requirements of the internal packages so that they require the new version; if there are circular dependencies in there, you’re definitely messed up. It also requires you to split the patch into multiple parts (and sometimes apply the same parts on both sides).

The problem with bumping is somewhat simpler: when you have a package that is shipped monolithic, but is quite separate logically, it’s not rare for upstream to release a new version that only fixes some of the internal “sub-packages”; this is what happens most of the time with alsa-tools and alsa-plugins (the former even more so, as each tool has its own configure script instead of a single big one). In these cases, the split packages might not have to be bumped at all when the main one is. And this is quite a burden for the (Gentoo) packagers: you cannot just rely on >=${PV} dependencies (as they might not always be satisfiable), and you shouldn’t bump unnecessarily (users don’t like to rebuild the same code over and over again).

In particular, if you end up having the same version bumps for both even when the code hasn’t changed (or you still have to rebuild everything at every revision bump), then you are just making it more complex than it should be: regroup the packages into a single ebuild! That is, unless upstream decides to break up the code themselves, like dbus did. If your problem is providing proper dependencies (as seems to have happened with poppler), then it is solved by the (now-available) USE dependencies, rather than by splitting into hair-thin packages and increasing the number of “virtual” ebuilds. The same applies to gtk-sharp (and yes, I know both were done by Peter, and yes, he knows I don’t like that solution).
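For reference, this is roughly what an EAPI-2 USE dependency looks like (the atom and flag here are illustrative, not the actual poppler solution):

```shell
# hypothetical ebuild fragment: require poppler built with Qt4 support,
# instead of depending on a split poppler-qt4 package
EAPI=2
RDEPEND=">=app-text/poppler-0.12[qt4]"
```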

Right now, I maintain one split package (by the standard notion of split), and one that comes very close for our purposes: gdbserver and perf. The former is an almost-standalone sub-package of gdb itself (the GNU debugger), which ships in the same tarball; the latter is part of the Linux kernel source tarball (and patches), but is not tied to the rest of the source either.

In the first case, deciding whether to bump or not is quite simple: you extract the gdbserver sources from the old and new tarball, then diff them together. If there is no change, you ignore the bump (which is what I have done with the recent 7.0.1 release). It’s a bit boring but since the gdb sources aren’t excessively huge it’s not too bad.
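The check itself can be sketched like this, extracting just the gdbserver subdirectory from each tarball (version numbers per the 7.0/7.0.1 example above):

```console
% tar xf gdb-7.0.tar.bz2 gdb-7.0/gdb/gdbserver
% tar xf gdb-7.0.1.tar.bz2 gdb-7.0.1/gdb/gdbserver
% diff -ru gdb-7.0/gdb/gdbserver gdb-7.0.1/gdb/gdbserver
```

No output from diff means no reason to bump.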

It’s a different matter for perf: since it’s shipped within the kernel itself, any new release candidate of the kernel is a possible release candidate for perf as well! Luckily I can usually just rely on the release changelog and grep for the “perf:” prefix in the git log. It might not be the best choice, as it’s error-prone, but unpacking (and patching, in the case of release candidates) the Linux sources needed for perf is a non-trivial task by itself, and takes much more time than it would be worth.
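A sketch of that check against a kernel git checkout (the tags are just an example pair):

```console
% git log --oneline v2.6.32..v2.6.33-rc1 -- tools/perf | head
```

If nothing comes out, the perf sources were not touched and the ebuild does not need a bump.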

The frustration of debugging

I’m currently working on a replacement for bufferpool in lscube; up to now, the bufferpool library has provided two (actually quite different) pieces of functionality to the lscube suite of software: an IPC method using shared memory buffers, and a producer/consumer system for live or multicast streaming. Unfortunately, because of the way it was developed in the first place, it is really fragile, and it is probably one of the least understood parts of the code.

After looking into improving it, I decided that it’s easier to discard and replace it altogether. Like many other parts, it has been written by many people at different times, and it shows. I think the one detail that makes it clear it was written with too little knowledge of programming internals is that the actual payload area of the buffers is not only fixed-size, but of a size (2000 bytes) that does not fall on the usual alignment boundaries for data copies. Additionally, while the name sounded quite generic, the implementation certainly wasn’t, since it kept some RTP-related details directly in the transparent object structures. Not good at all.

At any rate, I decided it was time to replace the thing, and started looking into it and designing a new interface. My idea was to build something generic enough to be reusable in other places, even in completely different software, but at the same time I didn’t feel like going all the way down the GObject road, since that was way too much for what we needed. I started thinking about a design with one, then many, asynchronous queues, and decided to try that road. But since I like to think of myself as a decent developer, I started writing the documentation before the code. Writing the documentation down actually showed me that my approach would not have worked well at all; after a few iterations over just the documentation of how the thing was supposed to work, I got to a setup that looked promising, and started implementing it.

Unfortunately, right after implementing it and replacing the old code with the new one, the thing wasn’t working; I’m still not sure why, but I’ll come back to that in a moment. One other thing I would like to say, though, is that after writing the code, and suspecting it might have been something I overlooked in the implementation, I simply had to look again at the documentation I wrote, as well as at the “todo” markup inside the source code (thanks Doxygen!), to implement what I didn’t implement the first time around (but had decided was needed beforehand). So as a suggestion to everybody: keep documenting as you write code; it’s a good practice.

But, right now, I’m still unsure of what the problem is; it would be quite easy to find it if I could watch the code as it executes, but it seems the GNU debugger (gdb) is not willing to collaborate today. I start feng inside it, set a breakpoint on the consumer hook-up, and launch it, but when it actually stops, info threads shows me nothing, although at that point there are, for sure, at least three threads: the producer (the file parser), the consumer (the RTSP session) and the main loop. The funny thing is that the problem is certainly in my system, because the same exact source code works fine for Luca. I’ll have to use his system to debug, or set up another system purely for debugging.

This is the second big problem with gdb today; the first happened when I wanted to try gwibber (as provided by Jürgen). Somehow it makes gnome-keyring-daemon crash, but if I try to hook gdb to that and break on the abort() call (the problem is likely a failing assertion), it’s gdb itself that crashes, preventing me from reading a backtrace.

I have to say, I’m not really happy with the debugging facilities on my system today, not at all. I could try valgrind, but the last time I used it, it failed because my version of glib uses some SSE4 instruction it didn’t know about (for that reason I use a 9999 version of valgrind, and yet that doesn’t usually work either). I’m afraid in the end I’ll have to resort to adding debug output directly to the bufferqueue code and hope to catch what the problem is.

Current mood: frustrated.

Remote debugging with GDB; part 1: Gentoo

In the last couple of days I had to focus my attention on just one issue, a work issue which I’m not going to discuss here since it wouldn’t make much sense either way. But this brought me to think about one particular point: remote debugging, and how to make it easier with Gentoo.

Remote debugging is a method commonly used when you have to debug issues on small, embedded systems, where running a full-blown debugger is out of the question. With GDB, it consists of running a tiny gdbserver process on the target system, which can be controlled by a master gdb instance over either a serial port or a TCP connection. The advantages don’t stop at not needing to run the full-blown debugger: the target system can also be equipped with stripped programs carrying no debug information, keeping the debug information entirely on the system where the master gdb instance is executed.
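In its simplest form the split looks like this (the address and port are examples):

```console
arm # gdbserver :2345 ./program       # on the target: the tiny stub
% arm-linux-gnu-gdb ./program         # on the build machine: full debugger and symbols
(gdb) target remote 192.168.0.10:2345
```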

This is all fine and dandy for embedded systems, but it does not stop there: you can easily use the same technique to debug issues on remote servers, where you might not want to upload all the debug symbols for the software causing the trouble; it then becomes a nice Swiss Army knife for system administrators.

There are a few issues you have to deal with when you use gdbserver, though, some of which I think should be improved in Gentoo itself. First of all, to debug a remote system running a different architecture than yours, you need a cross-debugger; luckily crossdev can build that for you. Then you need the actual cross-built gdbserver. Unfortunately, even though the server is small and self-sufficient, it is currently only built by sys-devel/gdb, which is not so nice for embedded systems; we’d need a minimal or server-only USE flag for that package, or even better a standalone dev-util/gdbserver package, so that it could be cross-built and installed without building the full-blown debugger, which is of no use there.

Then there is the problem of debug information. In Gentoo we already provide some useful support for that through the splitdebug feature, which takes all the debug information from an ELF file, executable or library, and splits it out into a .debug file that contains only the information useful during debugging. This split does not really help much on a desktop system, since the debug information wouldn’t be loaded anyway; my reasoning for keeping it separate is to make sure I can drop it all at once if I’m very short on space, without breaking my system. On the other hand, it is very useful to have around for embedded systems, for instance, although it currently is a bit clumsy to use.

Right now one common way to archive debug information properly while stripping it in production is to use the buildpkg feature together with splitdebug, and to set up an INSTALL_MASK variable for /usr/lib/debug when building the root. The alternative is to simply remove that directory before preparing the tarball of the rootfs, or something like that. This works decently, since the binary packages will have the full debug information, and you’d just need to reinstall the package you need debug information for without the INSTALL_MASK. Unfortunately this will end up replacing the files from the rest of the package too, which is not so nice because it might change the timestamps on the filesystem, as well as wasting time, and eventually flash write cycles too.
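A sketch of that configuration for the build chroot (standard Portage file locations assumed):

```shell
# /etc/portage/make.conf in the build root
# keep full debug information in the binary packages...
FEATURES="splitdebug buildpkg"
# ...but leave it out of the installed root filesystem
INSTALL_MASK="/usr/lib/debug"
```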

This also does not play entirely nicely with remote server administration. The server where this blog is hosted is a Gentoo vserver guest; it was installed starting from a standard amd64 stage, then I built a local chroot starting from the same stage, setting it up exactly as I wanted it to be; finally, I synced over the Portage configuration files, the tree, and the binary packages built from it all, and installed them. The remote copy of the package archive is bigger than the set of packages actually used, since it also contains the packages that are just build dependencies, but that overhead I can ignore without much trouble. On the other hand, if I were to package in all the debug information, and just avoid installing it with INSTALL_MASK, the overhead wouldn’t be so easy to ignore.

My ideal solution for this would involve making Portage more aware of the splitdebug feature, and actually splitting the debug information out at the package level too, similarly to what RPM-based distributions do with -debuginfo packages. By creating a -debug or -dbg binpkg alongside each package that would otherwise have /usr/lib/debug in it, and giving the user an option on whether to install the sub-package, it would be possible to decide whether to merge the debug information onto the root filesystem or not, without using INSTALL_MASK. Additionally, having a common suffix for these packages would allow me to simply skip them when syncing to the remote server, removing the overhead.

Dreaming a bit more, it could be possible to design automatic generation of multiple sub-packages, resembling a bit what binary distributions like Debian and Red Hat have been doing all these years: splitting documentation into a -doc package, headers and static libraries into -dev, and so on. Then it would just require giving the user the ability to choose which sub-packages to install by default, plus a per-package override. A normal desktop Gentoo system would probably want everything installed by default, but if you’re deploying Gentoo-based systems, you’d probably just have a chroot on a build machine that does the work, and then the systems would get only the subset needed (with or without documentation). Maybe it’s not going to be easy to implement, and I’m sure it’s going to be controversial, but I think it might be worth looking into. Implementing it in a non-disruptive way (with respect to the average user and developer workflow) is probably what would make it feasible.

Tomorrow, hopefully, I’ll be writing some more distribution-agnostic instructions on how to remotely debug applications using GDB.

And what about imported libraries?

Following up on the previous post, here is a list of projects that seem to like importing libraries, causing code duplication even for code that was designed to be shared.

  • cdrkit, again, contains a stripped down version of libdvdread, added, of course, by our beloved Jörg Schilling; bug #206939; additionally it contains a copy of cdparanoia code; bug #207029

  • ImageMagick comes with a copy of libltdl; bug #206937

  • not even KDE4 seems to have helped libkcal, which even in its newest incarnation ships with an internal copy of libical, causing me to have three copies of it installed on my system;

  • libvncserver comes with a copy of liblzo2; actually there are two, one in libvncserver and one in libvncclient; even the source files are duplicated!; bug #206941

  • SDL_sound, Wine and LAME seem to share some mp3 decoding code, which seems to come originally from mpg123;

  • cmake couldn’t stay out of this, it comes with a copy of libform (which is part of ncurses); follow bug #206920

  • I’m not sure what it is, but DigiKam, Numeric (for Python) and numpy have a few functions in common; the latter pair seem to have even more than that in common; bug #206931 for Numeric and numpy, and bug #206934 for DigiKam.

  • ghostscript comes with internal copies of zlib, libpng, jpeg and jasper; unfortunately jasper is also modified, for the other three there’s bug #206893; by the way, the copies are present in both the gs command and in the libgs library;

  • OpenOffice comes with loads of duplicated libraries; in particular, it comes with its own copy of the icu libraries; see bug #206889

  • TiMidity++ comes with a copy of libmikmod; bug #206943

  • Korundum for KDE3 has a copy of qtruby embedded, somehow; I wonder if it isn’t a fluke of our buildsystem; bug #206936

  • gdb contains an internal copy of readline; bug #206947

  • tork contains a copy of some functions coming from readline; bug #206953

  • KTorrent contains a copy of GeoIP (and to think I removed the one in TorK as soon as I spotted it); bug #206957

  • both ruby and php use an internal copy of – I think – oniguruma; I haven’t looked if it’s possible to add that as a system library and then use it; bug #206963

  • MPlayer seems to carry a copy of libogg together with tremor support; bug #206965

  • pkg-config ships with an internal copy of glib; bug #206966

  • tor has an internal copy of libevent’s async dns support; funny, as it links to libevent; bug #206969

  • gettext fails to find the system copy of libxml2, falling back to use the internal copy; at least it has the decency of using a proper commodity library; bug #207018

  • both Perl and Ruby have a default extension based on SDBM, a NDBM workalike; there seems not to be a shared version of it, so they just build the single source file in their own extensions directly, without hiding the symbols; beside the code re-use not being available, if a process loads both libperl and libruby, and in turn they load their sdbm extension, stuff’s gonna hurt;

  • enchant has an internal copy of Hunspell; probably due to the fact that old Hunspell built only static non-PIC libraries, and enchant uses plugins; bug #207025; upstream fixed this in their Subversion repository already;

  • gnome-vfs contains an internal copy of neon; funny as it depends on neon already, in the ebuild; bug #207031

  • KOffice’s Karbon contains an internal copy of gdk-pixbuf; bug #209561;

  • kdegraphics’s KViewShell contains an internal copy of djvulibre; bug #209565;

  • doxygen contains internal copies of zlib and libpng; bug #210237; this time I used a different method to identify it, as doxygen does not export the symbols;

  • rsync contains an internal copy of zlib; bug #210244;

Unfortunately, making sure that what I’m reading in my script’s output is real data and not a false positive has become more difficult, because of the presence of multiple Sun JDK versions; I have to add support for alternatives, so that different libraries implementing the same interface don’t show up as colliding (they are that way by design).

Dumping down

So, first of all, thanks to 8an, who commented on my previous post suggesting that I limit the amount of memory so as to get a dump small enough to fit in my current swap partition without needing to buy an extra CF card. This solution hadn’t crossed my mind before, so I used it to start testing 🙂

Now, as Roy seems to have taken Daniel’s (dsd) place as drunken brit, I’ve looked up how to hook the savecore(8) and dumpon(8) admin utilities into localmount, and prepared a bug report that contains the patch and a configuration file. I tested it out and it works nicely; the good part is that you can easily enable dumps on a per-boot basis, rather than having them always enabled, by using dumpon directly; it will still save your core file if one is found.

So for the baselayout part I’m gold; the problem does not arise at all. Getting kgdb to work is more difficult. As I supposed, there’s no way on earth that Mike will install libgdb for me (and upstream doesn’t seem to like that approach anyway, if I read the documentation correctly), so one option would be to build a package that takes the FreeBSD sources and GDB itself, and builds a whole copy of GDB just to produce kgdb(1), which is not practical, not to mention difficult to maintain over a long period (for instance, which versioning scheme should I use? GDB’s? FreeBSD’s?).

The only other option I’m left with is to fork kgdb, and try to make it a frontend to gdb itself: not through library calls, but by driving gdb’s command-line interface from the outside.

It might well work, although I’ll have to talk with someone from FreeBSD, as I doubt I’d be able to keep it in sync alone. I see that obrien committed to at least two files in the 6.2_rc2 release, and he’s a nice guy, so I might have some hope for that 🙂

So I have to add “Make kgdb work as GDB master” to my TODO list, although I hope to find the cause of the misalignment before that time.

Debugging debuggable

Now that Prakesh was able to complete the build of the three stages for Gentoo/FreeBSD 6.2_rc2, and they are available on the mirrors, I have a few things to take care of in Gentoo/FreeBSD that I have overlooked for too long.

The first is certainly updating the documentation, so that new users can install the 6.2 stages fine, without all the workarounds we used to need for 6.1 (because it wasn’t built with catalyst); once that’s done, I have to deprecate 6.1 in favour of 6.2, as that version is pretty much where we’re focusing right now, with the libedit fixes and the new baselayout 1.13 (which Roy made work perfectly on FreeBSD!); and then there’s the module loading problem on SPARC64 to fix.

So, let’s start with the first step: I’ve asked jforman to remove the 6.1 stage from the mirrors, so that there won’t be new installations of it. Later on I’ll see about writing a deprecated file for the 6.1 profile, with some short instructions to upgrade to 6.2 somewhat smoothly.

As for SPARC64, Klothos is currently helping me understand the issue. My first task was to get an editor I could actually use onto it, which meant, for me, emacs. Not counting the issue with gcc’s CSU object files being in a different place than on standard FreeBSD (which I had already worked around with the ebuild in the transition overlay), there was a nasty SIGILL while building some elisp code, and I had never got around to debugging it. After all it was easier than I expected: the problem was caused by an inline asm() statement issuing the instruction ta 3, which after a bit of googling turned out to be a trap call (kind of like software interrupts on x86) that asks the kernel to flush the register windows, and which is not implemented on FreeBSD (GNUstep, for instance, disables this on FreeBSD too). An easy patch to make the call conditional solved the issue for me.

So I first wanted to confirm one thing: whether the problem lay in building the modules or in building the kernel. If the problem were the kernel, even trying to load a module compiled by vanilla FreeBSD should cause the same panic, while if the problem were in the building of the modules, such a module would load without issues. I checked, and the problem happens only with our modules, even when they’re loaded into an official kernel, which means it’s safe to assume that the problem is in building the modules rather than the kernel. That is both good and bad: even if it limits my scope and my need to debug the kernel, it’s not like I have so much knowledge of ELF loading that I can find the issue easily. I was tempted to buy Sun’s “Linker and Libraries Guide”, but not only is the book far from cheap ($49 at least), it isn’t even listed as available on Amazon (UK).

Anyway, a quick comparison of the zlib.ko module from FreeBSD proper and from Gentoo/FreeBSD showed me that ours is about twice the size of the original (though I think that might be caused by the -ggdb3 build), and that there are more SPARC64_RELATIVE relocations, while there are no R_SPARC_32 relocations at all in our copy.

I was looking forward to a more thorough debugging session tonight, but I was stopped by two problems that are going to make my life in the next weeks harder than I expected. The first is that we don’t currently build the kernel debugger (kgdb), and we cannot easily build it (because it requires libgdb, which we currently don’t install… and I doubt I will be able to convince vapier to install it).

The second is that to get a core dump of the crash, we need to use the kernel’s dump facilities, which require a swap partition of at least the size of the machine’s RAM (and I don’t have one on Klothos, as it was originally built with only 128MB of memory, while it now has 1GB), plus the run of some commands during the boot phase: specifically savecore between the R/W mount of the partitions (to save the dump) and the enabling of the swap space (because that would destroy the dump), and dumpon after the swap is enabled. Given the way baselayout works now, I would need to change the localmount init script, but as I don’t like that solution, I’ll have to talk about this with Roy; the important thing to me is being able to enable/disable dumps through conf.d files (similarly to what’s done in FreeBSD); I suppose a solution could be to use some addons and install them with one of the freebsd ebuilds, or with baselayout proper, depending on what Roy prefers.
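The conf.d interface I have in mind would look something like this (file and variable names are hypothetical, mirroring FreeBSD’s rc.conf knobs):

```shell
# /etc/conf.d/dumps (hypothetical)
# device used for both swap and crash dumps
dumpdev="/dev/ad0s1b"
# directory where savecore(8) stores a recovered dump
dumpdir="/var/crash"
```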

Now, it’s not like the baselayout issue isn’t easily solvable once Roy is around (he’s partying for the new year now, I suppose); but the swap size is what is going to stop me from using this feature. My only solution would be to add another compact flash card (the adapter I’m using can already connect two cards, one master and the other slave, which is quite good for what I paid for it), but it has to be at least 2GB (the RAM is only 1GiB, of course, but I don’t want to start crying when I get hit by the GiB > GB difference, as I’m not sure whether CF cards are sold by the decimal GB or the binary one). I once again compared the prices here with Germany’s, and it seems I would pay 34+20 euros from there, or 89 here. I don’t think I’ll go buying one just yet; it’s not a big deal to buy, but I want to make some more attempts without spending more money on that box, considering that I already loaded it with new (or newish, for the SATA controller and disk) stuff that cost me at least €100, box included, and all of it just to debug a kernel problem…

One of the things I found difficult to grasp about SPARC asm, anyway, besides not finding a decent reference manual for it (call me crazy, but I usually understand a language better by looking at its reference than at explanations and tutorials), is that load and store instructions are written in “origin, destination” order rather than the “destination, origin” order I was used to on x86. It’s not that difficult to get used to after all: most of the instructions are named after logical operations, and the ld/ldx and st/stx mnemonics also make it easier to tell when the register is the destination and when it is the origin. It would have been nice to learn SPARC assembler at school rather than 8086.