GNU GRUB 2 Debugging with GDB HOWTO

Ľubomír Rintel <lkundrak@v3.sk>

Self-standing programs such as operating system kernels or boot loaders tend to be harder to debug than user-mode applications, because the environment they run in is not very friendly to debuggers. One way to debug such programs is remote debugging over Ethernet or a serial line, which requires the debuggee to cooperate with the debugger. Another is to use machine emulator software, such as QEMU or Bochs.

This document covers both debugging over a serial line and debugging using an emulator. I know of no non-i386 emulator capable of running GRUB 2, and the remote stub exists only for i386, so I'll assume the i386 port of GRUB 2.

What will we need

Preparing the source

For debugging purposes we need the build system to leave the modules with a full symbol table and debugging information. We also have to define some symbols that GDB expects to be defined, such as main.

So unpack the GRUB 2 source code, grab the grub2-gdb.diff patch and apply it.

grub2$ patch -p1 <grub2-gdb.diff
grub2$

Building with debugging information

The default C compiler flags GRUB 2 is built with are -g -O2. This is generally enough for us. In some cases, the -O2 optimization may cause the code to behave differently from what you expect. If this happens, override CFLAGS, then reconfigure and rebuild your source.

grub2$ ./configure CFLAGS=-g
...
grub2$ gmake
...
grub2$

After a successful build, you should have a kernel.exec and a number of *.elf files. The kernel.exec file is the GRUB 2 kernel, that is, among other things, the module loader and linker. Each *.elf file is a module with a full symbol table and debugging information. If you don't see any *.elf files, your Ruby interpreter probably wasn't detected, and you should look in config.log for more information.
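
The check for the expected build outputs can be scripted. The following is only a sketch under my own assumptions: check_grub_build is a hypothetical helper name, not part of the GRUB 2 build system, and it merely looks for kernel.exec and at least one *.elf file in the given directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of GRUB): verify that a build tree
# contains kernel.exec and at least one *.elf module with symbols.
check_grub_build() {
    dir=${1:-.}
    if [ ! -f "$dir/kernel.exec" ]; then
        echo "missing kernel.exec in $dir"
        return 1
    fi
    # Count *.elf files; the glob stays literal when nothing matches,
    # so each candidate is tested with -f before it is counted.
    count=0
    for f in "$dir"/*.elf; do
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done
    if [ "$count" -eq 0 ]; then
        echo "no *.elf modules in $dir; check config.log"
        return 1
    fi
    echo "ok: kernel.exec and $count module(s) with symbols"
}
```

Run it as check_grub_build . from the top of the source tree after gmake finishes.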

Debugging using emulator

Preparing the environment

Now you should create a bootable GRUB 2 image for use with QEMU. There are many different ways to do it. Below I describe how I do it.

Provided you don't want to debug the module loading code, but just one particular module, include that module in the core image. (Otherwise you'll have to create a floppy image with all the *.mod files.) So let's say we're going to debug the hello module.

grub2$ grub-mkimage -d. -o /tftproot/core.img hello
grub2$

I don't want to create the GRUB 2 boot floppy image each time I recompile GRUB 2, so I bootstrap the core image with GRUB Legacy, loading it over TFTP. For this, you need GRUB Legacy with Realtek 8029 NIC support (this is a PCI ne2000 clone, emulated by both QEMU and Bochs). Alternatively, you might use the grub-0.97-net.floppy.gz image.

Legal note: To comply with paragraph 3b of the GNU General Public License terms, here's the source code of GRUB Legacy that was used to generate the floppy: grub-0.97.tar.gz. You can also freely download it from any of the GNU mirrors.

Preparing and running GRUB 2 in QEMU

If QEMU is the emulator of your choice, you are ready to run it now. The important thing is to pass it the -s option, so that it acts as a remote GDB server. If you followed the directions above, you will also want QEMU to enable its built-in TFTP server. In that case, add the -tftp /tftproot option.

$ qemu -boot a -fda grub-0.97-net.floppy -tftp /tftproot -s

If you boot GRUB 2 directly (i.e. without GRUB Legacy) and you want to debug some initialization code, you might want to stop the emulated CPU until you connect the debugger. In that case, use the -S option.
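
If you switch between the two ways of starting QEMU often, a small wrapper can save typing. This is a minimal sketch, not something shipped with QEMU or GRUB: qemu_debug_cmd is a hypothetical helper that only prints the command line, using exactly the options from this HOWTO. Newer QEMU releases may name the binary and the TFTP option differently, so adjust it to your installation.

```shell
#!/bin/sh
# Hypothetical helper: print the QEMU command line described above.
# Pass "halt" as the first argument to add -S, which keeps the
# emulated CPU stopped until the debugger connects.
qemu_debug_cmd() {
    cmd="qemu -boot a -fda grub-0.97-net.floppy -tftp /tftproot -s"
    if [ "${1:-}" = "halt" ]; then
        cmd="$cmd -S"
    fi
    echo "$cmd"
}

qemu_debug_cmd        # normal start, GDB server enabled
qemu_debug_cmd halt   # same, but wait for the debugger
```

Printing the command instead of running it lets you review it first; once it looks right, run the printed line from your shell.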

The second alternative: Bochs

Bochs is another excellent i386 emulator. It is somewhat slower than QEMU because it doesn't use dynamic translation, but it is still great for debugging purposes. It contains both a built-in debugger and a remote GDB debugging stub. As the title of this document says, we'll cover just the GDB part.

First of all, ensure that your Bochs was built with the remote GDB stub. When you build it from source, pass at least the --enable-gdb-stub argument to its configure script. Network card emulation might also be needed, in case you decide to load the GRUB 2 kernel over TFTP. The right arguments for that are --enable-ne2000, --enable-pci and --enable-pnic.

The next thing you need to do is to create Bochs' configuration file, .bochsrc, or use this one:

# Memory
megs: 8
# Floppy with GRUB Legacy
floppya: 1_44=grub-0.97-net.floppy, status=inserted
# Attach ne2000 to PCI bus, so that GRUB Legacy detects it automatically
i440fxsupport: enabled=1, slot1=ne2k
# Enable the ne2000 NIC and the builtin TFTP server
ne2k: ioaddr=0x240, irq=9, mac=b0:c4:20:00:00:01, ethmod=vnet, ethdev="/"
# Enable the GDB remote stub
gdbstub: enabled=1

Now you can launch the emulator:

$ bochs -q
...
Waiting for gdb connection on localhost:1234

Screenshot of Bochs and GDB

Great. The emulated CPU is stopped and Bochs is listening for GDB connections, so you can attach the debugger. Please note that Bochs no longer listens once you disconnect GDB. This means that you have to restart Bochs when you want to reattach, but you won't do this often anyway.

Debugging over serial line

Motivation

There are many reasons why you might wish to debug GRUB 2 running on real hardware. You might want to debug routines that use hardware emulated by neither QEMU nor Bochs, say a driver for a non-ne2000 network adapter. In cases like this, you need to use remote GDB connected over a serial line to a testbed machine running GRUB 2 with the GDB stub. I also recommend connecting both machines with Ethernet, so that you can easily transfer the GRUB 2 kernel from the development machine to the testbed machine over TFTP.

Debugging environment setup

Buy, borrow, steal or make yourself a DTE-DTE serial cable (frequently referred to as a null-modem cable). If you decide to make it, search Google for a wire map or use the one below this paragraph. Connect the RS-232 interfaces of the development machine and the testbed machine together.

Null modem wire map

If you want to use TFTP to transfer the GRUB 2 kernel, you need to enable a TFTP server on the development machine and compile GRUB Legacy with support for the Ethernet adapter of your testbed machine. If your operating system uses BSD-style network daemons, add (or uncomment) something like the following in /etc/inetd.conf:

tftp dgram udp wait root /usr/libexec/tftpd tftpd /tftproot

Reload the inetd configuration afterwards:

# pkill -HUP inetd
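
Before reloading inetd, it can help to confirm that the tftp line really is active and not still commented out. The sketch below assumes the usual inetd.conf layout, where the service name is the first field on the line. tftp_enabled is a hypothetical helper name of my own.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given inetd.conf contains an
# active (uncommented) tftp service line. Defaults to /etc/inetd.conf.
tftp_enabled() {
    conf=${1:-/etc/inetd.conf}
    # Match "tftp" as the first field; lines starting with "#" do not match.
    grep -q '^[[:space:]]*tftp[[:space:]]' "$conf"
}
```

For example, tftp_enabled && echo "tftp is enabled" prints the message only when an uncommented tftp entry exists.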

The debugging stub is not a part of the GNU GRUB distribution, so you have to patch the sources with the grub2-gdb-stub.diff patch. Please note that as this patch extends the uncompressable part of the kernel, you have to use grub-mkimage built from the patched source tree to generate core images. Be sure to include the gdb module when building the core image.

grub2$ patch -p1 <grub2-gdb-stub.diff
grub2$ ./configure && gmake
...
grub2$ grub-mkimage -d. -o /tftproot/core.img gdb

Compile GRUB Legacy with support for your testbed machine's Ethernet adapter and install it on the testbed machine. To load the GRUB 2 kernel from the development machine, use these commands (with addresses valid in your network):

(grub) ifconfig --address=192.168.1.2 --server=192.168.1.1
...
(grub) root (nd)
...
(grub) kernel /tftproot/core.img
...
(grub) boot
...

Now, after you have started GRUB 2, configure the serial port with the serial command. It is probably a good idea to use the highest available baud rate instead of the default 9600 bps. Once the serial port is configured, you can start listening for a GDB connection with the break command. The GDB stub is also passed control when any CPU exception occurs (division by zero, access beyond the available memory range, etc.).

(grub) serial --speed=115200
(grub) break
Now connect the remote debugger, please.

Random things to note

Some operations can be really slow. For example, load_all_modules with 9 modules loaded takes around a minute and a half. GDB's set remotecache on can probably improve performance in some cases. If you think that your session has hung, enable set remotedebug on and watch what's going on.

It is not a good idea to set breakpoints in either the serial port communication code or the GDB stub code itself. You know why.

The GDB part

Connecting the debugger and loading debugging symbols

Now you need to place the files .gdbinit and gmodule.pl in your working directory, the GRUB 2 source directory. When you launch GDB now, the commands in .gdbinit will cause it to load symbols from kernel.exec and to add the corresponding symbols each time a module is loaded. I strongly recommend that you at least skim through it, to know and understand what you are doing.

Connect to either the running emulator or the testbed machine using GDB's target command. The emulator most likely listens on local TCP port 1234, so for it the right command is

(gdb) target remote :1234
...

If you use a serial connection to connect to GRUB, use the name of your serial device as the parameter to target remote. If you're using Linux, it's probably something like /dev/ttyS0. If your operating system provides a SunOS-compatible serial device, use that one; the other one might block, waiting for the modem control lines to signal a connection, which will never happen with some simpler null-modem cables. For example, you'd use /dev/dty00 instead of /dev/tty00 on NetBSD. You may also need to set the serial port baud rate if you're not satisfied with the 9600 bps default.

(gdb) set remotebaud 115200
(gdb) target remote /dev/dty00
...
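
If you reconnect often, the connection commands can be generated once and fed to GDB from a file. This is a sketch under my own assumptions: gdb_connect_cmds is a hypothetical helper, and the baud rate setting is spelled set remotebaud as in older GDB releases (newer ones call it set serial baud), so adapt that line to your GDB version.

```shell
#!/bin/sh
# Hypothetical helper: print the GDB commands that connect to a target.
# $1 = target: ":1234" for an emulator, or a serial device path
# $2 = optional baud rate, only meaningful for serial targets
gdb_connect_cmds() {
    target=$1
    baud=${2:-}
    if [ -n "$baud" ]; then
        # Assumption: an older GDB; newer releases use "set serial baud".
        echo "set remotebaud $baud"
    fi
    echo "target remote $target"
}

gdb_connect_cmds :1234               # emulator on the local machine
gdb_connect_cmds /dev/dty00 115200   # serial line at 115200 bps
```

Redirect the output to a file and pass it to GDB with the -x option. Commands from .gdbinit in the current directory are still read as usual.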

Now the debugger is in control. If you connected to a running GRUB 2 with loaded modules, you will want to load the corresponding symbol files. Use the load_all_modules macro.

(gdb) load_all_modules 
add symbol table from file "normal.elf" at
        .text_addr = 0x971c0
        .rodata_addr = 0x96190
        .data_addr = 0x96150
        .bss_addr = 0x95a60
add symbol table from file "hello.elf" at
        .text_addr = 0x943c0
        .rodata_addr = 0x94370
(gdb)

Please do not use this macro before the grub_dl_head is initialized. This basically means "before GRUB 2 is loaded" or "before .bss is zeroed".
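
The section addresses printed by load_all_modules correspond to the arguments of GDB's add-symbol-file command, which takes the .text address followed by -s section address pairs. The sketch below only illustrates that mapping; it is my guess at the kind of command that gets generated, not a copy of gmodule.pl, and add_symbol_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: build a GDB add-symbol-file command line.
# $1 = symbol file, $2 = .text address,
# remaining arguments = "section address" pairs
add_symbol_cmd() {
    file=$1
    text=$2
    shift 2
    cmd="add-symbol-file $file $text"
    # Append one "-s section address" group per remaining pair.
    while [ "$#" -ge 2 ]; do
        cmd="$cmd -s $1 $2"
        shift 2
    done
    echo "$cmd"
}

# Addresses taken from the hello.elf example output above.
add_symbol_cmd hello.elf 0x943c0 .rodata 0x94370
```

This can be handy when you need to load the symbols of a single module by hand, for example after reading its section addresses from the GRUB 2 module list yourself.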

Now you can do whatever you want, e.g. insert breakpoints, step through the code, inspect the contents of variables, and do stack traces. Debugging information for all modules loaded after the point where you connect GDB to the emulator will be loaded automatically, as you can verify with the info breakpoints command.

(gdb) info breakpoints 
Num Type           Disp Enb Address    What
1   breakpoint     keep y   0x00009d19 in grub_dl_add at kern/dl.c:73
        silent
        load_module mod
        cont
(gdb)

Don't get confused by the strange behavior of GDB when a BIOS call is executing in real mode (for example, waiting for keyboard input or doing disk I/O). And don't try to do backtraces: ESP points to a different stack.

With GDB's continue command you resume execution and return control to the emulator. If you are using QEMU and you want to regain control (without hitting a breakpoint), switch to QEMU's monitor (Ctrl+Alt+2) and issue the stop command. Interrupting GDB from its terminal (with Ctrl+C) also works with both emulators. If you need to interrupt GRUB 2 running on real hardware, you have to use the break command.

Debugging with DDD

Of course you can use any of the numerous GDB frontends. I myself use GNU DDD, and I have to make you aware of what not to do there. On some occasions DDD runs the where command, which does a backtrace. One of those occasions is our load_all_modules macro. You should really avoid this in real mode, as I noted before, or your GDB will enter an endless loop. The easiest way is to never use load_all_modules, but to connect to QEMU before any modules are loaded and let their symbols be loaded automatically. See QEMU's -S flag.

When you have to connect to a running QEMU that is making a BIOS call, do not run load_all_modules directly, but leave real mode first. For example, set a breakpoint at some point that executes in protected mode, continue till that point and then execute the macro. I haven't ever done this, but it should work. If your DDD seems to freeze, check ~/.ddd/log to ensure that it is not doing an endless backtrace or whatever.

DDD screenshot

Future revisions & TODO

This is revision $Revision: 3.4 $. Any suggestions for improvements, fixes, etc. are welcome. I'll try to extend this document further to cover some other GDB frontends, and possible problems with them.


Last Change: $Date: 2008/12/30 10:04:40 $