# Tutorial 08 - Hardware Debugging using JTAG ## tl;dr In the exact order as listed: 1. `make jtagboot` and keep terminal open. 2. Connect USB serial device. 3. Connect `JTAG` debugger USB device. 4. In new terminal, `make openocd` and keep terminal open. 5. In new terminal, `make gdb` or make `make gdb-opt0`. ![Demo](../doc/09_demo.gif) ## Table of Contents - [Introduction](#introduction) - [Outline](#outline) - [Software Setup](#software-setup) - [Hardware Setup](#hardware-setup) * [Wiring](#wiring) - [Getting ready to connect](#getting-ready-to-connect) - [OpenOCD](#openocd) - [GDB](#gdb) * [Remarks](#remarks) + [Optimization](#optimization) + [GDB control](#gdb-control) - [Notes on USB connection constraints](#notes-on-usb-connection-constraints) - [Additional resources](#additional-resources) - [Acknowledgments](#acknowledgments) - [Diff to previous](#diff-to-previous) ## Introduction In the upcoming tutorials, we are going to touch sensitive areas of the RPi's SoC that can make our debugging life very hard. For example, changing the processor's `Privilege Level` or introducing `Virtual Memory`. A hardware based debugger can sometimes be the last resort when searching for a tricky bug. Especially for debugging intricate, architecture-specific HW issues, it will be handy, because in this area `QEMU` sometimes can not help, since it abstracts certain features of the HW and doesn't simulate down to the very last bit. So lets introduce `JTAG` debugging. Once set up, it will allow us to single-step through our kernel on the real HW. How cool is that?! ## Outline From kernel perspective, this tutorial is the same as the previous one. We are just wrapping infrastructure for JTAG debugging around it. ## Software Setup We need to add another line to the `config.txt` file from the SD Card: ```toml arm_64bit=1 init_uart_clock=48000000 enable_jtag_gpio=1 ``` ## Hardware Setup Unlike microcontroller boards like the `STM32F3DISCOVERY`, which is used in our WG's [Embedded Rust Book], the Raspberry Pi does not have an embedded debugger on its board. Hence, you need to buy one. For this tutorial, we will use the [ARM-USB-TINY-H] from OLIMEX. It has a standard [ARM JTAG 20 connector]. Unfortunately, the RPi does not, so we have to connect it via jumper wires. [Embedded Rust Book]: https://rust-embedded.github.io/book/start/hardware.html [ARM-USB-TINY-H]: https://www.olimex.com/Products/ARM/JTAG/ARM-USB-TINY-H [ARM JTAG 20 connector]: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0499dj/BEHEIHCE.html ### Wiring
GPIO # Name JTAG # Note Diagram
VTREF 1 to 3.3V
GND 4 to GND
22 TRST 3
26 TDI 5
27 TMS 7
25 TCK 9
23 RTCK 11
24 TDO 13

## Getting ready to connect Upon booting, thanks to the changes we made to `config.txt`, the RPi's firmware will configure the respective GPIO pins for `JTAG` functionality. What is left to do now is to pause the execution of the RPi and then connect over `JTAG`. Therefore, we add a new `Makefile` target, `make jtagboot`, which uses the `chainboot` approach to load a tiny helper binary onto the RPi that just parks the executing core into a waiting state. The helper binary is maintained separately in this repository's [X1_JTAG_boot] folder, and is a modified version of the kernel we used in our tutorials so far. [X1_JTAG_boot]: ../X1_JTAG_boot ```console $ make jtagboot Minipush 1.0 [MP] ⏳ Waiting for /dev/ttyUSB0 [MP] ✅ Serial connected [MP] 🔌 Please power the target now __ __ _ _ _ _ | \/ (_)_ _ (_) | ___ __ _ __| | | |\/| | | ' \| | |__/ _ \/ _` / _` | |_| |_|_|_||_|_|____\___/\__,_\__,_| Raspberry Pi 3 [ML] Requesting binary [MP] ⏩ Pushing 7 KiB ==========================================🦀 100% 0 KiB/s Time: 00:00:00 [ML] Loaded! Executing the payload now [ 0.394532] Parking CPU core. Please connect over JTAG now. ``` It is important to keep the USB serial connected and the terminal with the `jtagboot` open and running. When we load the actual kernel later, `UART` output will appear here. ## OpenOCD Next, we need to launch the [Open On-Chip Debugger], aka `OpenOCD` to actually connect the `JTAG`. [Open On-Chip Debugger]: http://openocd.org As always, our tutorials try to be as painless as possible regarding dev-tools, which is why we have packaged everything into the [dedicated Docker container] that is already used for chainbooting and `QEMU`. [dedicated Docker container]: ../docker/rustembedded-osdev-utils Connect the Olimex USB JTAG debugger, open a new terminal and in the same folder, type `make openocd` (in that order!). You will see some initial output: ```console $ make openocd [...] Open On-Chip Debugger 0.10.0 [...] Info : Listening on port 6666 for tcl connections Info : Listening on port 4444 for telnet connections Info : clock speed 1000 kHz Info : JTAG tap: rpi3.tap tap/device found: 0x4ba00477 (mfg: 0x23b (ARM Ltd.), part: 0xba00, ver: 0x4) Info : rpi3.core0: hardware has 6 breakpoints, 4 watchpoints Info : rpi3.core1: hardware has 6 breakpoints, 4 watchpoints Info : rpi3.core2: hardware has 6 breakpoints, 4 watchpoints Info : rpi3.core3: hardware has 6 breakpoints, 4 watchpoints Info : Listening on port 3333 for gdb connections Info : Listening on port 3334 for gdb connections Info : Listening on port 3335 for gdb connections Info : Listening on port 3336 for gdb connections ``` `OpenOCD` has detected the four cores of the RPi, and opened four network ports to which `gdb` can now connect to debug the respective core. ## GDB Finally, we need an `AArch64`-capable version of `gdb`. You guessed right, it's already packaged in the osdev container. It can be launched via `make gdb`. This Makefile target actually does a little more. It builds a special version of our kernel with debug information included. This enables `gdb` to show the `Rust` source code line we are currently debugging. It also launches `gdb` such that it already loads this debug build (`kernel_for_jtag`). We can now use the `gdb` commandline to 1. Set breakpoints in our kernel 2. Load the kernel via JTAG into memory (remember that currently, the RPi is still executing the minimal JTAG boot binary). 3. Manipulate the program counter of the RPi to start execution at our kernel's entry point. 4. Single-step through its execution. ```console $ make gdb [...] >>> target remote :3333 # Connect to OpenOCD, core0 >>> load # Load the kernel into the RPi's DRAM over JTAG. Loading section .text, size 0x2454 lma 0x80000 Loading section .rodata, size 0xa1d lma 0x82460 Loading section .got, size 0x10 lma 0x82e80 Loading section .data, size 0x20 lma 0x82e90 Start address 0x0000000000080000, load size 11937 Transfer rate: 63 KB/sec, 2984 bytes/write. >>> set $pc = 0x80000 # Set RPI's program counter to the start of the # kernel binary. >>> break main.rs:158 Breakpoint 1 at 0x8025c: file src/main.rs, line 158. >>> cont >>> step # Single-step through the kernel >>> step >>> ... ``` ### Remarks #### Optimization When debugging an OS binary, you have to make a trade-off between the granularity at which you can step through your Rust source-code and the optimization level of the generated binary. The `make` and `make gdb` targets produce a `--release` binary, which includes an optimization level of three (`-opt-level=3`). However, in this case, the compiler will inline very aggressively and pack together reads and writes where possible. As a result, it will not always be possible to hit breakpoints exactly where you want to regarding the line of source code file. For this reason, the Makefile also provides the `make gdb-opt0` target, which uses `-opt-level=0`. Hence, it will allow you to have finer debugging granularity. However, please keep in mind that when debugging code that closely deals with HW, a compiler optimization that squashes reads or writes to volatile registers can make all the difference in execution. FYI, the demo gif above has been recorded with `gdb-opt0`. #### GDB control At some point, you may reach delay loops or code that waits on user input from the serial. Here, single stepping might not be feasible or work anymore. You can jump over these roadblocks by setting other breakpoints beyond these areas, and reach them using the `cont` command. Pressing `ctrl+c` in `gdb` will stop execution of the RPi again in case you continued it without further breakpoints. ## Notes on USB connection constraints If you followed the tutorial from top to bottom, everything should be fine regarding USB connections. Still, please note that in its current form, our `Makefile` makes implicit assumptions about the naming of the connected USB devices. It expects `/dev/ttyUSB0` to be the `UART` device. Hence, please ensure the following order of connecting the devices to your box: 1. Connect the USB serial. 2. Afterwards, the Olimex debugger. This way, the host OS enumerates the devices accordingly. This has to be done only once. It is fine to disconnect and connect the serial multiple times, e.g. for kicking off different `make jtagboot` runs, while keeping the debugger connected. ## Additional resources - https://metebalci.com/blog/bare-metal-raspberry-pi-3b-jtag - https://www.suse.com/c/debugging-raspberry-pi-3-with-jtag ## Acknowledgments Thanks to [@naotaco](https://github.com/naotaco) for laying the groundwork for this tutorial. ## Diff to previous ```diff diff -uNr 07_timestamps/Cargo.toml 08_hw_debug_JTAG/Cargo.toml --- 07_timestamps/Cargo.toml +++ 08_hw_debug_JTAG/Cargo.toml @@ -1,6 +1,6 @@ [package] name = "mingo" -version = "0.7.0" +version = "0.8.0" authors = ["Andre Richter "] edition = "2021" diff -uNr 07_timestamps/Makefile 08_hw_debug_JTAG/Makefile --- 07_timestamps/Makefile +++ 08_hw_debug_JTAG/Makefile @@ -32,6 +32,8 @@ OBJDUMP_BINARY = aarch64-none-elf-objdump NM_BINARY = aarch64-none-elf-nm READELF_BINARY = aarch64-none-elf-readelf + OPENOCD_ARG = -f /openocd/tcl/interface/ftdi/olimex-arm-usb-tiny-h.cfg -f /openocd/rpi3.cfg + JTAG_BOOT_IMAGE = ../X1_JTAG_boot/jtag_boot_rpi3.img LD_SCRIPT_PATH = $(shell pwd)/src/bsp/raspberrypi RUSTC_MISC_ARGS = -C target-cpu=cortex-a53 else ifeq ($(BSP),rpi4) @@ -43,6 +45,8 @@ OBJDUMP_BINARY = aarch64-none-elf-objdump NM_BINARY = aarch64-none-elf-nm READELF_BINARY = aarch64-none-elf-readelf + OPENOCD_ARG = -f /openocd/tcl/interface/ftdi/olimex-arm-usb-tiny-h.cfg -f /openocd/rpi4.cfg + JTAG_BOOT_IMAGE = ../X1_JTAG_boot/jtag_boot_rpi4.img LD_SCRIPT_PATH = $(shell pwd)/src/bsp/raspberrypi RUSTC_MISC_ARGS = -C target-cpu=cortex-a72 endif @@ -99,18 +103,25 @@ DOCKER_CMD = docker run -t --rm -v $(shell pwd):/work/tutorial -w /work/tutorial DOCKER_CMD_INTERACT = $(DOCKER_CMD) -i DOCKER_ARG_DIR_COMMON = -v $(shell pwd)/../common:/work/common +DOCKER_ARG_DIR_JTAG = -v $(shell pwd)/../X1_JTAG_boot:/work/X1_JTAG_boot DOCKER_ARG_DEV = --privileged -v /dev:/dev +DOCKER_ARG_NET = --network host # DOCKER_IMAGE defined in include file (see top of this file). DOCKER_QEMU = $(DOCKER_CMD_INTERACT) $(DOCKER_IMAGE) DOCKER_TOOLS = $(DOCKER_CMD) $(DOCKER_IMAGE) DOCKER_TEST = $(DOCKER_CMD) $(DOCKER_ARG_DIR_COMMON) $(DOCKER_IMAGE) +DOCKER_GDB = $(DOCKER_CMD_INTERACT) $(DOCKER_ARG_NET) $(DOCKER_IMAGE) # Dockerize commands, which require USB device passthrough, only on Linux. ifeq ($(shell uname -s),Linux) DOCKER_CMD_DEV = $(DOCKER_CMD_INTERACT) $(DOCKER_ARG_DEV) DOCKER_CHAINBOOT = $(DOCKER_CMD_DEV) $(DOCKER_ARG_DIR_COMMON) $(DOCKER_IMAGE) + DOCKER_JTAGBOOT = $(DOCKER_CMD_DEV) $(DOCKER_ARG_DIR_COMMON) $(DOCKER_ARG_DIR_JTAG) $(DOCKER_IMAGE) + DOCKER_OPENOCD = $(DOCKER_CMD_DEV) $(DOCKER_ARG_NET) $(DOCKER_IMAGE) +else + DOCKER_OPENOCD = echo "Not yet supported on non-Linux systems."; \# endif @@ -215,6 +226,35 @@ +##-------------------------------------------------------------------------------------------------- +## Debugging targets +##-------------------------------------------------------------------------------------------------- +.PHONY: jtagboot openocd gdb gdb-opt0 + +##------------------------------------------------------------------------------ +## Push the JTAG boot image to the real HW target +##------------------------------------------------------------------------------ +jtagboot: + @$(DOCKER_JTAGBOOT) $(EXEC_MINIPUSH) $(DEV_SERIAL) $(JTAG_BOOT_IMAGE) + +##------------------------------------------------------------------------------ +## Start OpenOCD session +##------------------------------------------------------------------------------ +openocd: + $(call color_header, "Launching OpenOCD") + @$(DOCKER_OPENOCD) openocd $(OPENOCD_ARG) + +##------------------------------------------------------------------------------ +## Start GDB session +##------------------------------------------------------------------------------ +gdb: RUSTC_MISC_ARGS += -C debuginfo=2 +gdb-opt0: RUSTC_MISC_ARGS += -C debuginfo=2 -C opt-level=0 +gdb gdb-opt0: $(KERNEL_ELF) + $(call color_header, "Launching GDB") + @$(DOCKER_GDB) gdb-multiarch -q $(KERNEL_ELF) + + + ##-------------------------------------------------------------------------------------------------- ## Testing targets ##-------------------------------------------------------------------------------------------------- diff -uNr 07_timestamps/src/bsp/raspberrypi/driver.rs 08_hw_debug_JTAG/src/bsp/raspberrypi/driver.rs --- 07_timestamps/src/bsp/raspberrypi/driver.rs +++ 08_hw_debug_JTAG/src/bsp/raspberrypi/driver.rs @@ -57,17 +57,6 @@ /// # Safety /// /// See child function calls. -/// -/// # Note -/// -/// Using atomics here relieves us from needing to use `unsafe` for the static variable. -/// -/// On `AArch64`, which is the only implemented architecture at the time of writing this, -/// [`AtomicBool::load`] and [`AtomicBool::store`] are lowered to ordinary load and store -/// instructions. They are therefore safe to use even with MMU + caching deactivated. -/// -/// [`AtomicBool::load`]: core::sync::atomic::AtomicBool::load -/// [`AtomicBool::store`]: core::sync::atomic::AtomicBool::store pub unsafe fn init() -> Result<(), &'static str> { static INIT_DONE: AtomicBool = AtomicBool::new(false); if INIT_DONE.load(Ordering::Relaxed) { ```