yangsuli

Reputation: 1362

thread-aware gdb for the Linux kernel

I am using GDB attached to the serial port of a virtual machine to debug the Linux kernel.

Are there any patches or plugins that make GDB understand some of the Linux kernel's data structures and so make it "thread aware"?

By that I mean that from within GDB I can see how many kernel threads there are, their status, and the stack information for each thread.

Upvotes: 2

Views: 1693

Answers (4)

Trent Lloyd

Reputation: 1892

A few years later, in 2024, this can be done with kdump-gdbserver, although I used it with a core dump rather than a running system: https://github.com/ptesarik/kdump-gdbserver

crash-python can apparently also do it, but it may need a patched GDB.

Upvotes: 0

libvmi

https://github.com/libvmi/libvmi

This project describes itself as "LibVMI: Simplified Virtual Machine Introspection", which sounds really close to what is needed.

In particular, https://github.com/Wenzel/pyvmidbg uses libvmi and features a demo video of debugging a Windows userland application from inside it, without memory conflicts.

As of May 2019, however, there are two limitations, both of which could be overcome with some work: https://github.com/Wenzel/pyvmidbg/issues/24

  • Linux memory parsing is not yet complete
  • requires Xen

The developer of that project also answered further at: https://stackoverflow.com/a/56369454/895245

Implementing it with those libraries would be in my opinion the best way to achieve this goal today.

Linaro lkd-python

First, this Linaro page claims to have a working setup that allows the usual thread operations such as thread, bt, etc., but it relies on a GDB fork: https://wiki.linaro.org/LandingTeams/ST/GDB I will test it out later. A 2016 talk, https://youtu.be/pqn5hIrz3A8, says that the implementation was in C rather than in Python scripts, which is unfortunate, since Python would be better and would avoid forking GDB. The sketch for lkd-python can be found at: https://git.linaro.org/people/lee.jones/kieran.bingham/binutils-gdb.git/log/?h=lkd-python

Linux kernel in-tree GDB scripts + my brain

I then tried to see what I could do with the kernel in-tree Python scripts at v4.17 + some manual intervention as a prototype, but didn't quite get there yet.

I have tested using this highly automated QEMU + Buildroot setup.

First follow the procedure I described at: How to debug the Linux kernel with GDB and QEMU? to get GDB working.

Then, as described at: How to debug Linux kernel modules with QEMU? run GDB with:

gdb -ex 'add-auto-load-safe-path /full/path/to/linux/kernel'

This loads the in-tree GDB Python scripts from scripts/gdb.

One of those scripts provides:

lx-ps

which lists all threads with format:

0xffff88000ed08000 1 init
0xffff88000ed08ac0 2 kthreadd

The first field is the address of the task_struct struct, so we can see the entire struct with:

p *(struct task_struct *)0xffff88000ed08000

which should in theory allow us to get any information we want about the process.
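As a rough sketch of how the lx-ps listing could be post-processed, the following Python parses lines in the address / PID / name format and builds a GDB print expression for each task. The names parse_lx_ps and task_struct_expr are made up for illustration, and the sample text is just the two lines from the listing; inside GDB the text could instead come from gdb.execute("lx-ps", to_string=True).

```python
from typing import List, NamedTuple


class Task(NamedTuple):
    address: int  # address of the task_struct
    pid: int
    comm: str     # task name


def parse_lx_ps(text: str) -> List[Task]:
    """Parse lines of the form '<hex address> <pid> <comm>'."""
    tasks = []
    for line in text.splitlines():
        fields = line.split(None, 2)
        # Skip blank lines and anything that does not start with a hex address.
        if len(fields) != 3 or not fields[0].startswith("0x"):
            continue
        tasks.append(Task(int(fields[0], 16), int(fields[1]), fields[2].strip()))
    return tasks


def task_struct_expr(task: Task) -> str:
    """Build a GDB expression that dereferences the task_struct at that address."""
    return "*(struct task_struct *)0x{:x}".format(task.address)


sample = """0xffff88000ed08000 1 init
0xffff88000ed08ac0 2 kthreadd"""

for t in parse_lx_ps(sample):
    print(t.pid, t.comm, task_struct_expr(t))
```

Each resulting string can be passed to p on the GDB command line, or to gdb.parse_and_eval() from a script.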

Now I wanted to find the PC. For ARM, I've seen: Find program counter of process in kernel and I tried:

task_pt_regs((struct thread_info *)((struct task_struct)*0xffffffc00e8f8000))->uregs[ARM_pc]

but task_pt_regs is a #define, and GDB cannot see macros unless the kernel is compiled with -ggdb3 (see: How do I print a #defined constant in GDB?), which apparently is not set.
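One possible workaround is to expand the macro by hand. The sketch below assumes the arm64 definition from around that kernel version, roughly ((struct pt_regs *)(THREAD_SIZE + task_stack_page(p)) - 1), meaning the saved user registers sit in the last pt_regs-sized slot of the task's kernel stack. Other architectures, including 32-bit ARM, define the macro differently. The stack address, THREAD_SIZE and sizeof(struct pt_regs) values used here are placeholders, not measured values; they depend on the kernel version and configuration and would need to be checked against the actual build (for example with p sizeof(struct pt_regs) in GDB).

```python
def task_pt_regs_addr(stack_base: int, thread_size: int, pt_regs_size: int) -> int:
    """Hand expansion of task_pt_regs(): the last pt_regs-sized slot of the stack.

    stack_base   -- value of task->stack (what task_stack_page() returns)
    thread_size  -- THREAD_SIZE of the kernel build (assumption, check your config)
    pt_regs_size -- sizeof(struct pt_regs) of the kernel build (assumption)
    """
    return stack_base + thread_size - pt_regs_size


# Placeholder values for illustration only.
stack_base = 0xffffffc00e900000   # hypothetical task->stack value
thread_size = 16 * 1024           # assumed THREAD_SIZE
pt_regs_size = 0x130              # assumed sizeof(struct pt_regs)

addr = task_pt_regs_addr(stack_base, thread_size, pt_regs_size)
print("p *(struct pt_regs *)0x{:x}".format(addr))
```

The printed expression can then be pasted into GDB to inspect the saved registers, provided the assumed layout matches the kernel being debugged.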

The dream: a GDB Thread Aware Python extension API

https://sourceware.org/pipermail/gdb/2017-March/046559.html

Currently the only way to do it with python I think would be to implement a custom version of every command you want to support:

  • xxx-threads: list
  • xxx-thread: change current thread
  • xxx-p: print current thread

But what we really want is a GDB API where a Python script provides only the minimal necessary parameters for all GDB commands to just work transparently (info threads, thread N and so on).

The API would basically only need to:

  • list threads
  • change to thread
  • provide registers of a given thread

I think everything else could then just work based on those. The Linux kernel would then be able to maintain its own in-tree Python provider, and other operating systems could do the same.
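To make the idea concrete, here is a sketch of what such a provider interface might look like. Everything in it is hypothetical: ThreadProvider and FakeKernelProvider are invented names, not an existing GDB API, and the toy provider just serves register values from a dict where a real one would parse task_struct and the saved stack frames.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ThreadProvider(ABC):
    """Hypothetical interface; not part of GDB's Python API."""

    @abstractmethod
    def list_threads(self) -> List[int]: ...

    @abstractmethod
    def select_thread(self, tid: int) -> None: ...

    @abstractmethod
    def read_registers(self, tid: int) -> Dict[str, int]: ...


class FakeKernelProvider(ThreadProvider):
    """Toy provider backed by a dict, standing in for task_struct parsing."""

    def __init__(self, saved_regs: Dict[int, Dict[str, int]]):
        self._saved_regs = saved_regs
        self.current = None

    def list_threads(self) -> List[int]:
        return sorted(self._saved_regs)

    def select_thread(self, tid: int) -> None:
        if tid not in self._saved_regs:
            raise KeyError("no such thread: {}".format(tid))
        self.current = tid

    def read_registers(self, tid: int) -> Dict[str, int]:
        return dict(self._saved_regs[tid])


provider = FakeKernelProvider({
    1: {"pc": 0x1000, "sp": 0x2000},
    2: {"pc": 0x3000, "sp": 0x4000},
})
provider.select_thread(2)
print(provider.list_threads(), hex(provider.read_registers(provider.current)["pc"]))
```

With only these three operations supplied by the provider, the debugger could in principle implement thread listing, thread switching and backtraces itself.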

Upvotes: 3

Wenzel

Reputation: 317

pyvmidbg developer here.

I will add some clarifications: yes the goal of the project is indeed to have a cross-platform, guest-aware GDB stub.

Most of the implementation is already done for Windows, where we are aware of processes and their thread contexts. It's possible to intercept a specific process (cmd.exe in the demo) and single-step its execution (this is limited to 1 process with 1 thread for now), as well as to attach to a new process's entry point.

Regarding Linux, I looked at the internals and the resources that I could find, but I'm lacking the whole picture to figure out how I can:

  • intercept a task when it's being scheduled (core/sched.c:switch_to() ?)
  • read the task state (Windows's KTRAP_FRAME equivalent for Linux ?)

I asked a question on SO, but nobody answered :/ Linux context switch internals: how does a process goes back to userland after the switch?

If you can help with this, I can guide you through the implementation :)

Regarding hypervisor support, only Xen is fully supported in the LibVMI interface at the moment. I added a section in the README to describe where we are in terms of VMI APIs with other hypervisors.

Thanks !

Upvotes: 2

Kamath

Reputation: 4684

I don't think GDB understands kernel data structures; doing so would make it dependent on the kernel version. GDB uses ptrace to gather information on any running process.

That's all I know :(

Upvotes: 1
