Tracing and disassembling Linux Kernel instructions not matching kernel image file

Question

I'm trying to validate and understand the instructions executed in a simulation framework. The simulation steps are the following:

A binary is cross-compiled on an x86 host machine with gcc (with the -fPIC flag).
The binary is then moved to, and executed in, an x86 virtual machine named SimNow (used by AMD for testing their processors).
SimNow produces a list of executed instructions, which is passed to a framework together with information about each instruction: virtual address, physical address, size, opcode.
The framework produces a trace of the executed instructions, including mnemonics and operands, using an x86 disassembler (distorm). This is an example of the trace output:

The list of executed instructions includes the instructions contained in the binary and possibly kernel instructions.

I'm validating the user instructions in the trace against the objdump output for the binary. They match, confirming the correctness of the execution.

This is the objdump output for the instructions in the previous picture:

For kernel instructions, I had to apply some preliminary steps:

I installed the kernel headers in the virtual machine and extracted the Linux kernel image so that I could run objdump on it.
I added kernel symbols to the trace output by comparing the virtual address of each kernel instruction with the virtual addresses listed in /proc/kallsyms.
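A minimal sketch of the symbol-matching step, assuming the contents of /proc/kallsyms are available as text in the usual "address type name" layout. The function names `parse_kallsyms` and `resolve` and the addresses in the usage example are hypothetical, chosen only for illustration.

```python
import bisect

def parse_kallsyms(text):
    """Parse /proc/kallsyms-style text into a sorted list of (address, name)."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols

def resolve(symbols, vaddr):
    """Return (name, offset) for the nearest symbol at or below vaddr, else None."""
    addrs = [addr for addr, _ in symbols]
    i = bisect.bisect_right(addrs, vaddr) - 1
    if i < 0:
        return None
    addr, name = symbols[i]
    return name, vaddr - addr

# Hypothetical sample data, not taken from a real kernel.
sample = (
    "ffffffff81000000 T _text\n"
    "ffffffff81001000 T do_one\n"
    "ffffffff81002000 t do_two\n"
)
syms = parse_kallsyms(sample)
print(resolve(syms, 0xFFFFFFFF81001005))
```

Note that /proc/kallsyms may show all addresses as zero when read without sufficient privileges (depending on the kptr_restrict setting), so the file should be read as root for this to work.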

For the validation step, I'm using the same methodology as for the user instructions: comparing the objdump output for the Linux kernel image with the trace output. However, I noticed some differences, mostly at the start of kernel symbols. This is the output of the trace:

This is the corresponding section of the Linux kernel image:

As you can see from these pictures, each callq which corresponds to a kernel symbol (comparing the virtual address of the linux image with /proc/kallsyms) is substituted with a NOP DWORD (nopl instruction) in trace output.

What I'm trying to do is understand why there is a NOP DWORD instead of callq for the kernel symbols.

Does this happen because of relocation? If so, how can I reconstruct the relocation for such instructions?

NOTE: I ran objdump with -dr to check the relocations in the Linux image, but the output didn't change.

Is my validation methodology wrong for kernel instructions?

Peter Cordes · Accepted Answer

(Partial answer / guess that might point you in the right direction. Update: Jester suggests this looks like ftrace machinery: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/ftrace.c?h=v5.10)

I suspect those calls are getting NOPed after loading, perhaps when the relevant tracepoint or whatever is not enabled. NOPing out a call for some reason would explain why there's call-target or symbol relocation metadata associated with a NOP.

I think I've read about Linux using code modification for low-overhead tracepoints or something, and self-modifying code is something Linux definitely does in general.

Linux uses self-modifying code that's modified once at boot, or on a very rare config change, to reduce overhead for every execution vs. a branch, for a few different things. (e.g. booting an SMP kernel on a UP machine will NOP out the lock prefix in atomic RMWs that only need to be SMP-safe, as opposed to atomic with respect to hardware devices.) Inline asm macros define symbols and custom sections so the kernel has the necessary metadata. There's also something more recent about modifying rel32 call targets instead of using indirect branches, to avoid the need for any Spectre mitigation at those sites, but that's not what's happening here.

So in general you should expect to see a few mismatches when you try to verify execution against a kernel image file, and this might be one of them.
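As a rough sketch of how a validation script could flag this kind of expected mismatch instead of reporting it as an error: the check below assumes the raw opcode bytes are available for both the image and the trace, and that the patched site uses the 5-byte NOP encoding `0f 1f 44 00 00` (which disassembles as nopl / NOP DWORD). That encoding is an assumption; the kernel may pick a different 5-byte NOP depending on CPU and version. The name `looks_like_patched_call` is hypothetical.

```python
# 5-byte NOP: nopl 0x0(%rax,%rax,1). Assumed encoding; other 5-byte NOPs exist.
NOP5 = bytes.fromhex("0f1f440000")
CALL_REL32_OPCODE = 0xE8  # call rel32 is also exactly 5 bytes long

def looks_like_patched_call(image_bytes, trace_bytes):
    """True if the image holds a 5-byte call rel32 and the trace holds a 5-byte NOP."""
    return (
        len(image_bytes) == 5
        and image_bytes[0] == CALL_REL32_OPCODE
        and bytes(trace_bytes) == NOP5
    )

# Hypothetical byte sequences for illustration only.
image_insn = bytes.fromhex("e8a1b2c3d4")   # callq <some target>
trace_insn = bytes.fromhex("0f1f440000")   # NOP DWORD seen at run time
print(looks_like_patched_call(image_insn, trace_insn))
```

A mismatch that passes this check is consistent with a call site that was NOPed out after loading; any other mismatch still deserves a closer look.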

In this case, this looks like the very top of a function (before setting up a frame pointer), which sounds like a likely place to find some kind of special call, perhaps for tracing (to a special function that preserves all registers).

gcc-generated code would never AFAIK do a call before push %rbp / mov %rsp, %rbp. For one thing, that would violate the 16-byte stack alignment ABI requirement. (Although maybe the kernel uses -mpreferred-stack-boundary=3 instead of 4? After two pushes, there's another more normal call which would also have a misaligned RSP if this was a normal function.) Anyway, that's another sign that there's some custom inline asm hackery or something going on.
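The alignment argument can be checked with a little arithmetic. This sketch assumes the standard x86-64 System V rule that RSP is a multiple of 16 immediately before a call instruction; the starting stack address is a made-up value.

```python
def aligned_for_call(rsp):
    """x86-64 System V: RSP must be a multiple of 16 right before a call."""
    return rsp % 16 == 0

# Hypothetical RSP in the caller just before it calls our function (aligned).
rsp = 0xFFFFC90000004000
assert aligned_for_call(rsp)

rsp -= 8                              # the call pushes an 8-byte return address
at_entry = aligned_for_call(rsp)      # a call placed here sees a misaligned RSP

rsp -= 8                              # push %rbp
after_one_push = aligned_for_call(rsp)

rsp -= 8                              # a second push
after_two_pushes = aligned_for_call(rsp)

print(at_entry, after_one_push, after_two_pushes)
```

So a call at the very top of a function, or after an even number of 8-byte pushes, would run with RSP off by 8 under a 16-byte alignment requirement, which is why such a call looks out of place in ordinary compiler output.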
