Reputation: 47563

How to make good use of stack trace (from kernel or core dump)?

If you are lucky when your kernel module crashes, you get an oops with a log containing a lot of information, such as the register values. One such piece of information is the stack trace. (The same is true for core dumps, but I originally asked this about kernel modules.) Take this example:

[<f97ade02>] ? skink_free_devices+0x32/0xb0 [skin_kernel]
[<f97aba45>] ? cleanup_module+0x1e5/0x550 [skin_kernel]
[<c017d0e7>] ? __stop_machine+0x57/0x70
[<c016dec0>] ? __try_stop_module+0x0/0x30
[<c016f069>] ? sys_delete_module+0x149/0x210
[<c0102f24>] ? sysenter_do_call+0x12/0x16

My guess is that the +<number1>/<number2> has something to do with the offset from the start of the function in which the error occurred. That is, by inspecting this number, perhaps together with the assembly output, I should be able to find the line (better yet, the instruction) at which the error occurred. Is that correct?

My question is: what exactly are these two numbers, and how do you make use of them?

Upvotes: 18

Answers (3)

philn

Reputation: 784

Regurgitating this answer: you need to use faddr2line.

In my case I had the following truncated call trace:

[  246.790938][   T35] Call trace:
[  246.794075][   T35]  __switch_to+0x10c/0x180
[  246.798348][   T35]  __schedule+0x278/0x6e0
[  246.802531][   T35]  schedule+0x44/0xd0
[  246.806368][   T35]  rpm_resume+0xf4/0x628
[  246.810463][   T35]  __pm_runtime_resume+0x94/0xc0
[  246.815257][   T35]  macb_open+0x30/0x2b8
[  246.819265][   T35]  __dev_open+0x10c/0x188

and ran the following from the root of the mainline Linux kernel source tree:

./scripts/faddr2line vmlinux macb_open+0x30/0x2b8

giving the output

macb_open+0x30/0x2b8:
pm_runtime_get_sync at include/linux/pm_runtime.h:386
(inlined by) macb_open at drivers/net/ethernet/cadence/macb_main.c:2726
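
When a trace has many frames, it can help to pull out every symbol+offset/size token first and pass them to faddr2line in one go. The following is only a minimal Python sketch: the regular expression and the name extract_frames are my own illustrative choices, not part of the kernel tooling, and it assumes frames are formatted as in the trace above.

```python
import re

# Matches tokens such as "macb_open+0x30/0x2b8" (symbol+offset/size).
FRAME_RE = re.compile(r"([A-Za-z_][\w.]*)\+(0x[0-9a-fA-F]+)/(0x[0-9a-fA-F]+)")

def extract_frames(log_text):
    """Return every symbol+offset/size token found in the log, in order."""
    return [m.group(0) for m in FRAME_RE.finditer(log_text)]

trace = """\
[  246.810463][   T35]  __pm_runtime_resume+0x94/0xc0
[  246.815257][   T35]  macb_open+0x30/0x2b8
[  246.819265][   T35]  __dev_open+0x10c/0x188
"""

frames = extract_frames(trace)
# Build (but do not run) a faddr2line invocation covering all frames.
command = ["./scripts/faddr2line", "vmlinux"] + frames
print(" ".join(command))
```

The printed command can then be run from the kernel source tree against a vmlinux built with debug information.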

Upvotes: 0

mgalgs

Reputation: 16789

For Emacs users, here is a major mode for easily jumping around within the stack trace (it uses addr2line internally).

Disclaimer: I wrote it :)

Upvotes: 1

Pavan Manjunath

Reputation: 28545

skink_free_devices+0x32/0xb0

This means the offending instruction is 0x32 bytes from the start of the function skink_free_devices() which is 0xB0 bytes long in total.
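
To make the arithmetic concrete, here is a small Python sketch (parse_frame is just a hypothetical helper name) that splits such a token and recovers the address where the function starts, using the bracketed address from the first frame in the question:

```python
def parse_frame(token):
    """Split 'symbol+0xOFF/0xSIZE' into (symbol, offset, size)."""
    symbol, rest = token.split("+", 1)
    offset_hex, size_hex = rest.split("/", 1)
    return symbol, int(offset_hex, 16), int(size_hex, 16)

symbol, offset, size = parse_frame("skink_free_devices+0x32/0xb0")
address = 0xf97ade02  # value printed in brackets on that line of the oops
function_start = address - offset  # the function begins this many bytes earlier

print(symbol, hex(function_start), f"{offset} of {size} bytes")
# prints: skink_free_devices 0xf97addd0 50 of 176 bytes
```

So the frame points 50 bytes into a 176-byte function that begins at 0xf97addd0.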

If you compile your kernel with debug information (-g) enabled, you can map an address to the source line inside the function using the tool addr2line or our good old gdb.

Something like this:

$ addr2line -e ./vmlinux 0xc01cf0d1
/mnt/linux-2.5.26/include/asm/bitops.h:244
or
$ gdb ./vmlinux
...
(gdb) l *0xc01cf0d1
0xc01cf0d1 is in read_chan (include/asm/bitops.h:244).
(...)
244     return ((1UL << (nr & 31)) & (((const volatile unsigned int *) addr)[nr >> 5])) != 0;
(...)

So just give the address you want to inspect to addr2line or gdb, and they will tell you the source file and line number of the offending code. See this article for full details.
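
If you need to do this for several addresses from a script, the addr2line call can be assembled programmatically. This is a rough Python sketch, not a definitive recipe: addr2line_command is a hypothetical helper, and the ./vmlinux path and the address are simply the ones from the example above. The -e, -f and -i options are standard binutils addr2line flags.

```python
import shlex

def addr2line_command(image, address, show_function=True, show_inlines=True):
    """Build an addr2line argument list for one address in an ELF image."""
    cmd = ["addr2line", "-e", image]
    if show_function:
        cmd.append("-f")   # also print the enclosing function name
    if show_inlines:
        cmd.append("-i")   # also print the chain of inlined callers
    cmd.append(hex(address))
    return cmd

cmd = addr2line_command("./vmlinux", 0xc01cf0d1)
# Only print the command here; nothing is executed.
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, capture_output=True, text=True) on a machine where the vmlinux image is available.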

EDIT: vmlinux is the uncompressed version of the kernel used for debugging. It is generally found at /lib/modules/$(uname -r)/build/vmlinux, provided you have built your kernel from source. The vmlinuz that you find in /boot is the compressed kernel and may not be that useful for debugging.

Upvotes: 19
