Reputation: 19100
I'm trying to debug a multithreaded program, which somehow ends up with RIP=0x0 and lots of zeros on the stack. Is there any way to find out where the program was just one instruction before? When I try single-stepping, the result appears different (likely some race condition), but if I just start the program and let it go, it consistently lands here.
So is there any way to trap on a jump/call to zero address before it is actually taken, without doing single-stepping or emulation? Is there maybe some register holding address of previous instruction?
Upvotes: 1
Views: 2428
Reputation: 213375
Is there maybe some register holding address of previous instruction?
There is no such register, but there is Branch Trace Store, and GDB supports it with the record btrace command.
Note, from the Wikipedia article on branch tracing:
Branch tracing on Intel processors can cause 40x application run-time slow down.
Here is how you could use record btrace to debug your problem:
cat t.c
#include <string.h>
int bar()
{
  char buf[10];
  memset(buf, 0, sizeof(buf));
  memset(buf, 'A', 100); // overflow
}

int foo()
{
  return bar();
}

int main()
{
  return foo();
}
gcc -g t.c -fno-stack-protector
gdb -q ./a.out
(gdb) run
Starting program: /tmp/a.out
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400562 in bar () at t.c:7
7 }
(gdb) bt 5
#0 0x0000000000400562 in bar () at t.c:7
#1 0x4141414141414141 in ?? ()
#2 0x4141414141414141 in ?? ()
#3 0x4141414141414141 in ?? ()
#4 0x4141414141414141 in ?? ()
(More stack frames follow...)
Hard to debug: we have no idea what happened here (this, I think, models your current problem).
(gdb) start
Temporary breakpoint 1 at 0x400577: file t.c, line 16.
Starting program: /tmp/a.out
Temporary breakpoint 1, main () at t.c:16
16 return foo();
(gdb) record btrace
(gdb) c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400562 in bar () at t.c:7
7 }
(gdb) record instruction-history
719 0x00007ffff7a9e531 <memset+113>: movdqu %xmm8,0x20(%rdi)
720 0x00007ffff7a9e537 <memset+119>: movdqu %xmm8,-0x30(%rdi,%rdx,1)
721 0x00007ffff7a9e53e <memset+126>: movdqu %xmm8,0x30(%rdi)
722 0x00007ffff7a9e544 <memset+132>: movdqu %xmm8,-0x40(%rdi,%rdx,1)
723 0x00007ffff7a9e54b <memset+139>: add %rdi,%rdx
724 0x00007ffff7a9e54e <memset+142>: and $0xffffffffffffffc0,%rdx
725 0x00007ffff7a9e552 <memset+146>: cmp %rdx,%rcx
726 0x00007ffff7a9e555 <memset+149>: je 0x7ffff7a9e4fa <memset+58>
727 0x00007ffff7a9e4fa <memset+58>: repz retq
728 0x0000000000400561 <bar+52>: leaveq
The instruction trace above tells us that we crashed on return from bar, and that memset was executing just before the return.
(gdb) record instruction-history -
709 0x00007ffff7a9e4cd <memset+13>: punpcklwd %xmm8,%xmm8
710 0x00007ffff7a9e4d2 <memset+18>: pshufd $0x0,%xmm8,%xmm8
711 0x00007ffff7a9e4d8 <memset+24>: cmp $0x40,%rdx
712 0x00007ffff7a9e4dc <memset+28>: ja 0x7ffff7a9e510 <memset+80>
713 0x00007ffff7a9e510 <memset+80>: lea 0x40(%rdi),%rcx
714 0x00007ffff7a9e514 <memset+84>: movdqu %xmm8,(%rdi)
715 0x00007ffff7a9e519 <memset+89>: and $0xffffffffffffffc0,%rcx
716 0x00007ffff7a9e51d <memset+93>: movdqu %xmm8,-0x10(%rdi,%rdx,1)
717 0x00007ffff7a9e524 <memset+100>: movdqu %xmm8,0x10(%rdi)
718 0x00007ffff7a9e52a <memset+106>: movdqu %xmm8,-0x20(%rdi,%rdx,1)
(gdb)
699 0x00007ffff7a9e5b6 <memset+246>: retq
700 0x000000000040054b <bar+30>: lea -0x10(%rbp),%rax
701 0x000000000040054f <bar+34>: mov $0x64,%edx
702 0x0000000000400554 <bar+39>: mov $0x41,%esi
703 0x0000000000400559 <bar+44>: mov %rax,%rdi
704 0x000000000040055c <bar+47>: callq 0x400410 <memset@plt>
... And this is where memset was called from.
705 0x0000000000400410 <memset@plt+0>: jmpq *0x200c02(%rip) # 0x601018 <memset@got.plt>
706 0x00007ffff7a9e4c0 <memset+0>: movd %esi,%xmm8
707 0x00007ffff7a9e4c5 <memset+5>: mov %rdi,%rax
708 0x00007ffff7a9e4c8 <memset+8>: punpcklbw %xmm8,%xmm8
Upvotes: 1
Reputation: 213375
So is there any way to trap on a jump/call to zero address before it is actually taken, without doing single-stepping or emulation?
No.
Is there maybe some register holding address of previous instruction?
Not on x86 (there is such a register on HPPA).
Since from your followup comments it appears that you have a stack overflow that wipes the return address and eventually causes you to return to 0, note that:
Since you suspect a race condition, note that thread sanitizer is even better for finding these.
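To make the thread sanitizer suggestion concrete, here is a minimal sketch of the kind of bug it is designed to catch. It is not taken from your program: the names `racy_worker`, `locked_worker`, `run_two_threads` and `ITERATIONS` are made up for illustration, and a shared counter is assumed to stand in for whatever state your threads actually share.

```c
#include <pthread.h>
#include <stddef.h>

#define ITERATIONS 100000

/* Hypothetical shared state, touched by both threads. */
long counter;
pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Racy worker: increments the shared counter with no synchronization. */
void *racy_worker(void *arg)
{
  (void)arg;
  for (int i = 0; i < ITERATIONS; i++)
    counter++;  /* data race: unsynchronized read-modify-write */
  return NULL;
}

/* Fixed worker: the same loop, but each increment holds the mutex. */
void *locked_worker(void *arg)
{
  (void)arg;
  for (int i = 0; i < ITERATIONS; i++) {
    pthread_mutex_lock(&counter_lock);
    counter++;
    pthread_mutex_unlock(&counter_lock);
  }
  return NULL;
}

/* Runs two threads with the given worker and returns the final counter,
   or -1 if a thread could not be created. */
long run_two_threads(void *(*worker)(void *))
{
  pthread_t a, b;

  counter = 0;
  if (pthread_create(&a, NULL, worker, NULL) != 0)
    return -1;
  if (pthread_create(&b, NULL, worker, NULL) != 0) {
    pthread_join(a, NULL);
    return -1;
  }
  pthread_join(a, NULL);
  pthread_join(b, NULL);
  return counter;
}
```

If you add a main that calls run_two_threads(racy_worker) and build it with something like gcc -g -fsanitize=thread race.c -pthread (race.c being a placeholder file name), the sanitizer should report a data race on counter at run time, with stack traces for both conflicting accesses. Calling run_two_threads(locked_worker) instead should run without a report.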
Upvotes: 1