Reputation: 11
I got a core dump from my application, and when I try to analyze it, the stack appears to be corrupt. What could cause this?
Program terminated with signal 11, Segmentation fault.
#0 0x40173f54 in nanosleep () from /lib/libc.so.6
(gdb) bt
#0 0x40173f54 in nanosleep () from /lib/libc.so.6
#1 0x401b2a1c in __libc_enable_asynccancel () from /lib/libc.so.6
#2 0x0000cdb8 in ?? ()
Cannot access memory at address 0x12
(gdb) info frame
Stack level 0, frame at 0xbeaedbc0:
pc = 0x40173f54 in nanosleep; saved pc 0x401b2a1c
called by frame at 0xbeaedbd8
Arglist at 0xbeaedbc0, args:
Locals at 0xbeaedbc0, Previous frame's sp is 0xbeaedbc0
(gdb) info frame 1
Stack frame at 0xbeaedbd8:
pc = 0x401b2a1c in __libc_enable_asynccancel; saved pc 0xcdb8
called by frame at Cannot access memory at address 0x12
(gdb) info frame 2
Stack frame at Cannot access memory at address 0x12
Upvotes: 1
Views: 6926
Reputation: 605
This stack may or may not be corrupt; the same symptom can also appear when the code was built with -fomit-frame-pointer.
For what it's worth, here is my current strategy for this. I don't claim this is an optimal strategy, just the one that works for me at the moment:
Get symbols. The more information you have about the code, the less pain you have to go through re-creating that information yourself.
I reconstruct the stack manually. To do this, I usually start feeding pointer-aligned values found in the stack to 'info symbol' to see if I can get any useful information. Lacking symbols, it can also be useful to disassemble the memory that such a value would point to if it were a pointer, provided that address lies near known code locations. This can reveal calls to locations that do have symbols.
When my stack grows down (as is the case here), I find it can be useful to see which function candidates called the last valid-looking function.
I try to reproduce the problem. If I can get things to fail live, everything is orders of magnitude easier.
I then look at the stack and try to determine the offset at which the corruption started.
I go through the assembly for the function candidates to get hints about which data structures were present at which offsets.
Finally, it's possible that something randomly hit a piece of memory (for example, another thread blew way past its own stack, missed any guard pages and hit your stack). If you don't have any clues yet, it is time to scan memory for pointers to the corrupted part of the stack and then reverse-engineer the data structures you find.
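As a rough illustration of the manual reconstruction step, here is a minimal Python sketch of the same idea done offline: walk a raw stack dump one pointer-sized word at a time and report every value that lands inside a known function, much like feeding each word to 'info symbol' by hand. Everything here is an assumption for illustration: the function name scan_stack_for_code_pointers, the (start, size, name) symbol table layout, and the symbol ranges in the demo are made up, and only the addresses 0xbeaedbc0, 0x401b2a1c, 0xcdb8 and 0x12 come from the session in the question. It also assumes a 32-bit little-endian target and a dump that starts on a word boundary.

```python
from bisect import bisect_right


def scan_stack_for_code_pointers(stack_bytes, stack_base, symbols,
                                 word_size=4, byteorder="little"):
    """Return (stack_address, value, symbol_name, offset_in_symbol) for every
    pointer-aligned word in stack_bytes that falls inside a known symbol.

    symbols is a hypothetical table of (start_address, size, name) tuples,
    e.g. extracted beforehand from a symbol listing of the binary.
    """
    table = sorted(symbols)
    starts = [entry[0] for entry in table]
    hits = []
    for offset in range(0, len(stack_bytes) - word_size + 1, word_size):
        word = stack_bytes[offset:offset + word_size]
        value = int.from_bytes(word, byteorder)
        # Find the last symbol that starts at or before this value.
        idx = bisect_right(starts, value) - 1
        if idx < 0:
            continue
        start, size, name = table[idx]
        if value < start + size:
            hits.append((stack_base + offset, value, name, value - start))
    return hits


if __name__ == "__main__":
    # Made-up symbol ranges; real ones would come from the binary's symbols.
    demo_symbols = [
        (0x40173F00, 0x100, "nanosleep"),
        (0x401B2A00, 0x80, "__libc_enable_asynccancel"),
    ]
    # Four stack words, including the saved pc values seen in the question.
    words = [0x00000012, 0x401B2A1C, 0xDEADBEEF, 0x0000CDB8]
    dump = b"".join(w.to_bytes(4, "little") for w in words)
    for addr, value, name, off in scan_stack_for_code_pointers(
            dump, 0xBEAEDBC0, demo_symbols):
        print(f"{addr:#010x}: {value:#010x} is {name}+{off:#x}")
```

A hit only means the word looks like a code address; it may be a stale return address left over from an earlier call, so each candidate still has to be checked against the disassembly of the function that supposedly made the call.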
Upvotes: 4