Reputation: 2646
I'm importing some stack-tracing C code (found somewhere on Stack Overflow) into my code to trace where memory blocks were allocated:
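/* Header of an x86 stack frame when frame pointers are kept. */
struct layout
{
    struct layout *ebp;  /* saved frame pointer of the caller */
    void *ret;           /* return address into the caller */
};
struct layout *fr;
/* Read the current frame pointer (32-bit x86 only). */
__asm__("movl %%ebp, %[fp]" : /* output */ [fp] "=r" (fr));
/* x[] and dsRAM are declared elsewhere (not shown). */
for (int i=1 ; i<8 && (unsigned char*) fr > dsRAM; i++) {
    x[i] = (size_t) fr->ret;
    fr = fr->ebp;
}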
Things work fairly well, except that in some calls, the code is missing some functions near the top of the stack, e.g. GDB will report:
While the code fills x[] with the addresses of malloc, operator new and main(), missing TestBasicScript.
The code is compiled with g++ 4.5.1 (an old devkit for homebrew console programming) with the following flags:
CFLAGS += -I libgeds/source/ -I wrappers -I $(DEVKITPRO)/include -DARM9 \
-include wrappers/nds/system.h -include wrappers/fake.h
CFLAGS += -m32 -Duint=uint32_t -g -Wall -Weffc++ -fno-omit-frame-pointer
I tried to use __builtin_return_address() instead, but I get pretty much the same result with much longer code.
EDIT: I noticed that I'm systematically missing the caller of operator new, which could be explained if the code of _Znwj doesn't set up a stack frame. So the list of questions becomes:
How does GDB manage to find that TestBasicScript() function call if it's not in the list of stack frames?
How do I configure the linking step so that a debug-friendly variant of libstdc++ (if any) is used?
The original sub-question, "Are there compile-time options that guarantee I can trace 100% of the calls to my malloc clone?", is thus answered by @chqrlie: -O0 is all I should need. But it will be effective only if applied to all my binaries, shared libraries included.
Upvotes: 3
Views: 2519
Reputation: 6066
There are many reasons why some frames might be omitted, such as inlining and other optimizations (although the provided CFLAGS contain no optimization flags, and the default is, AFAIK, no optimization).
Anyway, on GCC/glibc toolchains there is built-in support for stack walking via backtrace() and backtrace_symbols() (declared in <execinfo.h>), perhaps combined with abi::__cxa_demangle(); you can try those as well.
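As a rough illustration of the backtrace() / backtrace_symbols() route, here is a minimal sketch. It assumes a glibc-based system that ships <execinfo.h>; the helper name print_backtrace and the MAX_FRAMES limit are made up for this example, and demangling is left out.

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 16  /* arbitrary limit chosen for this sketch */

/* Hypothetical helper: prints the current call stack to stderr and
 * returns the number of frames that backtrace() captured. */
int print_backtrace(void)
{
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);

    /* backtrace_symbols() returns one malloc'ed array of strings;
     * only the array itself must be freed. */
    char **symbols = backtrace_symbols(frames, count);
    if (symbols == NULL)
        return count;  /* addresses were captured, but no text is available */

    for (int i = 0; i < count; i++)
        fprintf(stderr, "#%d %s\n", i, symbols[i]);

    free(symbols);
    return count;
}
```

Calling print_backtrace() from inside the malloc clone would list the return addresses it can see. Function names typically only show up for symbols exported to the dynamic symbol table, so linking with -rdynamic usually helps; otherwise the raw addresses can be resolved offline.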
Another option is to use libunwind. I tried it as well, with quite good results (and its source code shows some useful techniques for in-app stack walking).
None of the above usually works very well with optimized (release) executables. In particular, if they contain no debug info (although it might have been generated and stored separately), the printed stack will be useless, on top of the frames skipped because of optimization.
An ultimate technique that works even for optimized code is generating a core dump. There you have all the information about the stack (the binary itself does not need to contain the debug info; it can be kept aside and used only for examining the core offline) and, as a bonus, the values of all variables on the stack, information about all currently running threads, etc. For tracing memory allocations it is probably overkill (it is also quite slow), but sometimes it can be pretty useful. In one of my projects I created a working implementation of such a core dumper, which is still present in the production code.
Note that you can actually generate a core dump of the app without terminating the application. The implementation I created basically works as follows:
1. fork() the process at the point where the core dump should be generated
2. abort() in the child to generate the core dump (the call stack of the forked process is the same as that of the original process), i.e. only the forked process is terminated by the abort()
3. waitpid() to wait until the child process generates the core dump and terminates (with a guard counter to not wait forever)
This turned out to work pretty well in some situations where a diagnostic stack trace was required for a release production application.
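The three steps above could look roughly like the following. This is only a sketch, not the production implementation mentioned in this answer: the helper name dump_core_and_continue, the 100-try guard counter and the 50 ms polling interval are all assumptions made for the example.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if the forked child was terminated by
 * SIGABRT (the signal that normally triggers the core dump), -1 otherwise. */
int dump_core_and_continue(void)
{
    pid_t child = fork();          /* step 1: duplicate the process */
    if (child < 0)
        return -1;

    if (child == 0)
        abort();                   /* step 2: only the child dies here */

    /* Step 3: poll with a guard counter so the parent never waits forever
     * (assumed limits: 100 tries, 50 ms apart, about 5 seconds in total). */
    for (int tries = 0; tries < 100; tries++) {
        int status = 0;
        pid_t done = waitpid(child, &status, WNOHANG);
        if (done == child)
            return (WIFSIGNALED(status) && WTERMSIG(status) == SIGABRT) ? 0 : -1;
        if (done < 0)
            return -1;

        struct timespec pause = { 0, 50 * 1000 * 1000 };
        nanosleep(&pause, NULL);
    }
    return -1;  /* gave up waiting; the child is left to finish on its own */
}
```

Whether a core file is actually written depends on the environment, for example the core size limit (ulimit -c) and the system's core pattern setting. The sketch only checks that the child was terminated by SIGABRT.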
EDIT: Another option, which I also tried, is using ptrace() (if I remember well, that is also one of the techniques used by the libunwind mentioned above, and actually also by GDB). It works in a similar way: spawn a child process with fork() and call ptrace(PTRACE_TRACEME) in it; the parent process can then issue various ptrace() calls to examine the stack of the child (which happens to be the same as the stack of the parent at the point of fork()). I think the libunwind source code contains such a use, so you can examine it there.
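To make the ptrace() idea a bit more concrete, here is a minimal Linux-only sketch. It does not walk frames; it only shows the parent reading one word from the child's copy of the stack, which is the building block a real unwinder would repeat for saved frame pointers and return addresses. The helper name peek_child_stack and the marker value are assumptions made for the example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if the word read from the child's stack
 * matches the parent's own copy, -1 on any failure. */
int peek_child_stack(void)
{
    /* volatile forces the value into its stack slot before fork() */
    volatile long marker = 0x5AFE;

    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Ask to be traced by the parent, then stop so it can inspect us. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);
        _exit(0);
    }

    int status = 0;
    if (waitpid(child, &status, 0) != child || !WIFSTOPPED(status))
        return -1;  /* wait failed, or the child exited early (already reaped) */

    /* The child's stack is a copy of ours, so &marker is valid there too.
     * PTRACE_PEEKDATA returns the word itself, hence the errno check. */
    errno = 0;
    long word = ptrace(PTRACE_PEEKDATA, child, (void *)&marker, NULL);
    int ok = (errno == 0 && word == marker);

    kill(child, SIGKILL);          /* the snapshot is no longer needed */
    waitpid(child, &status, 0);
    return ok ? 0 : -1;
}
```

A full stack walk would additionally fetch the child's registers and follow the frame chain word by word, which is architecture-specific and therefore left out here. Some sandboxes and hardened systems restrict ptrace(), in which case the helper simply returns -1.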
Upvotes: 3
Reputation: 144685
The compiler may not always generate a stack frame with %ebp pointing to the previous frame. For some functions, it may generate code that uses %esp-based addressing to retrieve the arguments; for others, it may implement a tail call as a jump instead of a call/ret sequence. The stack trace as you are trying to scan it may therefore be incomplete.
Try compiling the whole project with optimisation disabled (-O0).
Upvotes: 2