Reputation: 31
I have parsed out the addresses, file names and line numbers from a dSYM file for an iOS app. I basically have a table that maps an address to a file name and line number, which is very helpful for debugging.
To get the actual lookup address, I use the stack trace address from the crash report and the formula specified in this answer: https://stackoverflow.com/a/13576028/2758234. So something like this:
(actual lookup address) = (stack trace address) + (virtual memory slide) - (image load address)
I use that address and look it up on my table. The file name I get is correct, but the line number always points to the end of the function or method that was called, not the actual line that called the following function on the stack trace.
I read somewhere, can't remember where, that frame addresses have to be de-tagged, because they are aligned to double the system pointer size. So on 32-bit systems, where the pointer size is 4 bytes, we de-tag using 8 bytes, with a formula like this:
(de-tagged address) = (tagged address) & ~(sizeof(uintptr_t)*2 - 1)
where uintptr_t is the data type used for pointers in Objective-C.
After doing this, the lookup sort of works, but I have to do something like find the closest address that is less than or equal to the de-tagged address.
Question #1:
Why do I have to de-tag a stack frame address? Why in the stack trace aren't the addresses already pointing to the right place?
Question #2:
Sometimes in the crash report there seems to be a missing frame. For example, if function1() calls function2() which calls function3() which calls function4(), in my stack trace I will see something like:
0 Exception
1 function4()
2 function3()
4 function1()
And the stack trace address for function3() (frame 2, above) doesn't even point to the right line number (but it is the right file, though), even after de-tagging. I see this even when I let Xcode symbolicate a crash report.
Why does this happen?
Upvotes: 3
Views: 3640
Reputation: 15425
For question #1, the addresses in an iOS crash report have three components that are taken into account: The original load address of your app, the random slide value that was added to that address when your app was launched, and the offset within the binary. At the end of the crash report, there should be a line showing the actual load address of your binary.
To compute the slide, you need to take the actual load address from the crash report and subtract the original load address. This tells you the random slide value that was applied to this particular launch of the app.
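As a rough illustration of that arithmetic, here is a minimal Python sketch. The function names and all three example addresses are assumptions chosen for illustration, not values from a real crash report.

```python
def compute_slide(actual_load_address: int, original_load_address: int) -> int:
    """Slide for this launch = where the image was loaded - where it was linked."""
    return actual_load_address - original_load_address


def to_file_address(runtime_address: int, slide: int) -> int:
    """Undo the slide to get the address as it appears in the binary / dSYM."""
    return runtime_address - slide


# Hypothetical values for illustration only.
original_load = 0x4000      # original load address recorded in the binary (assumed)
actual_load = 0x140000      # actual load address from the crash report (assumed)
frame_address = 0x144100    # an address taken from the stack trace (assumed)

slide = compute_slide(actual_load, original_load)
file_address = to_file_address(frame_address, slide)
print(hex(slide), hex(file_address))
```

With these made-up numbers the result is the same as applying the formula from the question: stack trace address, plus original load address, minus actual load address.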
I'm not sure how you derived your table - the problem may lie there. You may want to double check by using lldb. You can load your app into lldb and tell lldb that it should be loaded at address 0x140000 (this would be the actual load address from your crash report; don't worry about slides and original load addresses):
% xcrun lldb
(lldb) target create -d -a armv7 /path/to/myapp.app
(lldb) target modules load -f myapp __TEXT 0x140000
Now lldb has your binary loaded at the actual load address of this crash report. You can do all the usual queries in lldb, such as
(lldb) image lookup -v -a 0x144100
to do a verbose lookup on address 0x144100 (which might appear in your crash report).
You can also do a nifty "dump your internal line table" command in lldb with target modules dump line-table. For instance, I compiled a hello-world Mac app:
(lldb) tar mod dump line-table a.c
Line table for /tmp/a.c in `a.out
0x0000000100000f20: /tmp/a.c:3
0x0000000100000f2f: /tmp/a.c:4:5
0x0000000100000f39: /tmp/a.c:5:1
0x0000000100000f44: /tmp/a.c:5:1
(lldb)
I can change the load address of my binary and try dumping the line table again:
(lldb) tar mod load -f a.out __TEXT 0x200000000
section '__TEXT' loaded at 0x200000000
(lldb) tar mod dump line-table a.c
Line table for /tmp/a.c in `a.out
0x0000000200000f20: /tmp/a.c:3
0x0000000200000f2f: /tmp/a.c:4:5
0x0000000200000f39: /tmp/a.c:5:1
0x0000000200000f44: /tmp/a.c:5:1
(lldb)
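If you build your own lookup table from entries like these, each entry marks the start of an address range, so the lookup is "the last entry whose address is less than or equal to the address in question". A minimal Python sketch, assuming a table copied from the first dump above; the names LINE_TABLE and lookup are made up for illustration, and the sketch does not model where a function or sequence ends, so addresses past the last entry are not rejected.

```python
from bisect import bisect_right

# (start address, source location) pairs, sorted by address.
# Values copied from the first line-table dump above.
LINE_TABLE = [
    (0x100000F20, "/tmp/a.c:3"),
    (0x100000F2F, "/tmp/a.c:4:5"),
    (0x100000F39, "/tmp/a.c:5:1"),
    (0x100000F44, "/tmp/a.c:5:1"),
]


def lookup(address, table=LINE_TABLE):
    """Return the source location of the last entry starting at or before address."""
    starts = [start for start, _ in table]
    index = bisect_right(starts, address) - 1
    if index < 0:
        return None  # address lies before the first entry
    return table[index][1]


# An address in the middle of the range that starts at 0x100000f2f.
print(lookup(0x100000F30))
```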
I'm not sure I understand what you're doing with the de-tagging of the addresses. The addresses on the call stack are the return addresses of these functions, not the call instruction - so these may point to the line following the actual method invocation / dispatch source line, but that's usually easy to understand when you're looking at the source code. If all of your lookups are pointing to the end of the methods, I think your lookup scheme may have a problem.
As for question #2, the unwind of frame #1 can be a little tricky at times if frame #0 (the currently executing frame) is a leaf function that doesn't set up a stack frame, or is in the process of setting up a stack frame. In those cases, frame #1 can get skipped. But once you're past frame #1, especially on arm, the unwind should not miss any frames.
There is one very edge-casey wrinkle: when a function marked noreturn calls another function, the last instruction of the function may be a call -- with no function epilogue -- because the compiler knows it will never get control again. Pretty uncommon. But in that case, a simple-minded symbolication of the return address will give you a pointer to the first instruction of the next function in memory. Debuggers et al use a trick where they subtract 1 from the return address when doing symbol / source line lookup to sidestep this issue, but it's not something casual symbolicators usually need to worry about. And you have to be careful not to do the decr-pc trick on the currently-executing function (frame 0), because a function may have just started executing and you don't want to back up the pc into the previous function and symbolicate incorrectly.
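To make the decr-pc trick concrete, here is a small Python sketch. The symbol table, function names and addresses are all hypothetical; it only shows the idea of subtracting 1 from return addresses (every frame except frame 0) before the lookup, and is not how any particular debugger implements it.

```python
from bisect import bisect_right

# Hypothetical function start addresses, sorted. The first function is
# imagined to be noreturn and to end with a call at the very end of its
# body, so the return address equals the start of the next function.
FUNCTIONS = [
    (0x1000, "abort_with_message"),
    (0x1040, "unrelated_next_function"),
]


def symbol_for(address):
    """Return the name of the last function starting at or before address."""
    starts = [start for start, _ in FUNCTIONS]
    index = bisect_right(starts, address) - 1
    return FUNCTIONS[index][1] if index >= 0 else None


def lookup_address_for_frame(frame_index, pc):
    """Frame 0 holds the current pc; higher frames hold return addresses."""
    return pc if frame_index == 0 else pc - 1


return_address = 0x1040  # saved return address found in frame 1 (assumed)
naive = symbol_for(return_address)
adjusted = symbol_for(lookup_address_for_frame(1, return_address))
print(naive, adjusted)
```

The naive lookup lands in the next function in memory, while the adjusted one stays inside the function that actually made the call.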
Upvotes: 6