Reputation: 141
I want to see a function's address in my code, so I wrote a hello-world program like this:
#include <stdio.h>

void myfn() {
    printf("I am myfn1\n");
    printf("I am myfn2\n");
    printf("I am myfn3\n");
    printf("I am myfn4\n");
    printf("I am myfn5\n");
}

typedef void (*MYFN)();

int main() {
    MYFN fn = (MYFN)myfn;
    printf("addr of fn: 0x%08X\n", (unsigned int)fn);
    fn();
    printf("just for %s\n", "test");
    return 0;
}
And the result is:
# ./test
addr of fn: 0x00008461
I am myfn1
I am myfn2
I am myfn3
I am myfn4
I am myfn5
just for test
So, the address of myfn is 0x00008461?
Then I used objdump to disassemble it:
84ae: f7ff efb8 blx 8420 <printf@plt>
84b2: f7ff ffd5 bl 8460 <printf@plt+0x40>
84b6: 4807 ldr r0, [pc, #28] ; (84d4 <printf@plt+0xb4>)
84b8: 4907 ldr r1, [pc, #28] ; (84d8 <printf@plt+0xb8>)
84ba: 4478 add r0, pc
84bc: 4479 add r1, pc
84be: f7ff efb0 blx 8420 <printf@plt>
From that, is the address of myfn 0x8460? The code near that address:
8460: 480a ldr r0, [pc, #40] ; (848c <printf@plt+0x6c>)
8462: b510 push {r4, lr}
8464: 4478 add r0, pc
8466: f7ff efd6 blx 8414 <puts@plt>
846a: 4809 ldr r0, [pc, #36] ; (8490 <printf@plt+0x70>)
846c: 4478 add r0, pc
846e: f7ff efd2 blx 8414 <puts@plt>
8472: 4808 ldr r0, [pc, #32] ; (8494 <printf@plt+0x74>)
8474: 4478 add r0, pc
8476: f7ff efce blx 8414 <puts@plt>
847a: 4807 ldr r0, [pc, #28] ; (8498 <printf@plt+0x78>)
847c: 4478 add r0, pc
847e: f7ff efca blx 8414 <puts@plt>
8482: 4806 ldr r0, [pc, #24] ; (849c <printf@plt+0x7c>)
I wonder whether the real address is 0x8460, 0x8461, or 0x8462. Please help me...
Upvotes: 1
Views: 454
Reputation: 71546
This is Thumb code. Read the ARM ARM and the TRM (the Architecture Reference Manual and the Technical Reference Manual).
Specifically, look at the BX and BLX instructions. When branching to code that may be using Thumb instructions (and/or Thumb-2 extensions), bx or blx is used. That is the case here because the compiler doesn't know at compile time whether printf() is Thumb or ARM code, so it has to encode the call with bx or blx; if it were branching to something compiled at the same time, it could use an ordinary branch instead, as the bl at 84b2 does for myfn. When bx or blx takes its target from a register, the lsbit of that address tells the instruction whether it is calling ARM instructions (the lsbit is zero) or Thumb instructions (the lsbit is one). In Thumb mode the program counter does not keep that lsbit set; it is stripped by the bx/blx instruction.
The linker comes through afterwards; it knows which mode each function uses and fills in the appropriate addresses. So the function starts in memory at address 0x8460, but to branch to (call) it using bx or blx you need to use the address 0x8461, because those are Thumb-mode instructions.
The compiler doesn't know why you need the address of the function. Pretty much everywhere the linker fills in a function's address it has to control that lsbit based on the mode, so apparently it is setting it to one here.
The function itself starts at 0x8460. If you have some reason for needing the real address rather than the call-to address, just strip off the lsbit:
printf("addr of fn: 0x%08X\n", (unsigned int)fn & ~1u);
Upvotes: 1