Reputation: 11
I am trying to build a binary translator for arm bear metal compiled code and I try to verify proper execution flow by comparing it to that of qemu-arm. I use the following command to dump the program flow:
qemu-arm -d in_asm,cpu -singlestep -D a.flow a.out
I noticed something strange, where the program seems to jump to an irrelevant instruction, since 0x000080b4 is not the branch nor the next instruction following 0x000093ec.
0x000093ec: 1afffff9 bne 0x93d8
R00=00000000 R01=00009c44 R02=00000002 R03=00000000
R04=00000001 R05=0001d028 R06=00000002 R07=00000000
R08=00000000 R09=00000000 R10=0001d024 R11=00000000
R12=f6ffed88 R13=f6ffed88 R14=000093e8 R15=000093ec
PSR=20000010 --C- A usr32
R00=00000000 R01=00009c44 R02=00000002 R03=00000000
R04=00000001 R05=0001d028 R06=00000002 R07=00000000
R08=00000000 R09=00000000 R10=0001d024 R11=00000000
R12=f6ffed88 R13=f6ffed88 R14=000093e8 R15=000093d8
PSR=20000010 --C- A usr32
----------------
IN:
0x000080b4: e59f3060 ldr r3, [pc, #96] ; 0x811c
The instruction that actually executes corresponds to the beggining of the <frame_dummy>
tag in the disassembly. Can someone explain what actually happens within the emulator and is this behavior normal in the ARM architecture? The program was compiled with: arm-none-eabi-gcc --specs=rdimon.specs a.c
Here is the same segment of the program flow without the CPU state:
0x0000804c: e59f3018 ldr r3, [pc, #24] ; 0x806c
0x00008050: e3530000 cmp r3, #0 ; 0x0
0x00008054: 01a0f00e moveq pc, lr
----------------
IN: __libc_init_array
0x000093e8: e1560004 cmp r6, r4
0x000093ec: 1afffff9 bne 0x93d8
----------------
IN:
0x000080b4: e59f3060 ldr r3, [pc, #96] ; 0x811c
0x000080b8: e3530000 cmp r3, #0 ; 0x0
0x000080bc: 0a000009 beq 0x80e8
This is the disassembly of this part:
93d4: 0a000005 beq 93f0 <__libc_init_array+0x68>
93d8: e2844001 add r4, r4, #1
93dc: e4953004 ldr r3, [r5], #4
93e0: e1a0e00f mov lr, pc
93e4: e1a0f003 mov pc, r3
93e8: e1560004 cmp r6, r4
93ec: 1afffff9 bne 93d8 <__libc_init_array+0x50>
93f0: e8bd4070 pop {r4, r5, r6, lr}
Upvotes: 1
Views: 202
Reputation: 58762
It is a reverse jump to previously emitted TB, you don't even have to read that much back:
IN: __libc_init_array
0x000093d8: e2844001 add r4, r4, #1 ; 0x1
0x000093dc: e4953004 ldr r3, [r5], #4
0x000093e0: e1a0e00f mov lr, pc
0x000093e4: e1a0f003 mov pc, r3
----------------
IN: register_fini
0x0000804c: e59f3018 ldr r3, [pc, #24] ; 0x806c
0x00008050: e3530000 cmp r3, #0 ; 0x0
0x00008054: 01a0f00e moveq pc, lr
----------------
IN: __libc_init_array
0x000093e8: e1560004 cmp r6, r4
0x000093ec: 1afffff9 bne 0x93d8
So, qemu is not showing it again. Notice this loop is iterating function pointers, the first one points to register_fini
and the second one to the magical 0x000080b4
address in question (no symbol for it). When this unnamed function conditionally returns with moveq pc, lr
control is transferred back to __libc_init_array
address 0x000093e8
which then determines that the array end has been reached and again just returns to its caller at 0x000093f0
.
Upvotes: 5