Reputation: 61
I'm trying to learn ARM Assembly and there are a couple of lines in my computer-generated .s file that are confusing me, primarily, this block:
.L6:
.word .LC0-(.LPIC8+4)
.word .LC1-(.LPIC10+4)
.word .LC0-(.LPIC11+4)
and how it relates to this block:
.LCFI0:
ldr r3, .L6
.LPIC8:
add r3, pc
My best guess tells me that this is loading the memory address of (the beginning of) my ascii string into r3, but I'm confused how this is happening exactly. .LC0-(.LPIC8+4) is the difference between where the add r3, pc is called and where the string is located. Adding pc to that difference should end back up at the string, but why not just directly call
ldr r3, .LC0
instead of having these .word things and this awkward ldr/add pair? Is this the only or best way for the compiler to handle this, or is it just the result of some generic algorithm the compiler uses to produce code like this?
Also, what is
@ sp needed for prologue
It sounds like a reminder for the compiler to add the stack pointer handling to the prologue. But I feel like that should have already happened, and that's not where the prologue is.
Below is most of the assembly code with my hopefully correct comments (there's some debugging stuff at the end, but it's too long to include).
Any help that anyone could provide would be much appreciated!
.arch armv5te
.fpu softvfp
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.eabi_attribute 26, 2
.eabi_attribute 30, 2
.eabi_attribute 18, 4
.code 16
.file "helloneon.c"
.section .debug_abbrev,"",%progbits
.Ldebug_abbrev0:
.section .debug_info,"",%progbits
.Ldebug_info0:
.section .debug_line,"",%progbits
.Ldebug_line0:
.text
.Ltext0:
.section .text.main,"ax",%progbits
.align 2
.global main
.code 16
.thumb_func
.type main, %function
main:
.fnstart
.LFB4:
.file 1 "jni/helloneon.c"
.loc 1 4 0
.save {r4, lr}
push {r4, lr}
.LCFI0:
.loc 1 4 0
ldr r3, .L6 ; r3 = char* hello, first position
.LPIC8:
add r3, pc ; pc = program counter, r3 += pc?
.loc 1 3 0
mov r1, r3 ; r1 = r3
add r1, r1, #127 ; r1 += 127
.L2:
.loc 1 10 0 ; r2 = holding an item in char* hello. r3 = pointer to location in hello
ldrb r2, [r3] ; r2 = r3 load next char
sub r2, r2, #32 ; r2 -=32 subtract 32 to char in register
strb r2, [r3] ; r3 = r2 put uppercase char
add r3, r3, #1 ; r3 += 1
.loc 1 8 0
cmp r3, r1 ; compare r3, r1
bne .L2 ; if not equal, goto L2
.loc 1 13 0
ldr r0, .L6+4 ; r0 =
ldr r1, .L6+8 ; r1 =
.loc 1 16 0
@ sp needed for prologue
.loc 1 13 0
.LPIC10:
add r0, pc ; r0 += pc
.LPIC11:
add r1, pc ; r1 += pc
bl printf ; goto printf
.loc 1 16 0
mov r0, #0 ; r0 = 0
pop {r4, pc} ; epilogue
.L7:
.align 2
.L6:
.word .LC0-(.LPIC8+4) ;
.word .LC1-(.LPIC10+4) ;
.word .LC0-(.LPIC11+4) ;
.LFE4:
.fnend
.size main, .-main
.section .rodata.str1.4,"aMS",%progbits,1
.align 2
.LC0:
.ascii "helloworldthisismytestprogramtoconvertlowcharstobig"
.ascii "charsiamtestingneonandineedaninputofonehundredandtw"
.ascii "entyeightcharactersinleng\000"
.LC1:
.ascii "%s\000"
.section .debug_frame,"",%progbits
.Lframe0:
.4byte .LECIE0-.LSCIE0
.LSCIE0:
.4byte 0xffffffff
.byte 0x1
.ascii "\000"
.uleb128 0x1
.sleb128 -4
.byte 0xe
.byte 0xc
.uleb128 0xd
.uleb128 0x0
.align 2
and here's the c code:
#include <stdio.h>
int main()
{
char* hello = "helloworldthisismytestprogramtoconvertlowcharstobigcharsiamtestingneonandineedaninputofonehundredandtwentyeightcharactersinleng"; // len = 127 + \0
int i, size = 127;
for (i = 0; i < size; i++)
{
hello[i] -= 32;
}
printf("%s", hello);
return 0;
}
Upvotes: 4
Views: 4941
Reputation: 58467
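ldr r3, .L6 is a pseudo-instruction. What it actually translates to is something like ldr r3,[pc, #offset], where offset is the distance in memory between the PC value seen by the LDR instruction and the location of the value it's trying to load. In Thumb code such as this, the PC reads as the address of the current instruction plus 4, which is also why the +4 appears in expressions like .LC0-(.LPIC8+4).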
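To make the address arithmetic concrete, here is a minimal sketch in C of how a 16-bit Thumb literal load finds its data. The function name and the example addresses are made up for illustration. It assumes the 16-bit Thumb encoding, where the offset is stored as an 8-bit count of words and the PC is rounded down to a multiple of 4 before the offset is added.

```c
#include <stdint.h>

/* Address read by a 16-bit Thumb "ldr rX, [pc, #imm8*4]" located at
 * instr_addr. In Thumb state the PC reads as the instruction's address
 * plus 4, and the literal load rounds that value down to a multiple of 4
 * before adding the offset. (Hypothetical helper, for illustration only.) */
uint32_t thumb_literal_address(uint32_t instr_addr, uint8_t imm8)
{
    uint32_t pc = instr_addr + 4u;   /* PC as seen by the instruction */
    uint32_t base = pc & ~3u;        /* word-align the PC */
    return base + (uint32_t)imm8 * 4u;
}
```

For example, an LDR at the made-up address 0x8002 with imm8 = 3 would read the word at 0x8010: the PC is 0x8006, aligned down to 0x8004, plus 12 bytes.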
The fixed-width instructions of ARM processors mean that you only have so many bits to spend on offsets in LDR/STR instructions, which in turn means that values loaded through PC-relative loads must be stored fairly close to the corresponding load instructions.
.LC0 is in a completely different section than .LPIC8, and therefore most likely is out of range for a PC-relative load.
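As a rough illustration of that range limit, the sketch below checks whether a literal is reachable from a given load instruction. It models only the 16-bit Thumb encoding (the only one available to `.code 16` on armv5te), which can reach forward by at most 255 words, i.e. 1020 bytes; ARM-state loads have a larger range of about 4 KB in either direction. The function name and addresses are hypothetical.

```c
#include <stdint.h>

/* Returns the imm8 field for a 16-bit Thumb literal load at instr_addr
 * that targets literal_addr, or -1 if the literal cannot be reached.
 * (Hypothetical helper; it models the encoding, not any real assembler.) */
int thumb_literal_imm8(uint32_t instr_addr, uint32_t literal_addr)
{
    uint32_t base = (instr_addr + 4u) & ~3u;  /* aligned PC value */
    uint32_t delta;

    if (literal_addr < base)
        return -1;                 /* this encoding only reaches forward */

    delta = literal_addr - base;
    if (delta % 4u != 0)
        return -1;                 /* the literal must be word-aligned */
    if (delta > 1020u)
        return -1;                 /* imm8 is at most 255, times 4 bytes */

    return (int)(delta / 4u);
}
```

A string placed in .rodata, possibly many kilobytes away from the function, would typically fail this check, which is why the compiler keeps a small pool of words (.L6) right after the function instead.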
Some ARM assemblers provide an .LTORG directive that can be used to scatter "pools" of literals across the same section as the code. For example, this:
LDR r3,=.LC0 ; note the '='
....
.LTORG
Would in such an assembler translate into something like this:
LDR r3,[pc,#offset_to_LC0Value]
....
LC0Value: .word .LC0
In the assembly code shown in the question it's not just the loads that are PC-relative; the values that are being loaded are also PC-relative. The reason for that is to get Position Independent Code. If absolute addresses were being loaded and then used, the code would fail unless it was executed from a specific virtual address. Accessing all relevant data through addresses relative to the current PC allows you to break that dependency.
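The arithmetic behind the .word / ldr / add sequence can be sketched in C as follows. The function names, offsets and load addresses are invented for illustration; the only assumption taken from the listing is that the code is Thumb, so the PC seen by add r3, pc is the address of that instruction plus 4.

```c
#include <stdint.h>

/* What the assembler stores in the literal pool: .LC0 - (.LPIC8 + 4).
 * Both arguments are offsets within the linked image (hypothetical values). */
uint32_t pic_literal(uint32_t lc0_offset, uint32_t lpic8_offset)
{
    return lc0_offset - (lpic8_offset + 4u);
}

/* What "add r3, pc" leaves in r3 at run time when the image has been
 * loaded at load_base. The PC reads as the add instruction's address + 4. */
uint32_t runtime_string_address(uint32_t load_base,
                                uint32_t lpic8_offset,
                                uint32_t literal)
{
    uint32_t pc = load_base + lpic8_offset + 4u;
    return literal + pc;   /* wraps modulo 2^32, like the register add */
}
```

With .LC0 at offset 0x2000 and .LPIC8 at offset 0x104, the stored literal is 0x1EF8. Loaded at 0x10000 the add yields 0x12000, and loaded at 0x40000000 it yields 0x40002000: in both cases the load base plus the string's offset, without the literal itself ever changing.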
Upvotes: 3