Reputation: 2877
So I had a simple ARM assembly (specifically THUMB) program being compiled for a TI Microcontroller. I'm just confused as to where EQU and DCD are stored in memory (RAM vs ROM) and how the AREA directive relates to that. I started off with this:
Y1 EQU 0x23
AREA |.text|, CODE, READONLY, ALIGN=2
THUMB
X2 DCD 0x23
Y2 EQU 0x23
MOV R0, #0
LDR R1, =X2
STR R0, [R1]
END
I assumed that since EQUs are constant, they go in ROM. But here they appear both in the CODE section, which is READONLY (so I'm assuming that goes in ROM), and in a region that has no AREA directive at all. I'm not sure what the default is there.
DCD was declared in a READONLY section, yet I'm still allowed to write to it.
If I add a DCD to the empty section I get an error: Area directive missing.
If I add the AREA directive then the code looks like this:
AREA |.data|, DATA
X1 DCD 0x23
Y1 EQU 0x23
AREA |.text|, CODE, READONLY, ALIGN=2
THUMB
EXPORT Start
X2 DCD 0x23
Y2 EQU 0x23
Start
MOV R0, #0
LDR R1, =X1
STR R0, [R1]
MOV R0, #0
LDR R1, =X2
STR R0, [R1]
END
EQUs and DCDs are everywhere, and the AREA directives don't seem to affect how I can access them at all. Adding READONLY to the AREA DATA directive has no effect either.
Upvotes: 2
Views: 2591
Reputation: 71576
I'll use an assembler I have access to (the GNU assembler). The answers should port between the two assembly languages, since a number of your questions are about the instruction set, not the assembly language.
.equ X1,0x12345678
.text
.thumb
.globl _start
_start:
ldr r0,=X1
ldr r1,=X2
ldr r2,[r1]
ldr r3,=Y4
ldr r4,=Y3
str r3,[r4]
bl bounce
mov lr,pc
ldr r5,=bounce
bx r5
b .
X2: .word 0xAABBCCDD
.thumb_func
bounce:
bx lr
nop
.data
Y3: .word 0
Y4: .word 0x11223344
Assemble, link, and disassemble:
00001000 <_start>:
1000: 4807 ldr r0, [pc, #28] ; (1020 <bounce+0x4>)
1002: 4908 ldr r1, [pc, #32] ; (1024 <bounce+0x8>)
1004: 680a ldr r2, [r1, #0]
1006: 4b08 ldr r3, [pc, #32] ; (1028 <bounce+0xc>)
1008: 4c08 ldr r4, [pc, #32] ; (102c <bounce+0x10>)
100a: 6023 str r3, [r4, #0]
100c: f000 f806 bl 101c <bounce>
1010: 46fe mov lr, pc
1012: 4d07 ldr r5, [pc, #28] ; (1030 <bounce+0x14>)
1014: 4728 bx r5
1016: e7fe b.n 1016 <_start+0x16>
00001018 <X2>:
1018: aabbccdd bge feef4394 <X1+0xecbaed1c>
0000101c <bounce>:
101c: 4770 bx lr
101e: 46c0 nop ; (mov r8, r8)
1020: 12345678 eorsne r5, r4, #120, 12 ; 0x7800000
1024: 00001018 andeq r1, r0, r8, lsl r0
1028: 00002004 andeq r2, r0, r4
102c: 00002000 andeq r2, r0, r0
1030: 0000101d andeq r1, r0, sp, lsl r0
Disassembly of section .data:
00002000 <__data_start>:
2000: 00000000 andeq r0, r0, r0
00002004 <Y4>:
2004: 11223344 ; <UNDEFINED> instruction: 0x11223344
Disassembly of section .ARM.attributes:
00000000 <.ARM.attributes>:
0: 00001341 andeq r1, r0, r1, asr #6
4: 61656100 cmnvs r5, r0, lsl #2
8: 01006962 tsteq r0, r2, ror #18
c: 00000009 andeq r0, r0, r9
10: 01090206 tsteq r9, r6, lsl #4
So it took the ldr r0,=X1 (with X1 defined as 0x12345678) and turned it into this
1000: 4807 ldr r0, [pc, #28] ; (1020 <bounce+0x4>)
and this
1020: 12345678 eorsne r5, r4, #120, 12 ; 0x7800000
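The load is 32 bits because it is an ldr (ldrb would load 8 bits and ldrh 16 bits, zero-extended into the register). It takes the pc, which reads as two instructions (4 bytes) ahead of the current one, and adds 28 to it: 0x1000 + 4 + 28 = 0x1000 + 0x20 = 0x1020. At that address the assembler placed the data 0x12345678. For an instruction whose address is not a multiple of 4, the pc value is first rounded down to a multiple of 4, which is why the load at 0x1002 with offset 32 lands on 0x1024. The same goes for all the other =something loads.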
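The address arithmetic above can be checked with a short Python sketch. It assumes the 16-bit Thumb encoding of ldr Rt, [pc, #imm] (top five bits 01001, then a 3-bit register and an 8-bit word offset) and the rule that the pc reads as the instruction address plus 4, rounded down to a multiple of 4. The function names are made up for illustration.

```python
def decode_ldr_literal(halfword):
    """Decode a 16-bit Thumb 'ldr Rt, [pc, #imm]' into (Rt, byte offset)."""
    if (halfword >> 11) != 0b01001:
        raise ValueError("not a Thumb pc-relative ldr")
    rt = (halfword >> 8) & 0x7
    offset = (halfword & 0xFF) * 4  # imm8 counts words, not bytes
    return rt, offset


def literal_address(insn_addr, offset):
    """Address the load reads: (insn_addr + 4) rounded down to 4, plus offset."""
    return ((insn_addr + 4) & ~3) + offset


# (address, encoding) pairs taken from the disassembly above
for addr, enc in [(0x1000, 0x4807), (0x1002, 0x4908), (0x1012, 0x4D07)]:
    rt, offset = decode_ldr_literal(enc)
    target = literal_address(addr, offset)
    print(f"{addr:#x}: ldr r{rt}, [pc, #{offset}] -> {target:#x}")
```

The three targets it computes (0x1020, 0x1024, 0x1030) are the addresses shown in the disassembler's comments for those instructions.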
I could have done that myself though and not relied on a pseudo instruction.
.text
.thumb
.globl _start
_start:
ldr r0,xyz
ldr r1,xyz_add
ldr r2,[r1]
b .
xyz: .word 0x12345678
xyz_add: .word xyz
Unlinked is good enough here:
00000000 <_start>:
0: 4801 ldr r0, [pc, #4] ; (8 <xyz>)
2: 4902 ldr r1, [pc, #8] ; (c <xyz_add>)
4: 680a ldr r2, [r1, #0]
6: e7fe b.n 6 <_start+0x6>
00000008 <xyz>:
8: 12345678 eorsne r5, r4, #120, 12 ; 0x7800000
0000000c <xyz_add>:
c: 00000008 andeq r0, r0, r8
Because the data is in the same section and nearby, I can load 0x12345678 directly with a pc-relative load, which is basically what the =0x12345678 pseudo-instruction does; I don't need to get the address first and then load from it. For far-away things you can still place a data item that holds the address, load that, and then load from it (double indirect), as r1 and r2 do above.
.text
.thumb
.globl _start
_start:
ldr r0,=0x11223344
ldr r1,=5
With at least one assembler you can use the =something trick for everything, and the assembler will hopefully optimize it to a move immediate if the value fits.
00000000 <_start>:
0: 4800 ldr r0, [pc, #0] ; (4 <_start+0x4>)
2: 2105 movs r1, #5
4: 11223344
It would have been nice if it also went the other way, so that a mov immediate whose value doesn't fit became a pc-relative load, but I don't think they do that.
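As a rough model of the choice seen in that listing, here is a Python sketch. It only mirrors the observed behavior for low registers in 16-bit Thumb (movs when the value fits in 8 bits, otherwise a literal-pool load) and is not the assembler's real algorithm; encode_load_constant is a hypothetical name.

```python
def encode_load_constant(rd, value):
    """Model 'ldr rd, =value': return (16-bit instruction, literal or None)."""
    if not 0 <= rd <= 7:
        raise ValueError("low registers only")
    if 0 <= value <= 0xFF:
        # movs rd, #imm8 is encoded as 00100 rd imm8
        return 0x2000 | (rd << 8) | value, None
    # ldr rd, [pc, #0] is encoded as 01001 rd imm8. The zero offset is a
    # placeholder: a real assembler fills it in once it knows where the
    # literal pool ends up.
    return 0x4800 | (rd << 8), value & 0xFFFFFFFF


print(hex(encode_load_constant(1, 5)[0]))           # the movs case
print(hex(encode_load_constant(0, 0x11223344)[0]))  # the literal-pool case
```

For the two loads in the listing this yields the same encodings the assembler produced, 0x2105 and 0x4800.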
Now translate that to your assembly language. The AREA directives declare sections such as .text and .data; the linker later decides where those end up in memory. Some linkers can modify the code by more than just an immediate offset, and some can at times replace a whole instruction (to trampoline off linker-inserted code as needed). In this case the linker fills in the addresses of the data locations the assembler allocated in those sections.
You can have data items in .text as well as in .data. The data items in .text are read-only things, be it constants like tables or the addresses of things in other linked-in code or sections, which the linker has to fill in because they are not resolved at assemble time.
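The linker's part can be sketched in Python as a lookup from symbol to final address. The section bases (0x1000 and 0x2000) and the symbol offsets are read off the disassembly earlier in this answer; the table layout and the resolve function are invented for illustration, and a real linker works from relocation records rather than a dictionary.

```python
SECTION_BASE = {".text": 0x1000, ".data": 0x2000}  # bases seen in the listing

# symbol -> (section, offset within section, is a Thumb function)
SYMBOLS = {
    "X2":     (".text", 0x18, False),
    "bounce": (".text", 0x1C, True),   # marked with .thumb_func
    "Y3":     (".data", 0x00, False),
    "Y4":     (".data", 0x04, False),
}


def resolve(name):
    """Return the value the linker writes into the literal pool for name."""
    section, offset, is_thumb_func = SYMBOLS[name]
    address = SECTION_BASE[section] + offset
    # Addresses of Thumb functions carry bit 0 set so that bx stays in Thumb mode
    return (address | 1) if is_thumb_func else address


for name in ("X2", "Y4", "Y3", "bounce"):
    print(f"{name}: {resolve(name):#010x}")
```

These are the four address words at 0x1024 to 0x1030 in the linked output, including 0x101d for bounce.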
EQU is historically the assembly language version of a simple define in C:
#define ABCD 0x12345678
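Before compiling, a pass is done that searches for instances of ABCD and replaces them with 0x12345678. The same goes for the assembler, so an EQU occupies no memory by itself; the value only shows up where it is used, for example in a literal pool. Unlike C, you might not be able to do more than a simple search and replace, since assembler macros have a different syntax, but EQU is define-like.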
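A toy location counter in Python makes the difference between a name and storage concrete. It is only a sketch: it handles just EQU and DCD, the tuple format is invented, and X3 is a made-up extra label.

```python
def layout(lines):
    """Walk (label, directive, value) tuples with a location counter.

    Returns (symbols, image): symbols maps each label to a value (EQU) or a
    byte offset (DCD); image is the list of 32-bit words actually emitted.
    """
    symbols, image = {}, []
    for label, directive, value in lines:
        if directive == "EQU":
            symbols[label] = value           # a name for a number, nothing emitted
        elif directive == "DCD":
            symbols[label] = 4 * len(image)  # the label is the word's offset
            image.append(value)              # one 32-bit word of storage
        else:
            raise ValueError(f"unsupported directive {directive}")
    return symbols, image


symbols, image = layout([
    ("X2", "DCD", 0x23),
    ("Y2", "EQU", 0x23),
    ("X3", "DCD", 0x45),  # hypothetical label, not in the question
])
print(symbols, image)
```

Y2 ends up in the symbol table only, while X2 and X3 each take a word in the section they were declared in, and Y2 does not move X3's offset.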
DCD, DCB, etc. are like .word, .byte in the GNU assembler. They say: I want to put some raw data here, or allocate space for raw data here; not instructions, but data for whatever reason I want to use it.
One would hope that if the assembler has a READONLY directive, it honors it; if it doesn't, that would bother me. But at the same time, the well-used names .text and .data might trump it.
Upvotes: 1