rcplusplus

Reputation: 2877

ROM vs RAM in ARM assembly and the AREA directive

I have a simple ARM assembly (specifically Thumb) program that I'm assembling for a TI microcontroller. I'm confused about where EQU and DCD are stored in memory (RAM vs ROM) and how the AREA directive relates to that. I started off with this:

Y1      EQU     0x23


        AREA    |.text|, CODE, READONLY, ALIGN=2
        THUMB

X2      DCD     0x23
Y2      EQU     0x23

    MOV R0, #0
    LDR R1, =X2
    STR R0, [R1]

        END

I assumed that since EQUs are constants, they go in ROM. But here, one is in the CODE section, which is READONLY (so I'm assuming that goes in ROM), and the other is in a section that has no AREA directive. I'm not sure what the default is there.

X2 was declared with DCD in a READONLY section, yet I'm still allowed to write to it.

If I add a DCD to the empty section I get an error: Area directive missing. If I add the AREA directive then the code looks like this:

        AREA    |.data|, DATA

X1      DCD     0x23
Y1      EQU     0x23


        AREA    |.text|, CODE, READONLY, ALIGN=2
        THUMB
        EXPORT  Start

X2      DCD     0x23
Y2      EQU     0x23

Start
    MOV R0, #0
    LDR R1, =X1
    STR R0, [R1]
    MOV R0, #0
    LDR R1, =X2
    STR R0, [R1]

        END

EQUs and DCDs are everywhere, and the AREA directives don't seem to affect how I can access them at all. Adding READONLY to the AREA DATA directive also has no effect.

Upvotes: 2

Views: 2591

Answers (1)

old_timer

Reputation: 71576

I'll use an assembler I have access to (GNU as). The answers should carry over between the two assembly languages, since a number of your questions are about the instruction set rather than the assembler syntax.

.equ X1,0x12345678

.text
.thumb

.globl _start
_start:


ldr r0,=X1
ldr r1,=X2
ldr r2,[r1]
ldr r3,=Y4
ldr r4,=Y3
str r3,[r4]
bl bounce
mov lr,pc
ldr r5,=bounce
bx r5
b .
X2: .word 0xAABBCCDD

.thumb_func
bounce:
    bx lr
    nop

.data

Y3: .word 0
Y4: .word 0x11223344

Assemble, link, and disassemble:

00001000 <_start>:
    1000:   4807        ldr r0, [pc, #28]   ; (1020 <bounce+0x4>)
    1002:   4908        ldr r1, [pc, #32]   ; (1024 <bounce+0x8>)
    1004:   680a        ldr r2, [r1, #0]
    1006:   4b08        ldr r3, [pc, #32]   ; (1028 <bounce+0xc>)
    1008:   4c08        ldr r4, [pc, #32]   ; (102c <bounce+0x10>)
    100a:   6023        str r3, [r4, #0]
    100c:   f000 f806   bl  101c <bounce>
    1010:   46fe        mov lr, pc
    1012:   4d07        ldr r5, [pc, #28]   ; (1030 <bounce+0x14>)
    1014:   4728        bx  r5
    1016:   e7fe        b.n 1016 <_start+0x16>

00001018 <X2>:
    1018:   aabbccdd    bge feef4394 <X1+0xecbaed1c>

0000101c <bounce>:
    101c:   4770        bx  lr
    101e:   46c0        nop         ; (mov r8, r8)
    1020:   12345678    eorsne  r5, r4, #120, 12    ; 0x7800000
    1024:   00001018    andeq   r1, r0, r8, lsl r0
    1028:   00002004    andeq   r2, r0, r4
    102c:   00002000    andeq   r2, r0, r0
    1030:   0000101d    andeq   r1, r0, sp, lsl r0

Disassembly of section .data:

00002000 <__data_start>:
    2000:   00000000    andeq   r0, r0, r0

00002004 <Y4>:
    2004:   11223344            ; <UNDEFINED> instruction: 0x11223344

Disassembly of section .ARM.attributes:

00000000 <.ARM.attributes>:
   0:   00001341    andeq   r1, r0, r1, asr #6
   4:   61656100    cmnvs   r5, r0, lsl #2
   8:   01006962    tsteq   r0, r2, ror #18
   c:   00000009    andeq   r0, r0, r9
  10:   01090206    tsteq   r9, r6, lsl #4

So it took the ldr r0,=X1 (with X1 being 0x12345678) and turned it into this:

 1000:  4807        ldr r0, [pc, #28]   ; (1020 <bounce+0x4>)

and this

    1020:   12345678    eorsne  r5, r4, #120, 12    ; 0x7800000

The load is 32 bits because it is an ldr (ldrb would load 8 bits and ldrh 16 bits, both zero-extended to 32). It takes the pc, which reads two instructions (4 bytes) ahead, rounds it down to a word boundary, and adds 28: 0x1000 + 4 + 28 = 0x1000 + 32 = 0x1000 + 0x20. At that address, 0x1020, the assembler placed the data 0x12345678. The same goes for all the other =something loads.
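The address arithmetic can be checked mechanically. This is a minimal Python sketch, not part of any toolchain; the helper name thumb_ldr_literal_target is made up for illustration. It decodes the 16-bit Thumb ldr-literal encodings copied from the listing above and recomputes where each one points.

```python
def thumb_ldr_literal_target(insn_addr, encoding):
    """Decode a 16-bit Thumb `ldr Rt, [pc, #imm]`; return (Rt, offset, target)."""
    # 16-bit LDR (literal): bits 15..11 = 0b01001, bits 10..8 = Rt, bits 7..0 = imm8
    if (encoding >> 11) != 0b01001:
        raise ValueError("not a 16-bit Thumb LDR (literal) encoding")
    rt = (encoding >> 8) & 0x7
    offset = (encoding & 0xFF) * 4   # imm8 is scaled by 4
    pc = insn_addr + 4               # in Thumb state the pc reads as address + 4
    base = pc & ~0x3                 # the base is the pc rounded down to a word boundary
    return rt, offset, base + offset


# (address, encoding) pairs copied from the disassembly above
loads = [
    (0x1000, 0x4807),
    (0x1002, 0x4908),
    (0x1006, 0x4B08),
    (0x1008, 0x4C08),
    (0x1012, 0x4D07),
]

for addr, enc in loads:
    rt, offset, target = thumb_ldr_literal_target(addr, enc)
    print(f"{addr:#06x}: ldr r{rt}, [pc, #{offset}] -> literal at {target:#06x}")
```

The rounding step is why the loads at 0x1000 and 0x1002 use different offsets (28 and 32) yet land on consecutive words, 0x1020 and 0x1024.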

I could have done that myself, though, and not relied on a pseudo-instruction.

.text
.thumb

.globl _start
_start:

    ldr r0,xyz
    ldr r1,xyz_add
    ldr r2,[r1]
    b .

xyz: .word 0x12345678
xyz_add: .word xyz

Unlinked is good enough:

00000000 <_start>:
   0:   4801        ldr r0, [pc, #4]    ; (8 <xyz>)
   2:   4902        ldr r1, [pc, #8]    ; (c <xyz_add>)
   4:   680a        ldr r2, [r1, #0]
   6:   e7fe        b.n 6 <_start+0x6>

00000008 <xyz>:
   8:   12345678    eorsne  r5, r4, #120, 12    ; 0x7800000

0000000c <xyz_add>:
   c:   00000008    andeq   r0, r0, r8

Because xyz is in the same section and nearby, I can load the 0x12345678 directly with a pc-relative load, which is basically what the =0x12345678 pseudo-instruction does for you. I don't need to get the address first and then load from that address. For far-away things you can still place a data item nearby that holds the address, load that, and then load through it (double indirection).
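To make the difference between the direct load and the double indirection concrete, here is a small Python model of the unlinked object above. It is only a sketch: the memory dictionary and load_word helper are illustrative stand-ins for the two data words at 0x8 and 0xc, not anything the tools produce.

```python
# Address -> 32-bit word, taken from the unlinked disassembly above.
memory = {
    0x08: 0x12345678,  # xyz
    0x0C: 0x00000008,  # xyz_add holds the address of xyz
}


def load_word(address):
    """Model a 32-bit load from the given address."""
    return memory[address]


r0 = load_word(0x08)  # ldr r0, xyz      one pc-relative load gets the value
r1 = load_word(0x0C)  # ldr r1, xyz_add  first load fetches an address
r2 = load_word(r1)    # ldr r2, [r1]     second load goes through that address

print(hex(r0), hex(r1), hex(r2))
```

Both r0 and r2 end up holding the same value; the second path just costs one extra word of data and one extra load, which is the price of reaching something that is not within pc-relative range.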

.text
.thumb

.globl _start
_start:

    ldr r0,=0x11223344
    ldr r1,=5

With at least one assembler you can use the =something trick for everything, and the assembler will hopefully optimize it into a move immediate if the value fits:

00000000 <_start>:
   0:   4800        ldr r0, [pc, #0]    ; (4 <_start+0x4>)
   2:   2105        movs    r1, #5
   4:   11223344

It would have been nice if it went the other way too: if you wrote a mov immediate with a value that doesn't fit, it would fall back to the pc-relative load. But I don't think they do that.
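The choice the assembler made in that listing can be sketched as a simple rule. This Python function is a simplified assumption about the decision, covering only the 16-bit Thumb movs with an 8-bit immediate; a real assembler knows more encodings, and the function name materialize_constant is made up here.

```python
def materialize_constant(reg, value):
    """Return (instruction text, needs_literal_word) for `ldr reg, =value`.

    Simplified model: only the 16-bit Thumb `movs Rd, #imm8` form is considered.
    """
    value &= 0xFFFFFFFF
    if value <= 0xFF:
        # Fits the 8-bit immediate, so no data word is needed.
        return f"movs {reg}, #{value}", False
    # Otherwise the value goes into a nearby literal word and is loaded pc-relative.
    return f"ldr {reg}, [pc, #<offset to literal>]", True


for reg, value in [("r0", 0x11223344), ("r1", 5)]:
    text, needs_literal = materialize_constant(reg, value)
    print(f"ldr {reg}, ={value:#x}  ->  {text}  (literal word: {needs_literal})")
```

This matches the two lines in the disassembly: the large value gets a literal word after the code, the small one becomes a plain movs.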

Now translate that to your assembly language. The AREA directive declares sections such as .text and .data; the linker later decides where those are placed in memory. Some linkers can modify the code by more than just an immediate offset, and some can replace a whole instruction at times (to trampoline off some linker-inserted code as needed). In this case the linker is going to fill in the addresses of things in the data locations that the assembler allocated in the sections.

You can have data items in .text as well as in .data. The data items in .text are read-only things, be it constants like tables or the addresses of things in other linked-in code or sections. Those addresses are things the linker has to fill in, as they are not resolved at assemble time.
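What the linker fills in can be modelled in a few lines. This Python sketch is an illustration only: the link function is hypothetical, the section offsets are read off the first disassembly, and the base addresses 0x1000 and 0x2000 are simply the ones that link happened to use.

```python
def link(sections, bases):
    """Turn section-relative symbol offsets into absolute addresses."""
    symbols = {}
    for section, offsets in sections.items():
        for name, offset in offsets.items():
            symbols[name] = bases[section] + offset
    return symbols


# Offsets within each section, as the assembler would know them.
sections = {
    ".text": {"_start": 0x00, "X2": 0x18, "bounce": 0x1C},
    ".data": {"Y3": 0x00, "Y4": 0x04},
}
# Where the linker decided to place each section (taken from the listing).
bases = {".text": 0x1000, ".data": 0x2000}

symbols = link(sections, bases)

# The address words stored after the code in .text, in listing order.
# A Thumb function address has bit 0 set, hence 0x101d for bounce.
literal_pool = [symbols["X2"], symbols["Y4"], symbols["Y3"], symbols["bounce"] | 1]
print([hex(word) for word in literal_pool])
```

The code itself never changes when .data moves; only these address words in .text do, which is why the same source can be placed at whatever ROM and RAM addresses the part actually has.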

EQU is historically the assembly language version of a simple define in C:

#define ABCD 0x12345678

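Before compiling, a pass is done to search for instances of ABCD and replace them with 0x12345678. The same goes for the assembler, so an EQU takes up no storage of its own, neither in RAM nor in ROM; the value only ends up in memory where an instruction or data directive uses it. Unlike C, you might not be able to do more than a simple search and replace, since assembler macros use a different syntax, but it is define-like.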

DCD, DCB, etc. are like .word and .byte in the GNU assembler. They say: I want to put some raw data here, or allocate space for raw data here; not instructions, but data for whatever reason I want to use it.
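The difference between the two directives shows up clearly in a toy model. This Python sketch is not how armasm is implemented; the assemble function is hypothetical, it handles only three-field EQU and DCD lines, the X3 line is an invented example, and little-endian byte order is assumed.

```python
import struct


def assemble(lines):
    """Toy pass over `NAME EQU value` and `NAME DCD value` lines."""
    constants = {}       # EQU: names known only to the assembler
    labels = {}          # DCD: name -> offset of the data in the section
    image = bytearray()  # bytes that actually end up in the section
    for line in lines:
        name, directive, operand = line.split()
        if directive == "EQU":
            # Only the symbol table changes; nothing is emitted.
            constants[name] = int(operand, 0)
        elif directive == "DCD":
            labels[name] = len(image)
            value = constants[operand] if operand in constants else int(operand, 0)
            image += struct.pack("<I", value)  # one 32-bit word, little-endian
        else:
            raise ValueError(f"unsupported directive: {directive}")
    return constants, labels, bytes(image)


constants, labels, image = assemble([
    "Y1 EQU 0x23",
    "X1 DCD 0x23",
    "X3 DCD Y1",  # hypothetical line: the EQU name is replaced by its value
])
print(constants, labels, image.hex())
```

Three source lines produce eight bytes: the EQU contributes none, and each DCD contributes four in whichever section it was written in.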

One would hope that if the assembler has a READONLY attribute, it honors it; if it doesn't, that would bother me. But at the same time, the well-known section names .text and .data might trump it.

Upvotes: 1
