Aaron Beaudoin

Reputation: 1147

How do I leave memory uninitialized in GNU ARM assembly?

I'm using GCC on my Raspberry Pi to compile some assembly code for a course I'm taking. It is my understanding from information in the GNU Assembler Reference that I can reproduce the following C code in GNU ARM Assembly:

int num = 0;

By writing this:

        .data
num:    .word 0

Great! Now how would I write this?

int num;

It is my understanding that leaving a variable uninitialized like this means I should treat it as containing whatever garbage value was in the memory location before. Therefore, I shouldn't use it before I've given it a value in some way.

But suppose for some reason I intended to store a huge amount of data in memory and needed to reserve a massive amount of space for it. It seems to me like it would be a massive waste of resources to initialize the entire area of memory to some value if I'm about to fill it with some data anyways. Yet from what I can find there seems to be no way to make a label in GNU ARM Assembly without initializing it to some value. According to my assembly textbook the .word directive can have zero expressions after it, but if used this way "then the address counter is not advanced and no bytes are reserved." My first thought was to use the ".space" or ".skip" directives instead, but for these directives even the official documentation says that "if the comma and fill are omitted, fill is assumed to be zero."

Is there no way for me to reserve a chunk of memory without initializing it in GNU ARM Assembly?

Upvotes: 2

Views: 2665

Answers (2)

old_timer

Reputation: 71526

What happened when you tried it?

When I tried it:

int num = 0;
int mun;

Compiling this with gcc, I got:

    .cpu arm7tdmi
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 1
    .eabi_attribute 30, 2
    .eabi_attribute 34, 0
    .eabi_attribute 18, 4
    .file   "so.c"
    .text
    .comm   mun,4,4
    .global num
    .bss
    .align  2
    .type   num, %object
    .size   num, 4
num:
    .space  4
    .ident  "GCC: (GNU) 8.3.0"

.comm symbol , length

.comm declares a common symbol named symbol. When linking, a common symbol in one object file may be merged with a defined or common symbol of the same name in another object file. If ld does not see a definition for the symbol--just one or more common symbols--then it will allocate length bytes of uninitialized memory. length must be an absolute expression. If ld sees multiple common symbols with the same name, and they do not all have the same size, it will allocate space using the largest size.

When using ELF, the .comm directive takes an optional third argument. This is the desired alignment of the symbol, specified as a byte boundary (for example, an alignment of 16 means that the least significant 4 bits of the address should be zero). The alignment must be an absolute expression, and it must be a power of two. If ld allocates uninitialized memory for the common symbol, it will use the alignment when placing the symbol. If no alignment is specified, as will set the alignment to the largest power of two less than or equal to the size of the symbol, up to a maximum of 16.

The syntax for .comm differs slightly on the HPPA. The syntax is `symbol .comm, length'; symbol is optional.
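To make the default-alignment rule quoted above concrete, here is a minimal sketch in C that computes "the largest power of two less than or equal to the size of the symbol, up to a maximum of 16." The function name `default_comm_alignment` is hypothetical and not part of any toolchain; it only restates the documented rule. How a length of 0 is treated is an assumption, since the quoted text does not cover it.

```c
#include <stddef.h>

/* Hypothetical helper: restates the rule gas documents for
   `.comm symbol, length` on ELF when the alignment argument is omitted.
   Result: largest power of two <= length, capped at 16. */
size_t default_comm_alignment(size_t length)
{
    size_t align = 1;

    /* Assumption: a length of 0 or 1 yields alignment 1. */
    while (align * 2 <= length && align * 2 <= 16)
        align *= 2;

    return align;
}
```

Under this reading, the `.comm mun,4,4` line in the compiler output above states explicitly the same alignment that a 4-byte symbol would get by default.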

Assembly language is defined by the assembler, not the target. So the answer is specific to the assembler (the tool that reads and assembles assembly language programs), and there is no reason to assume that the answer for one assembler holds for another. The above is for the GNU assembler, gas.

You could have looked at the documentation you referenced or read other GNU documentation, but the easiest way to answer a "what happens when you do this in a compiled program" question is to just compile it and look at the compiler output.

But don't necessarily assume that it isn't initialized:

unsigned int num;
unsigned int fun ( void )
{
    return(num);
}

Just enough to link it:

Disassembly of section .text:

00001000 <fun>:
    1000:   e59f3004    ldr r3, [pc, #4]    ; 100c <fun+0xc>
    1004:   e5930000    ldr r0, [r3]
    1008:   e12fff1e    bx  lr
    100c:   00002000    andeq   r2, r0, r0

Disassembly of section .bss:

00002000 <__bss_start>:
    2000:   00000000

It ends up in .bss, initialized to zero.

If you really want uninitialized access to something, then just pick an address that you know isn't initialized (SRAM, for example) and access it:

ldr r0,=0x1234
ldr r0,[r0]

Upvotes: 2

Nate Eldredge

Reputation: 58052

Generally, data that you don't need to initialize should be placed in the .bss section.

    .bss
foobar:
    .skip 99999999

This will allocate 99999999 bytes in the .bss section, and label foobar will be its address. It won't make your object files or executable 99999999 bytes bigger; the executable header just indicates how many bytes of .bss are needed, and at load time, the system allocates an appropriate amount and initializes it to zero.

You can't skip the load-time zero initialization. The system needs to initialize it to something, because it might otherwise contain sensitive data from the kernel or some other process. But zeroing out memory is quite fast, and the kernel will use an efficient algorithm, so I wouldn't worry about the performance impact. It might even zero pages in its idle time, so that when your program loads, there is zeroed memory already available. Anyway, the time your program spends actually using the memory will swamp it.

This means that you can also safely use .bss for data that you do want to have initialized to zero (though not to any nonzero value; if you want int foo = 3; you'll have to put it in .data as in your original example).
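The same guarantee can be checked from C. The sketch below declares a large array with no initializer and checks that every byte reads as zero. The names `scratch` and `scratch_is_zeroed` are hypothetical, and the claim that the array lands in .bss is an assumption about a typical GCC/ELF toolchain. The zero value itself is guaranteed by C for objects with static storage duration.

```c
#include <stddef.h>

/* No initializer: static storage duration, so C guarantees it starts zeroed.
   Assumption: a typical GCC/ELF toolchain places this in .bss, so it adds
   about 1 MiB to the .bss size, not to the file on disk. */
static unsigned char scratch[1u << 20];

/* Returns 1 if every byte of scratch is zero, 0 otherwise. */
int scratch_is_zeroed(void)
{
    for (size_t i = 0; i < sizeof scratch; i++) {
        if (scratch[i] != 0)
            return 0;
    }
    return 1;
}
```

Calling `scratch_is_zeroed()` before anything writes to `scratch` should return 1 on a hosted system. Running `size` on the resulting binary is one way to see the buffer counted under bss.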

Upvotes: 5
