Reputation: 1147
I'm using GCC on my Raspberry Pi to compile some assembly code for a course I'm taking. It is my understanding from information in the GNU Assembler Reference that I can reproduce the following C code in GNU ARM Assembly:
int num = 0;
By writing this:
.data
num: .word 0
Great! Now how would I write this?
int num;
It is my understanding that leaving a variable uninitialized like this means I should treat it as containing whatever garbage value was in the memory location before. Therefore, I shouldn't use it before I've given it a value in some way.
But suppose for some reason I intended to store a huge amount of data in memory and needed to reserve a massive amount of space for it. It seems to me that it would be a massive waste of resources to initialize the entire area of memory to some value if I'm about to fill it with data anyway. Yet from what I can find, there seems to be no way to make a label in GCC ARM Assembly without initializing it to some value. According to my assembly textbook, the .word directive can have zero expressions after it, but if used this way "then the address counter is not advanced and no bytes are reserved." My first thought was to use the ".space" or ".skip" directives instead, but for these directives even the official documentation says that "if the comma and fill are omitted, fill is assumed to be zero."
Is there no way for me to reserve a chunk of memory without initializing it in GCC ARM Assembly?
Upvotes: 2
Views: 2665
Reputation: 71526
What happened when you tried it?
When I tried it:
int num = 0;
int mun;
With the GNU toolchain (GCC 8.3.0) I got:
.cpu arm7tdmi
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.eabi_attribute 26, 1
.eabi_attribute 30, 2
.eabi_attribute 34, 0
.eabi_attribute 18, 4
.file "so.c"
.text
.comm mun,4,4
.global num
.bss
.align 2
.type num, %object
.size num, 4
num:
.space 4
.ident "GCC: (GNU) 8.3.0"
The GNU as documentation describes .comm as follows:
.comm symbol , length
.comm declares a common symbol named symbol. When linking, a common symbol in one object file may be merged with a defined or common symbol of the same name in another object file. If ld does not see a definition for the symbol--just one or more common symbols--then it will allocate length bytes of uninitialized memory. length must be an absolute expression. If ld sees multiple common symbols with the same name, and they do not all have the same size, it will allocate space using the largest size.
When using ELF, the .comm directive takes an optional third argument. This is the desired alignment of the symbol, specified as a byte boundary (for example, an alignment of 16 means that the least significant 4 bits of the address should be zero). The alignment must be an absolute expression, and it must be a power of two. If ld allocates uninitialized memory for the common symbol, it will use the alignment when placing the symbol. If no alignment is specified, as will set the alignment to the largest power of two less than or equal to the size of the symbol, up to a maximum of 16.
The syntax for .comm differs slightly on the HPPA. The syntax is `symbol .comm, length'; symbol is optional.
Assembly language is defined by the assembler, not the target. So the answer is specific to the assembler (the tool that reads and assembles assembly language programs), and there is no reason to assume that the answer for one assembler is the same as for another. The above is for the GNU assembler, gas.
You could have looked at the documentation you referenced or read other GNU documentation, but the easiest way to answer a "what happens when you do this in a compiled program" question is to just compile it and look at the compiler output.
But don't necessarily assume that it isn't initialized:
unsigned int num;
unsigned int fun ( void )
{
return(num);
}
Just enough to link it:
Disassembly of section .text:
00001000 <fun>:
1000: e59f3004 ldr r3, [pc, #4] ; 100c <fun+0xc>
1004: e5930000 ldr r0, [r3]
1008: e12fff1e bx lr
100c: 00002000 andeq r2, r0, r0
Disassembly of section .bss:
00002000 <__bss_start>:
2000: 00000000
It ends up in .bss, initialized to zero.
If you really want uninitialized access to something, then just pick an address that you know isn't initialized (SRAM, for example) and access it:
ldr r0,=0x1234
ldr r0,[r0]
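For comparison, here is a rough C sketch of the same idea: read a word from an address you picked yourself. The helper name read_word is hypothetical, not something from the toolchain. The volatile qualifier is there so the compiler performs the load instead of optimizing it away.

```c
#include <stdint.h>

/* Hypothetical helper: load one 32-bit word from an arbitrary address.
   volatile forces the compiler to actually emit the load. */
uint32_t read_word(uintptr_t addr)
{
    return *(volatile uint32_t *)addr;
}
```

On a bare-metal target, read_word(0x1234) would correspond to the two ldr instructions above. Under an operating system that address is most likely not mapped into your process, so the read would fault; there you can only pass addresses your program actually owns.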
Upvotes: 2
Reputation: 58052
Generally, data that you don't need to initialize should be placed in the .bss section.
.bss
foobar:
.skip 99999999
This will allocate 99999999 bytes in the .bss section, and the label foobar will be its address. It won't make your object files or executable 99999999 bytes bigger; the executable header just indicates how many bytes of .bss are needed, and at load time, the system allocates an appropriate amount and initializes it to zero.
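If it helps to see the same thing from the C side, here is a minimal sketch. It assumes a typical ELF toolchain, where a static array with no initializer lands in .bss. The names big_buffer and big_buffer_is_zeroed and the 16 MiB size are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical size, chosen only for illustration (16 MiB). */
#define BIG_SIZE (16u * 1024u * 1024u)

/* No initializer: static storage, which a typical ELF toolchain
   places in .bss, so it takes no space in the object file. */
static unsigned char big_buffer[BIG_SIZE];

/* Samples one byte every 4096 bytes; returns 1 if all sampled
   bytes are zero, 0 otherwise. */
int big_buffer_is_zeroed(void)
{
    for (size_t i = 0; i < BIG_SIZE; i += 4096) {
        if (big_buffer[i] != 0)
            return 0;
    }
    return 1;
}
```

If you compile this with gcc -c and run size on the object file, the array should show up in the bss column rather than in data.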
You can't skip the load-time zero initialization. The system needs to initialize it to something, because it might otherwise contain sensitive data from the kernel or some other process. But zeroing out memory is quite fast, and the kernel will use an efficient algorithm, so I wouldn't worry about the performance impact. It might even zero pages in its idle time, so that when your program loads, there is zeroed memory already available. Anyway, the time your program spends actually using the memory will swamp it.
This means that you can also safely use .bss for data that you do want to have initialized to zero (though not to any nonzero value; if you want int foo = 3; you'll have to put it in .data as in your original example).
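A small C sketch of that split, assuming the usual GCC behaviour. The variable bar and the function sum_foo_bar are hypothetical names added for illustration.

```c
int foo = 3;   /* nonzero initializer: the value 3 is stored in .data */
int bar;       /* no initializer: .bss (or a common symbol), reads as 0 */

/* Returns foo + bar, which is 3 as long as neither has been modified. */
int sum_foo_bar(void)
{
    return foo + bar;
}
```

Compiling this with gcc -S should show a .word 3 for foo under .data, while bar gets either a .comm or a .space under .bss depending on the compiler version and flags.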
Upvotes: 5