k1r1t0

Reputation: 769

static and volatile keywords from assembly point of view

I know that there are many questions like this, but this question is not about what static and volatile mean from the C standard's point of view. I'm interested in what happens a bit lower, at the assembly level.

The static keyword gives variables static storage duration, like global variables. To make that real, does the compiler place those variables in the .bss section or somewhere else? Also, static prevents the variable/function from being used outside the file. Is that enforced only during compilation, or are there runtime checks?

The volatile keyword makes a variable be read from memory on every access, so that if something else (like a peripheral device) modifies it, the program sees exactly the value in that memory. What exactly does "to be read from memory" mean here? Which memory location is used: .bss, .data, or something else?

Upvotes: 1

Views: 460

Answers (2)

old_timer

Reputation: 71516

You can also try it and see.

C code:

static unsigned int x;
unsigned int y;
unsigned int z = 1;
static volatile unsigned int j;
static volatile const unsigned int k = 11;

void fun ( void )
{
    x = 5;
    y = 7;
    z ++ ;
    j+=2;
}

Assembler:

mov ip, #7
ldr r3, .L3
ldr r0, .L3+4
ldr r2, [r3, #4]
ldr r1, [r0]
add r2, r2, #2
add r1, r1, #1
str r1, [r0]
str r2, [r3, #4]
str ip, [r3]
bx  lr


    .global z
    .global y
    .data
    .align  2
    .set    .LANCHOR1,. + 0
    .type   z, %object
    .size   z, 4
z:
    .word   1
    .type   k, %object
    .size   k, 4
k:
    .word   11
    .bss
    .align  2
    .set    .LANCHOR0,. + 0
    .type   y, %object
    .size   y, 4
y:
    .space  4
    .type   j, %object
    .size   j, 4
j:
    .space  4

x was not expected to survive in an example like this (it is written but never read, so it got optimized away). When it does survive, it will probably land in .bss since I did not give it an initial value.

y is .bss as expected

z is .data as expected

volatile prevents j from being optimized out despite it being dead code/variable.

k could have ended up in .rodata, but it looks like it is in .data here.

You guys are using fancy words, but static in C just means the name is limited in scope to that function or file. Global or local, initialized or not, const or not: those are what affect whether it is .data, .bss, or .rodata (it could even land in .text instead of .rodata if you play the alphabet game with the (rwx) stuff in the linker script; suggestion: never use those).

volatile is implied to mean some flavor of: do not optimize out this variable/operation, do it in this order, do not move it outside the loop, etc. You can find discussions about how it is not what you think it is, and we have seen on this site that llvm/clang and gnu/gcc have different opinions on what volatile actually means when it describes a pointer intended to access a control or status register in a peripheral. By some arguments that is what volatile was invented for, not for sharing variables between interrupts and foreground code.
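
To make the peripheral-register case concrete, here is a minimal sketch of a polling loop. Everything in it is an assumption for illustration: poll_ready, STATUS_READY and fake_status_reg are made-up names, and the register is simulated with an ordinary variable so the code runs on a host (on real hardware the address would come from the datasheet). The only point is that the volatile qualifier on the pointer asks the compiler to perform the load on every iteration instead of hoisting it out of the loop.

```c
#include <stdint.h>

#define STATUS_READY 0x1u   /* hypothetical "ready" bit */

/* Stand-in for a memory-mapped status register (assumption: not real hardware). */
uint32_t fake_status_reg;

/* Poll until the ready bit is set or max_polls reads have seen "not ready".
   Returns the number of polls that saw "not ready". */
unsigned int poll_ready(volatile const uint32_t *status, unsigned int max_polls)
{
    unsigned int polls = 0;

    /* *status is volatile-qualified, so each iteration performs a fresh load. */
    while (polls < max_polls && (*status & STATUS_READY) == 0u) {
        polls++;
    }
    return polls;
}
```

Whether a particular compiler really emits one load per iteration is something to confirm in the generated assembly, in the same try-it-and-see spirit as the rest of this answer.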

Like static, volatile does not imply which segment the variable is in. It can even be used as asm volatile (stuff); to tell the compiler: I do not want you to move this code around, I want it to happen right here, in this order (which is an aspect of using it on a variable, or so we believe).

static unsigned int x;
void fun ( void )
{
    x = 5;
}

Disassembly of section .text:

00000000 <fun>:
   0:   e12fff1e    bx  lr

No .rodata, .data, or .bss: x was optimized away.

but

static unsigned int x;
void fun ( void )
{
    x += 5;
}

Disassembly of section .text:

00000000 <fun>:
   0:   e59f200c    ldr r2, [pc, #12]   ; 14 <fun+0x14>
   4:   e5923000    ldr r3, [r2]
   8:   e2833005    add r3, r3, #5
   c:   e5823000    str r3, [r2]
  10:   e12fff1e    bx  lr
  14:   00000000    andeq   r0, r0, r0

Disassembly of section .bss:

00000000 <x>:
   0:   00000000    andeq   r0, r0, r0

How fun is that, ewww... let's not optimize out the dead code, let's put it in there. It is not global, nobody else can see it...

fun.c

static unsigned int x;
void fun ( void )
{
    x += 5;
}

so.c

static unsigned int x;
void more_fun ( void )
{
    x += 3;
}

linked

Disassembly of section .text:

00008000 <more_fun>:
    8000:   e59f200c    ldr r2, [pc, #12]   ; 8014 <more_fun+0x14>
    8004:   e5923000    ldr r3, [r2]
    8008:   e2833003    add r3, r3, #3
    800c:   e5823000    str r3, [r2]
    8010:   e12fff1e    bx  lr
    8014:   00018030    andeq   r8, r1, r0, lsr r0

00008018 <fun>:
    8018:   e59f200c    ldr r2, [pc, #12]   ; 802c <fun+0x14>
    801c:   e5923000    ldr r3, [r2]
    8020:   e2833005    add r3, r3, #5
    8024:   e5823000    str r3, [r2]
    8028:   e12fff1e    bx  lr
    802c:   00018034    andeq   r8, r1, r4, lsr r0

Disassembly of section .bss:

00018030 <x>:
   18030:   00000000    andeq   r0, r0, r0

00018034 <x>:
   18034:   00000000    andeq   r0, r0, r0

Each x is static, so as expected there are two of them... well, the expectation is that they are optimized out, but...

And they are in .bss as expected since I did not initialize them.

and on that note

static unsigned int x=3;
void fun ( void )
{
    x += 5;
}

Disassembly of section .text:

00000000 <fun>:
   0:   e59f200c    ldr r2, [pc, #12]   ; 14 <fun+0x14>
   4:   e5923000    ldr r3, [r2]
   8:   e2833005    add r3, r3, #5
   c:   e5823000    str r3, [r2]
  10:   e12fff1e    bx  lr
  14:   00000000    andeq   r0, r0, r0

Disassembly of section .data:

00000000 <x>:
   0:   00000003    andeq   r0, r0, r3



static const unsigned int x=3;
unsigned int fun ( void )
{
    return(x);
}

Disassembly of section .text:

00000000 <fun>:
   0:   e3a00003    mov r0, #3
   4:   e12fff1e    bx  lr

static const unsigned int x=3;
const unsigned int y=5;
unsigned int fun ( void )
{
    return(x+y);
}

Disassembly of section .text:

00000000 <fun>:
   0:   e3a00008    mov r0, #8
   4:   e12fff1e    bx  lr

Disassembly of section .rodata:

00000000 <y>:
   0:   00000005    andeq   r0, r0, r5

Okay I finally got a .rodata.

static const unsigned int x=3;
volatile const unsigned int y=5;
unsigned int fun ( void )
{
    return(x+y);
}

Disassembly of section .text:

00000000 <fun>:
   0:   e59f3008    ldr r3, [pc, #8]    ; 10 <fun+0x10>
   4:   e5930000    ldr r0, [r3]
   8:   e2800003    add r0, r0, #3
   c:   e12fff1e    bx  lr
  10:   00000000    andeq   r0, r0, r0

Disassembly of section .data:

00000000 <y>:
   0:   00000005    andeq   r0, r0, r5

There is only so much you can do with words and their (perceived) definitions; the topic as I understand it is C vs (generated) asm. At some point you should actually try it, and you can see how trivial it was: no need to write elaborate code, just gcc, objdump, and sometimes ld. Hmm, I just noticed y moved from .rodata to .data in that case... That is interesting.

And this "just try it" approach tests the compiler and other tool authors' interpretation: what does register mean, what does volatile mean, etc. (and you find that it is subject to different interpretations, like so much of the C language that is implementation defined). It is sometimes important to know what your favorite/specific compiler's interpretation of the language is, but be very mindful of actual implementation-defined things (bitfields, unions, how structs are constructed (packing them causes as many problems as it solves), and so on)...

Go to the spec and read the definition, then go to your compiler and see how they interpreted it, then go back to the spec and see if you can figure it out.

As far as static goes, it essentially means scope: the name stays within the function or file (well, the compile domain for a single compile operation). volatile implies please do this in this order, and please do not optimize out this item and/or its operations. In both cases it is what you used them with that determines where they land: .text, .data, .bss, .rodata, etc.

Upvotes: 2

fuz

Reputation: 92966

The static keyword has two meanings: (a) it conveys static storage duration and (b) it conveys internal linkage. These two meanings have to be strictly distinguished.

An object having static storage duration is allocated at the start of the program and lives until the end of the program. This is usually achieved by placing the object into the data segment (for initialised objects) or into the bss segment (for uninitialised objects). Details may vary depending on the toolchain in question.
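
A minimal sketch of an object that lives for the whole program, assuming an ordinary hosted C compiler; next_id and counter are illustrative names that do not come from the question. The function-local counter keeps its value between calls because it is allocated once for the entire run rather than on the stack, and with no initialiser it typically ends up in the bss segment.

```c
/* Returns 1 on the first call, 2 on the second, and so on. */
unsigned int next_id(void)
{
    /* Allocated once and zero-initialised before main() runs;
       the value persists until the program ends. */
    static unsigned int counter;

    counter++;
    return counter;
}
```

Here static only changes the lifetime of counter. Its name is still visible only inside next_id, which is a matter of scope and separate from the linkage meaning.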

An identifier having internal linkage means that every declaration of that identifier within the same translation unit that has some linkage (i.e. the linkage is not “none”) refers to the same object or function. This is usually realised by not making the symbol corresponding to the identifier a global symbol. The linker will then not treat references to a symbol of that name in different translation units as referring to the same symbol.
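
The difference between a local and a global symbol can be sketched as follows. The names hits, total, record and record_and_count are hypothetical. Compiling this to an object file and listing its symbols (for example with GNU nm, which prints a lowercase type letter for local symbols and an uppercase one for global symbols) should show hits and record as local, unless the compiler has inlined or removed them.

```c
static unsigned int hits;   /* internal linkage: local symbol, invisible to other files */
unsigned int total;         /* external linkage: global symbol, shared across files */

/* Internal linkage: another translation unit may define its own record(). */
static void record(unsigned int n)
{
    hits++;
    total += n;
}

/* External linkage: callable from other translation units. */
unsigned int record_and_count(unsigned int n)
{
    record(n);
    return hits;
}
```

There is no runtime check involved: a reference to hits from another file simply fails to resolve at link time, because the symbol was never exported.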

The volatile keyword indicates that all operations performed on the volatile-qualified object in the abstract machine must be performed in the code generated. The compiler is not permitted to perform any optimisations that would discard any such operations performed on the volatile-qualified object as it usually would for non-volatile-qualified objects.
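
As a hedged illustration of this rule, compare two hypothetical functions (sum_twice_plain and sum_twice_volatile are made-up names). For the plain pointer an optimising compiler is free to load the value once and double it; for the volatile-qualified pointer it has to emit two separate loads. Inspecting the generated assembly (for instance with gcc -O2 -S) is the way to check what a given compiler actually does.

```c
/* The compiler may merge the two reads into a single load. */
unsigned int sum_twice_plain(const unsigned int *p)
{
    return *p + *p;
}

/* Both reads are volatile accesses and must appear in the generated code. */
unsigned int sum_twice_volatile(const volatile unsigned int *p)
{
    unsigned int first = *p;
    unsigned int second = *p;
    return first + second;
}
```

Both functions return the same result when nothing else modifies the object; the difference shows up only in the number of memory accesses the compiler is allowed to drop.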

This keyword is purely an instruction to the compiler to suppress certain optimisations. It does not affect the storage duration of the objects so qualified. See also my previous answer on this topic.

Upvotes: 8
