S.S. Anne

Reputation: 15576

How are the symbols for static variables named in GCC?

I'm experimenting with static globals in C. I tried this code and ran nm on it:

#include <stdio.h>

static int global_static = 12345;

int main(void)
{
    static int local_static = 12345;
    printf("%d\n", global_static);
    printf("%d\n", local_static);
    return 0;
}

Here's a snippet of the nm output:

00004020 d global_static
00004024 d local_static.1905
000011a9 T main

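I have two questions about this:

  1. Where does the name for the local static variable come from? Is it a process ID or a random number?

  2. Does the fact that global_static has no invalid characters in it imply that I could do extern static int global_static; in another file and read global_static?

When I say invalid characters, I mean characters that can't be part of variable names in C, such as ., $, % and #.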

Upvotes: 3

Views: 1985

Answers (2)

aschepler

Reputation: 72311

Where does the name for the local static variable come from? Is it a process ID or a random number?

In gcc's langhooks.c, the default set_decl_assembler_name hook implementation (which is used directly for the C language) contains:

  /* By default, assume the name to use in assembly code is the same
     as that used in the source language.  (That's correct for C, and
     GCC used to set DECL_ASSEMBLER_NAME to the same value as
     DECL_NAME in build_decl, so this choice provides backwards
     compatibility with existing front-ends.  This assumption is wrapped
     in a target hook, to allow for target-specific modification of the
     identifier.
     Can't use just the variable's own name for a variable whose scope
     is less than the whole compilation.  Concatenate a distinguishing
     number - we use the DECL_UID.  */
  if (TREE_PUBLIC (decl) || DECL_FILE_SCOPE_P (decl))
    id = targetm.mangle_decl_assembler_name (decl, DECL_NAME (decl));
  else
    {
      const char *name = IDENTIFIER_POINTER (DECL_NAME (decl));
      char *label;
      ASM_FORMAT_PRIVATE_NAME (label, name, DECL_UID (decl));
      id = get_identifier (label);
    }

And the comment on the macro DECL_UID says:

/* Every ..._DECL node gets a unique number.  */

So the number is an identifier invented by gcc that is guaranteed to be different for every declaration seen in the translation unit (including declarations in #include-d files). This is enough to ensure that if different scopes use local static variables with the same name, they get different mangled symbol names in the assembly and object code.
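To see why the suffix is needed, here is a minimal sketch with two functions that each declare a static local called calls. The function names count_a and count_b are made up for illustration. Both variables live for the whole program but are separate objects, so the compiler has to give them separate symbol names.

```c
/* Two functions, each with its own static local named `calls`.
   They are distinct objects, so the compiler must emit a distinct
   symbol for each one. With GCC the names look like calls.<number>;
   the exact number is an implementation detail and differs between
   GCC versions. */
int count_a(void)
{
    static int calls = 0;    /* private to count_a */
    return ++calls;
}

int count_b(void)
{
    static int calls = 100;  /* private to count_b, unrelated to the one above */
    return ++calls;
}
```

Calling count_a() twice and then count_b() once returns 1, 2 and 101, which shows the two counters advance independently. Compiling this with gcc -c and running nm on the object file should list two local (lowercase) entries for calls with different suffixes.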

Does the fact that global_static has no invalid characters in it imply that I could do extern static int global_static; in another file and read global_static?

No. For one thing, it's illegal to combine extern and static, since these give conflicting linkages to the variable. Note that static has two entirely different meanings in C: Inside a function, it means the variable has static storage duration. Outside a function, it means the variable or function has internal linkage. (A variable which is not local to a function always has static storage duration.)

So from the C language point of view, the static on global_static means that the variable has internal linkage, which means it is never to be considered the same variable as anything in any other translation unit, so there is no way to directly access it from another *.c file. When translating to ELF objects or other common object formats, this is done by making the symbol for the variable a "local" symbol instead of a "global" symbol. When linking executables or loading dynamic libraries, a global symbol can satisfy an undefined symbol from another object, but a local symbol never does.

Note the nm tool prints uppercase symbol type letters for global symbols and lowercase symbol type letters for local symbols, so the d next to the variables in your output means that both are local symbols and cannot possibly be directly used by other objects.
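"Directly" is the key word here. If another file really needs the value, the usual approach is to keep the variable static and export a function instead. A minimal sketch, where the accessor names get_global_static and global_static_address are hypothetical and not part of the question's code:

```c
/* Internal linkage: the symbol for this variable is local to the
   object file, so no other translation unit can name it. */
static int global_static = 12345;

/* External linkage: these functions get global symbols, so other
   translation units can declare and call them. (Hypothetical names.) */
int get_global_static(void)
{
    return global_static;
}

int *global_static_address(void)
{
    /* Handing out a pointer is allowed: linkage restricts the name,
       not the object itself. */
    return &global_static;
}
```

Another file would declare int get_global_static(void); and call it. In the nm output for this file, the two functions should appear with an uppercase T while global_static keeps its lowercase d.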

Upvotes: 3

R.. GitHub STOP HELPING ICE

Reputation: 215259

  1. It's a unique identifier within the translation unit (source/object file), so that same-named static variables in different local scopes do not refer to the same object.

  2. No. Symbols not tagged as global in an assembly/object file cannot be used to resolve references from other files at link time; they're ignored. (The lowercase d from nm indicates that it's a local symbol, not a global one.) Within the same assembly/object file, the C source-level rule that you can't have both external and static objects with the same identifier at file scope rules it out.

Upvotes: 2
