Somjit

Reputation: 2772

C - Using extern to access global variable. Case study

I thought extern was for sharing variables between compilation units. Why does the code below work, and how exactly does it work? Is this good practice?

#include <stdio.h>
int x = 50;

int main(void){ 
    int x = 10; 
    printf("Value of local x is %d\n", x);

    {
        extern int x;
        printf("Value of global x is %d\n", x);
    }

    return 0; 
}

Prints out :

Value of local x is 10
Value of global x is 50 

Upvotes: 1

Views: 352

Answers (2)

Jonathan Leffler

Reputation: 753665

You might (or might not) be interested to know that GCC (4.9.1) and clang (Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)) have divergent views on the acceptability of the following code, which is a minor adaptation of the code in the question:

#include <stdio.h>
static int x = 50;  // static instead of no storage class specifier

int main(void)
{
    int x = 10;
    printf("Value of local x is %d\n", x);

    {
        extern int x;
        printf("Value of global x is %d\n", x);
    }

    return 0;
}

I called the source file ext.c.

$ clang -O3 -g -std=c11 -Wall -Wextra -Werror ext.c -o ext
$ gcc   -O3 -g -std=c11 -Wall -Wextra -Werror ext.c -o ext
ext.c: In function ‘main’:
ext.c:9:20: error: variable previously declared ‘static’ redeclared ‘extern’
         extern int x;
                    ^
ext.c: At top level:
ext.c:2:12: error: ‘x’ defined but not used [-Werror=unused-variable]
 static int x = 50;
            ^
cc1: all warnings being treated as errors
$

The problem is to determine which compiler is correct because they can't both be right unless the program is exhibiting undefined behaviour — which, if you bother to read to the end, will turn out to be the case.

The relevant section of the C11 standard is:

6.2.2 Linkages of identifiers

¶1 An identifier declared in different scopes or in the same scope more than once can be made to refer to the same object or function by a process called linkage.29) There are three kinds of linkage: external, internal, and none.

¶2 In the set of translation units and libraries that constitutes an entire program, each declaration of a particular identifier with external linkage denotes the same object or function. Within one translation unit, each declaration of an identifier with internal linkage denotes the same object or function. Each declaration of an identifier with no linkage denotes a unique entity.

¶3 If the declaration of a file scope identifier for an object or a function contains the storage class specifier static, the identifier has internal linkage.30)

This means that the first or outermost declaration (definition) of x in the code above has internal linkage.

¶4 For an identifier declared with the storage-class specifier extern in a scope in which a prior declaration of that identifier is visible,31) if the prior declaration specifies internal or external linkage, the linkage of the identifier at the later declaration is the same as the linkage specified at the prior declaration. If no prior declaration is visible, or if the prior declaration specifies no linkage, then the identifier has external linkage.

This paragraph needs detailed deconstruction below.

¶5 If the declaration of an identifier for a function has no storage-class specifier, its linkage is determined exactly as if it were declared with the storage-class specifier extern. If the declaration of an identifier for an object has file scope and no storage-class specifier, its linkage is external.

In the original code in the question, the second sentence says that the first declaration (definition) of x has external linkage.

¶6 The following identifiers have no linkage: an identifier declared to be anything other than an object or a function; an identifier declared to be a function parameter; a block scope identifier for an object declared without the storage-class specifier extern.

The x declared (defined) at the start of the function is 'a block scope identifier …' and therefore has no linkage.

¶7 If, within a translation unit, the same identifier appears with both internal and external linkage, the behavior is undefined.

29) There is no linkage between different identifiers.
30) A function declaration can contain the storage-class specifier static only if it is at file scope; see 6.7.1.
31) As specified in 6.2.1, the later declaration might hide the prior declaration.


Dissecting paragraph 4

Paragraph 4 is the key one here. Restating it and annotating it:

¶4 For an identifier declared with the storage-class specifier extern in a scope in which a prior declaration of that identifier is visible,31)

The third or innermost declaration of x is declared in a scope in which a prior declaration of that identifier is visible: the int x = 10; declaration is visible (the static int x = 50; declaration is not, having been hidden by the local declaration). The footnote refers to §6.2.1 Scopes of identifiers, but I don't think it says anything surprising (I can quote the relevant paragraphs, ¶2 and ¶4, if you think that's necessary).

if the prior declaration specifies internal or external linkage, the linkage of the identifier at the later declaration is the same as the linkage specified at the prior declaration.

This does not apply; the prior declaration specifies neither internal nor external linkage.

If no prior declaration is visible, or if the prior declaration specifies no linkage,

There is a prior declaration that's visible, and that declaration specifies no linkage.

then the identifier has external linkage.

So, the innermost x has external linkage, the outermost x has internal linkage, and as a consequence, paragraph 7 says the resulting behaviour is undefined. That means that both compilers are correct; if the behaviour is undefined, any behaviour is correct — and different compilers are allowed to have divergent views on what is correct, and GCC and clang exhibit divergent views. On the whole, GCC's "it is a problem that should be reported" view is safer for the programmer.

In the original code, the outermost x has external linkage, the innermost x also has external linkage, and as a consequence paragraph 7 does not apply, and the innermost declaration of x refers to the outermost declaration (and definition) of x.

Apart from showing that interpreting the standard is hard work, this whole answer (diatribe) also shows that using multiple compilers (if possible on different platforms) is a good idea. It gives you the maximum chance of finding problems. Depending on a single compiler leaves you vulnerable to missing problems that another compiler might spot.

Upvotes: 4

Kelm

Reputation: 968

When you use the extern keyword, the linker finds a symbol with a matching name in object files / libraries / archives. Symbols are, simply speaking, functions and global variables (local variables are just some space on the stack), thus the linker can do its magic here.

As for whether it is good practice: global variables in general are not considered good practice, since they encourage spaghetti code and 'pollute' the symbol pool.

Upvotes: 4
