ARF

Reputation: 7694

C block scoping

I am trying to understand the implications of block scoping in C.

I realise that identifiers defined within a scope are invisible outside the scope, but what are the implications of block scoping at the instruction level? Does entry into or exit from a block scope imply any instructions, or is it entirely transparent at the instruction level? Are variables defined inside a scope destroyed, as they are within a loop construct?

At an instruction level, after optimizing, is the following:

initialise:
    int a = 0;
block_entry:
    a += 1;
    /* on first pass (initialisation): a == 1 */
    /* on second pass (entry by goto): a==2 ? */
    if (a==2): goto done

goto block_entry
done:

any different from:

{
initialise:
    int a = 0;
block_entry:
    a += 1;
    /* on first pass (initialisation): a == 1 */
    /* on second pass (entry by goto): a==2 ? */
    if (a==2): goto done
}

goto block_entry
done:

or from:

while(1){
initialise:
    int a = 0;
block_entry:
    a += 1;
    /* on first pass (initialisation): a == 1 */
    /* on second pass (entry by goto): a == 2 ? */
    if (a==2): goto done
    goto main_code
}

main_code:
goto block_entry
done:

The question is largely academic and inspired by Eli Bendersky's post "Computed goto for efficient dispatch tables" where he seems to use a while(1) {...} loop purely for visual structuring. (In the interp_cgoto(...) function specifically.)

Would his code compile any differently if he were to use a block scope for visual structuring, or no scoping at all? (I.e. removing the while(1) {...} loop.)

Upvotes: 1

Views: 1068

Answers (3)

rici

Reputation: 241771

The behaviour of snippets two and three is undefined because the lifetime of variable a ends when the block in which it is declared is exited (even if the exit is by means of a goto). When the block is re-entered, a new a is created, with an initially indeterminate value. Since the declaration statement is skipped by the goto, the value of a continues to be indeterminate. Subsequently attempting to use that value (a += 1;) results in undefined behaviour.

Here's an example which actually seems to demonstrate the undefined behaviour in practice:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
    {
initialise:;
        int a[10] = {0};
block_entry:
        a[0] += 1;
        printf("a is %d\n", a[0]);
        /* on first pass (initialisation): a == 1 */
        /* on second pass (entry by goto): a==2 ? */
        if (a[0]>=2) goto done;
    }
    {
        int x[10];
        x[0] = argc > 1 ? atoi(argv[1]) : 42;
        printf("x is %d\n", x[0]);
    }

    goto block_entry;
done:
    puts("Done");
    return 0;
}

(Live on coliru)

I fixed a couple of typos (where the pseudocode was a mix of C and Python), and added another block where the stack might be reused. I also changed the termination condition to >=, for reasons which might be evident.

With this precise version of gcc, etc., this results in a[0] and x[0] sharing storage, so the second time through the loop a[0] is 43 instead of 2.

If you change the size of the arrays to something smaller, then gcc doesn't put them at the same stack location, and you get the behaviour of the original snippet, where a is 2 on the second pass.

On the other hand, if you use -O3 instead of -O0, then gcc compiles an endless loop where a is always 1.

All of these results are acceptable, because undefined behaviour puts no constraints on the compiler.

In short, Don't Do That (sm).
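
For contrast, here is a minimal sketch of a well-defined variant, assuming the intent is for a to keep its value across passes. It declares a in the enclosing function scope, as in the question's first snippet, so its lifetime does not end when the inner block is exited. The function name count_to_two is hypothetical and used only for illustration.

```c
/* Hypothetical helper: returns the value of a when the loop terminates. */
int count_to_two(void) {
    int a = 0;               /* declared outside the block: lifetime spans the whole call */
block_entry:
    {
        a += 1;              /* first pass: a == 1; second pass: a == 2 */
        if (a >= 2) goto done;
    }
    goto block_entry;        /* re-enters the block; a is still alive */
done:
    return a;
}
```

Called from main, count_to_two() should return 2, because a is never re-created between passes.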

Upvotes: 1

M.M

Reputation: 141618

In your second and third snippets, goto block_entry; leads to undefined behaviour, whereas the first snippet is OK (a == 2 on the second pass).

If you goto from outside a block to a point inside the block that comes after a variable declaration with an initializer, then that variable exists but the initializer was not applied; the variable behaves like an uninitialized variable.
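
As a rough sketch of that rule, the function below jumps into a block past an initialised declaration and then assigns to the variable before reading it, so no indeterminate value is ever used. The name enter_past_initializer is hypothetical, and some compilers may warn about the skipped initialisation.

```c
/* Hypothetical helper: jumps into a block past an initialised declaration. */
int enter_past_initializer(void) {
    int result;
    goto inside;             /* enters the block, skipping "int a = 5" */
    {
        int a = 5;           /* never executed: a exists but has no defined value yet */
inside:
        a = 10;              /* assign before reading, so nothing indeterminate is read */
        result = a;
    }
    return result;
}
```

Here the function should return 10. Replacing a = 10; with a += 1; would read the indeterminate value and reproduce the problem in the question's second and third snippets.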

Variables defined inside a block are conceptually destroyed when the block exits. This usually won't translate to any actual assembly instructions; it will just be reflected in where on the stack you would find the different variables within the function.

Upvotes: 1

mksteve

Reputation: 13073

The C language does not support constructors and destructors. So entering and leaving scopes does not cause "destructors" to be called.

Variables within different scopes can share the same memory or register, so the following code

{
    char buffer[2048];
    /*...*/
}
{
   char stuff[2048];
   /*....*/
}

may use 2 KB or 4 KB of stack, depending on decisions made by the compiler. Notionally, it could create the equivalent of:

union {
   char buffer[2048];
   char stuff[2048];
};

So creating scopes allows the stack and register requirements of a function to be shrunk if the compiler deems it necessary. I can't see such an advantage in your code.
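
One way to observe what a particular compiler chose is to record where each buffer starts and compare afterwards. This is only a sketch: the function name sibling_buffers_share_address is hypothetical, it assumes uintptr_t is available, and the result depends on the compiler, target, and optimisation level.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the two sibling-block buffers
   started at the same address in this build, otherwise 0. */
int sibling_buffers_share_address(void) {
    uintptr_t first, second;
    {
        char buffer[2048];
        memset(buffer, 'a', sizeof buffer);   /* touch the storage */
        first = (uintptr_t)buffer;            /* keep only the integer value */
    }
    {
        char stuff[2048];
        memset(stuff, 'b', sizeof stuff);
        second = (uintptr_t)stuff;
    }
    /* Only the saved integers are compared; neither buffer is accessed here. */
    return first == second;
}
```

Either result is permitted, so the return value says something about one specific build and nothing about the language.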

Upvotes: 1
