ideasman42

Reputation: 48228

Is there a way to set a variable as uninitialized in GCC/Clang?

I would be interested to know if it's possible to explicitly taint a variable in C as being uninitialized.

Pseudo code...

{
    int *array;
    array = some_alloc();
    b = array[0];
    some_free(array);
    TAINT_MACRO(array);

    /* the compiler should raise an uninitialized warning here */
    b = array[0];
}

Here is one way to taint a variable, but GCC raises the warning where 'a' is assigned the uninitialized variable, rather than at the second use of 'a'.

{
    int a = 10;
    printf("first %d\n", a);
    do {
        int b;
        a = b;
    } while(0);
    printf("second %d\n", a);
}

The only solution I could come up with is to explicitly shadow the variable with an uninitialized one (the void casts are added so there are no unused-variable warnings).

#define TAINT_MACRO_BEGIN(array) (void)(array); { void **array; (void)array;
#define TAINT_MACRO_END(array) } (void)(array);
{
    int *array;
    array = some_alloc();
    b = array[0];
    some_free(array);
    TAINT_MACRO_BEGIN(array);

    /* the compiler should raise an uninitialized warning here */
    b = array[0];
    TAINT_MACRO_END(array);
}

This method adds too much overhead to include in existing code (it adds a lot of noise and is annoying to maintain), so I was wondering if there is some other way to tell the compiler a variable is uninitialized.

I know there are static checkers, and I do use them, but I'm looking for something that can give a warning at compile time without false positives, which I believe is possible in this case and would avoid a certain class of bugs.

Upvotes: 10

Views: 1806

Answers (3)

jxh

Reputation: 70492

Based on an answer to a different question, you can use setjmp and longjmp to make a changed local variable have an indeterminate value.

/* Requires <setjmp.h> and <string.h>. */
#define TAINT(x)                               \
        do {                                   \
            static jmp_buf jb;                 \
            if (setjmp(jb) == 0) {             \
                memset(&(x), '\0', sizeof(x)); \
                longjmp(jb, 1);                \
            }                                  \
        } while (0)

If x is a local variable, its value will be indeterminate in the lines of code after TAINT is applied to it. This is because of C11 §7.13.2.1 ¶3 (emphasis mine):

All accessible objects have values, and all other components of the abstract machine have state, as of the time the longjmp function was called, except that the values of objects of automatic storage duration that are local to the function containing the invocation of the corresponding setjmp macro that do not have volatile-qualified type and have been changed between the setjmp invocation and longjmp call are indeterminate.

Note that no diagnostic is required for using a variable that is so tainted. However, compiler writers are aggressively detecting undefined behavior to enhance optimization, and so I would be surprised if this remains undiagnosed forever.
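As a rough sketch of how the macro might be used in a complete translation unit: the function name taint_demo is made up for illustration, and the example deliberately re-assigns the variable before reading it again, since reading it while it is indeterminate would be exactly the bug being modelled.

```c
#include <setjmp.h>
#include <string.h>

/* Same idea as the TAINT macro above: change x between setjmp and
 * longjmp so that its value becomes indeterminate afterwards. */
#define TAINT(x)                               \
        do {                                   \
            static jmp_buf jb;                 \
            if (setjmp(jb) == 0) {             \
                memset(&(x), '\0', sizeof(x)); \
                longjmp(jb, 1);                \
            }                                  \
        } while (0)

/* Hypothetical demo function, not part of the original answer. */
int taint_demo(void)
{
    int a = 10;
    int first = a;   /* read while 'a' still holds a valid value */

    TAINT(a);        /* from here on, 'a' is indeterminate */

    a = 20;          /* give 'a' a defined value again before reading it */
    return first + a;
}
```

Whether a given GCC or Clang version actually warns about a read placed between TAINT(a) and the re-assignment is not guaranteed, as noted above.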

Upvotes: 2

AnthonyFoiani

Reputation: 514

I sent an answer on the GCC list, but since I use SO first myself...

In modern C and C++, I would expect programmers to use limited variable scope to control this kind of exposure.

For example, I think you want something like this (note that the attribute I'm using doesn't actually exist, I'm just trying to paraphrase your request).

int x = 1; // initialized 
int y;     // uninitialized 

x = y;     // use of uninitialized value 'y' 

y = 2;     // no longer uninitialized 
x = y;     // fine 

y = ((__attr__ uninitialized))0; // tell gcc it's uninitialized again 

x = y;    // warn here please. 

If so, I would use additional scopes in C99 (or later) or C++ (pretty sure it's had "declare at point of use" since at least ARM in 1993...):

int x = 1; // initialized 

{ 
    int y; // uninitialized 
    x = y; // warn here 
    y = 2; // ok, now it's initialized 
    x = y; // fine, no warning 
} 

{ 
    int y; // uninitialized again! 
    x = y; // warns here 
} 

The extra scopes are a bit off-putting, but I'm very used to them in C++ (from heavy use of RAII techniques.)

Since there is an answer for this in mainstream languages, I don't think it's worth adding to the compiler.

Looking at your example, you're concerned with an array. That should work just as well with the extra scopes, and there should be no extra runtime cost, since the entire stack frame is allocated on function entry (SFAIK, at least).
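Applied to the allocation example from the question, the scoping approach might look something like the sketch below. The function name first_element_scoped and the -1 fallback are assumptions made for illustration, and malloc/free stand in for some_alloc/some_free.

```c
#include <stdlib.h>

/* Hypothetical helper: stores 'value' in a freshly allocated
 * one-element array, reads it back, and frees the array, with the
 * pointer confined to an inner scope. */
int first_element_scoped(int value)
{
    int b = -1;   /* assumed fallback if the allocation fails */

    {
        int *array = malloc(sizeof *array);
        if (array != NULL) {
            array[0] = value;
            b = array[0];
            free(array);
        }
    }   /* 'array' goes out of scope here */

    /* b = array[0];   would not compile: 'array' is undeclared here */
    return b;
}
```

With this layout a use after free is a hard error (undeclared identifier) rather than a warning, as long as the free is the last statement in the inner block.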

Upvotes: 4

jxh

Reputation: 70492

I would go the other way around, and wrap taint macros around the allocation and free functions. This is what I have in mind:

#ifdef O_TAINT
volatile int taint_me;
#define TAINT(x, m) \
    if (taint_me) { goto taint_end_##x; } else {} x = m
#define free(x) free(x); taint_end_##x: (void)0
#else
#define TAINT(x, m) x = m
#endif

So, your example would look like this:

int *array;
int b;

TAINT(array, malloc(sizeof(int)));
b = array[0];
printf("%d\n", b);
free(array);

/* the compiler should raise an uninitialized warning here */
b = array[0];
printf("%d\n", b);

This isn't perfect. There can only be one call to free() per tainted variable, because the goto label is tied to the variable name. If the jump skips over other initializations, you may get other false positives. It doesn't work if the allocation occurs in one function and the memory is freed in a different function.

But it provides the behavior that you asked for in your example. When compiled normally, no warnings appear. If compiled with -DO_TAINT, a warning appears at the second assignment to b.


I did work out a fairly general solution, but it involves bracketing the whole function with begin/end macros, and it relies on GCC's typeof extension. The solution ends up looking like this:

void foo (int *array, char *buf)
{
    TAINT_BEGIN2(array, buf);
    int b;

    puts(buf);
    b = array[0];
    printf("%d\n", b);

    free(array);
    free(buf);

    /* the compiler should raise an uninitialized warning here */
    puts(buf);
    b = array[0];
    printf("%d\n", b);

    TAINT_END;
}

Here, TAINT_BEGIN2 is used to declare the two function parameters that will get the taint treatment. Unfortunately, the macros are kind of a mess, but easy to extend:

#ifdef O_TAINT
volatile int taint_me;
#define TAINT(x, m) \
    if (taint_me) { goto taint_end_##x; } else {} x = m
#define TAINT1(x) \
    if (taint_me) { goto taint_end_##x; } else {} x = x##_taint
#define TAINT_BEGIN(v1) \
    typeof(v1) v1##_taint = v1; do { \
    typeof(v1##_taint) v1; TAINT1(v1)
#define TAINT_BEGIN2(v1, ...) \
    typeof(v1) v1##_taint = v1; TAINT_BEGIN(__VA_ARGS__); \
    typeof(v1##_taint) v1; TAINT1(v1)
#define TAINT_BEGIN3(v1, ...) \
    typeof(v1) v1##_taint = v1; TAINT_BEGIN2(__VA_ARGS__); \
    typeof(v1##_taint) v1; TAINT1(v1)
#define TAINT_END } while(0)
#define free(x) free(x); taint_end_##x: (void)0
#else
#define TAINT_BEGIN(x) (void)0
#define TAINT_BEGIN2(...) (void)0
#define TAINT_BEGIN3(...) (void)0
#define TAINT_END (void)0
#define TAINT1(x) (void)0
#define TAINT(x, m) x = m
#endif

Upvotes: 1
