user479870

Why is the compiler confused into a SIGSEGV by unsigned int?

I distilled a problem I ran into down to the example below. I can now see what is happening, but still not exactly why.

int main() {
    unsigned int a = 2;
    char c[2] = {};
    char* p = &c[1];
    return p[1 - a];
}

It is a bit clearer when the last line is rewritten.

    return *(p + (1 - a));      /* equivalent */
    return *(p + 1 - a);        /* works */
    return *(p + (1 - (int)a)); /* works */

I'm surprised that the compiler doesn't remove the parentheses internally, and even more so that it apparently tries to hold a temporary negative result in an unsigned int. Or is that not the reason for the segmentation fault here? The assembler output differs only slightly between the code with and without parentheses.

-   movl    $1, %eax
-   subl    -12(%rbp), %eax
-   movl    %eax, %edx
+   movl    -12(%rbp), %eax
+   movl    $1, %edx
+   subq    %rax, %rdx

Upvotes: 1

Views: 167

Answers (2)

sfstewman

Reputation: 5677

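This is all about C's usual arithmetic conversions. In the expression 1 - a, the 1 is converted to unsigned int, so the subtraction is done in unsigned arithmetic and wraps around instead of going negative. The compiler cannot remove the parentheses because you're mixing types, and regrouping would change the result. Consider your cases: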

return *(p + (1 - a));      /* equivalent */

This calculates 1 - a first, as an unsigned int. The result wraps around to the maximum value of unsigned int (UINT_MAX). That value is then added to the pointer, so you end up dereferencing something like p + 4294967295 if unsigned int is 32 bits wide and pointers are 64 bits wide. This is not likely to be a valid memory location.
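As a minimal sketch of that wraparound, the two helpers below (index_unsigned and index_widened are hypothetical names, not from the question) evaluate the index the same way p[1 - a] does, and then widen it the way it is widened before being added to a 64-bit pointer:

```c
#include <limits.h>

/* Hypothetical helper: evaluates 1 - a as p[1 - a] does.
   The int constant 1 is converted to unsigned int, so the
   subtraction is carried out modulo UINT_MAX + 1. */
unsigned int index_unsigned(unsigned int a)
{
    return 1 - a;
}

/* Hypothetical helper: the same index widened to a larger
   unsigned type. The wrapped value is zero-extended, so it
   stays a huge positive offset and never becomes -1. */
unsigned long long index_widened(unsigned int a)
{
    return (unsigned long long)(1 - a);
}
```

With a == 2 both helpers return UINT_MAX, which is why the pointer ends up far beyond the array instead of one element before p.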

return *(p + 1 - a);        /* works */

This calculates p + 1 and then subtracts a from it, so you dereference p - 1. Here p + 1 is the one-past-the-end pointer of c, which is valid to form, and p - 1 is &c[0], so this access is well defined.

return *(p + (1 - (int)a)); /* works */

This converts a to a signed int, and then calculates 1 - (int)a, which is -1. You then dereference p - 1.
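The two working variants can be checked with a small sketch. The helper names offset_left_to_right and offset_with_cast are made up for illustration; each returns how far the accessed element is from c[0], and each assumes the caller passes a value of a that keeps the pointer inside c or one past its end:

```c
#include <stddef.h>

/* Hypothetical helper: element offset reached by *(p + 1 - a).
   The expression groups as (p + 1) - a, so only pointer
   arithmetic happens and no unsigned subtraction is involved. */
ptrdiff_t offset_left_to_right(unsigned int a)
{
    char c[2] = {0};
    char *p = &c[1];
    char *q = p + 1 - a;
    return q - c;
}

/* Hypothetical helper: element offset reached by *(p + (1 - (int)a)).
   The cast makes the subtraction signed, so 1 - 2 is -1. */
ptrdiff_t offset_with_cast(unsigned int a)
{
    char c[2] = {0};
    char *p = &c[1];
    char *q = p + (1 - (int)a);
    return q - c;
}
```

For a == 2 both helpers return 0, meaning both expressions read c[0].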

Upvotes: 3

Paul R

Reputation: 212979

The problem is that in the expression 1 - a, the 1 gets converted to unsigned int, so you have 1U - 2U, which wraps around to UINT_MAX. The take-home message here is that you always have to be very careful when mixing signed and unsigned ints in the same expression.
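The same conversion affects comparisons, not only subtraction. As a hedged illustration (naive_less and safe_less are hypothetical helpers, not part of the question), a signed operand compared against an unsigned int is converted to unsigned first:

```c
/* Hypothetical helper: s is converted to unsigned int before the
   comparison, so a negative s becomes a very large value. */
int naive_less(int s, unsigned int u)
{
    return s < u;
}

/* Hypothetical helper: handles the negative case explicitly and
   only converts s once it is known to be non-negative. */
int safe_less(int s, unsigned int u)
{
    return s < 0 || (unsigned int)s < u;
}
```

naive_less(-1, 1u) yields 0 even though -1 is mathematically less than 1, while safe_less(-1, 1u) yields 1.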

Note that a good compiler may warn you about such usages, provided you have warnings enabled of course:

main.c: In function 'int main()':
main.c:5:19: warning: '*((void*)& c +4294967296)' is used uninitialized in this function [-Wuninitialized]
     return p[1 - a];

Upvotes: 5
