mihnea

Reputation: 33

Don't fully understand the output of a very small C program using void*/char* to refer to an int + pointer arithmetic

I am having trouble fully understanding why the output of the following code snippet, on both onlinegdb.com and Visual Studio, is 2 when I expected a junk value or a crash.

I know what little endian and big endian are. My computer should be little endian, since it is a 32-bit Intel Core Duo.

I have 512 which in little endian is:

0000 0000 0010 0000 0000 0000 0000 0000
^         ^         ^
p         p+1       p+2

p will point to the first 0000 and p + 1 to the 0010. When I print, the 2 comes from the 0010, but with %d should it not print the decimal representation of 0010 0000 0000 0000 0000 0000 xyzw rstu (with 8 random bits)?

Q: Did it set the last bits to 0 by default, or was it just luck that they were 0 because those were the last values used? Should it not be undefined?

#include <stdio.h>

int main()
{
    int x = 512;
    void *p = &x;
    printf("%d\n", *((char*)p + 1));
    return 0;
}

Upvotes: 1

Views: 153

Answers (3)

Lundin

Reputation: 214730

I have 512 which in little endian is: 0000 0000 0010 0000 0000 0000 0000 0000

Correct, except the second byte is actually 0000 0010, so the first two bytes read 0000 0000 0000 0010. Endianness is byte order, not nibble order.

p + 1 to the 0010

Correct, it points at the byte 0000 0010.

When i print the 2 is from the 0010, but with %d should it not print 0010 0000 ...

No, because when you dereference the 8-bit char you end up with the value 2 in an 8-bit temporary object.

And as it turns out, variadic functions such as printf promote small integer arguments to int (default argument promotions), regardless of the format specifiers. So you end up with an int with value 2, and then printf prints that value.

Upvotes: 4

Marco Bonelli

Reputation: 69367

With %d should it not print 0010 0000 0000 0000 0000 0000 xyzw rstu (8 random bits)?

No, not in your case.

You are doing this:

*((char*)p + 1)

Which means the following:

  1. Take p and cast it to char *.
  2. Advance by 1 position (for a char * that means 1 byte).
  3. Dereference that (still as char *), obtaining 1 byte.
  4. The byte you obtain is 2.
  5. The byte is now implicitly converted to an int when passed to printf.

In other words, this is what happens:

0000 0000 0000 0010 0000 0000 0000 0000
          ^^^^^^^^^
          (char*)p+1
          vvvvvvvvv
          0000 0010 [0000 0000 0000 0000 0000 0000]
                     promotion to int when calling printf

So this is why you don't see "8 random bits".


On the other hand, if instead you were to do something like this:

// notice the second cast to (int*) AFTER advancing 1 byte
printf("%d\n", *(int*)((char*)p + 1));

Then this would have been undefined behavior: the pointer is misaligned for an int, and the read extends one byte past the end of x. You could have seen some random bits (which means a much bigger value), or you could have gotten zeroes by luck, but since it's UB you cannot really expect anything meaningful from it.

Upvotes: 4

Thomas Matthews

Reputation: 57739

You may want to make a copy of the byte before printing it:

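#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const int x = 512;
    /* View the storage of x as individual bytes. */
    const uint8_t *p = (const uint8_t *)&x;
    for (unsigned int i = 0u; i < sizeof(int); ++i)
    {
        /* Copy one byte out of x, then advance to the next byte. */
        const uint8_t value = *p++;
        printf("%d\n", value);
    }
    return 0;
}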

Copying each byte into a uint8_t variable makes it explicit that a single byte is read and printed, rather than a 4-byte integer.

Upvotes: 2
