George Maher

Reputation: 49

Problem with casting in C: (int *) and (char *)

Here is the code. I followed it easily up to the 4th printout, but I don't understand the 5th one.

Why is it "5: a[0] = 200, a[1] = 128144, a[2] = 256, a[3] = 302"?

I understand a[1], but I still can't work out how a[2] becomes 256.

#include <stdio.h>
#include <stdlib.h>

void f(void)
{
    int a[4];
    int *b = malloc(16);
    int *c;
    int i;

    printf("1: a = %p, b = %p, c = %p\n", a, b, c);

    c = a;
    for (i = 0; i < 4; i++)
        a[i] = 100 + i;
    c[0] = 200;
    printf("2: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
       a[0], a[1], a[2], a[3]);

    c[1] = 300;
    *(c + 2) = 301;
    3[c] = 302;
    printf("3: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
       a[0], a[1], a[2], a[3]);

    c = c + 1;
    *c = 400;
    printf("4: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
       a[0], a[1], a[2], a[3]);

    c = (int *) ((char *) c + 1);
    *c = 500;
    printf("5: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
       a[0], a[1], a[2], a[3]);

    b = (int *) a + 1;
    c = (int *) ((char *) a + 1);
    printf("6: a = %p, b = %p, c = %p\n", a, b, c);
    
    
}

int
main(int ac, char **av)
{
    f();
    return 0;
}

Upvotes: 2

Views: 244

Answers (2)

selbie

Reputation: 104474

This line:

c = (int *) ((char *) c + 1);

is technically undefined behavior (the resulting pointer is not correctly aligned for an int), but what happens can be reliably explained on an Intel processor.

Your 4th print statement shows this:

4: a[0] = 200, a[1] = 400, a[2] = 301, a[3] = 302

Assuming little-endian byte order (as on Intel chips) and sizeof(int) == 4 (most likely), the memory at the addresses referenced by a and c is laid out as follows. At this point c points to &a[1].

C8 00 00 00 90 01 00 00 2D 01 00 00 2E 01 00 00 XX XX XX
[   A[0]  ] [   A[1]  ] [   A[2]  ] [  A[3]   ]
            [   C[0]  ] [   C[1]  ] [  C[2]   ]

So when you say c = (int *) ((char *) c + 1);, you've shifted your view of the memory array above by 1 byte. Hence c is now pointing at the second byte within A[1], or as a flattened array:

C8 00 00 00 90 01 00 00 2D 01 00 00 2E 01 00 00 XX XX XX
[   A[0]  ] [   A[1]  ] [   A[2]  ] [  A[3]   ]
               [   C[0]  ] [   C[1]  ] [  C[2]   ]

Then this statement happens.

*c = 500;

That's going to overwrite the four bytes at C[0], which currently hold 01 00 00 2D, with 500, which in little-endian hex is F4 01 00 00. That changes the memory contents as follows:

               | CHANGED |
C8 00 00 00 90 F4 01 00 00 01 00 00 2E 01 00 00 XX
[   A[0]  ] [   A[1]  ] [   A[2]  ] [  A[3]   ]
               [   C[0]  ] [   C[1]  ] [  C[2]   ]

As a result, the 2D value that was occupying the first byte of A[2] is now 00.

So A[2] contains 00 01 00 00 which in little endian converts to 0x00000100, which is 256.
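The byte shuffling above can be replayed without touching a misaligned pointer. This is only a sketch: the helper names put_le32, get_le32 and simulate_step5 are made up for illustration, and the little-endian layout with a 4-byte int is written out explicitly rather than taken from the host machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store v at p as 4 bytes, least significant byte first (little-endian). */
void put_le32(unsigned char *p, uint32_t v)
{
    for (size_t i = 0; i < 4; i++)
        p[i] = (unsigned char)((v >> (8 * i)) & 0xFFu);
}

/* Read 4 little-endian bytes starting at p back into a 32-bit value. */
uint32_t get_le32(const unsigned char *p)
{
    uint32_t v = 0;
    for (size_t i = 0; i < 4; i++)
        v |= (uint32_t)p[i] << (8 * i);
    return v;
}

/*
 * Replay the write behind printout 5 on a 16-byte buffer that stands in
 * for a[4]. out[k] receives the value a[k] would hold afterwards.
 */
void simulate_step5(uint32_t out[4])
{
    /* Values of a[0..3] as shown by printout 4. */
    static const uint32_t step4[4] = { 200, 400, 301, 302 };
    unsigned char mem[16];

    assert(out != NULL);

    for (size_t k = 0; k < 4; k++)
        put_le32(mem + 4 * k, step4[k]);

    /* c is one byte past &a[1], i.e. byte offset 5; then *c = 500. */
    put_le32(mem + 5, 500);

    for (size_t k = 0; k < 4; k++)
        out[k] = get_le32(mem + 4 * k);
}
```

Calling simulate_step5 fills out with 200, 128144, 256 and 302, matching the 5th printout: a[1] reads back as 90 F4 01 00 (0x0001F490) and a[2] as 00 01 00 00 (0x00000100).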

There's some rule breaking going on in the above code, but pointer math like this isn't uncommon in the C world, and on Intel it usually behaves as described. Note that there are architectures, such as SPARC, where writing through an int pointer that is not aligned on a 4-byte boundary raises an exception (a SIGBUS signal that crashes the program). On Intel, the CPU handles the unaligned access for you (albeit more slowly).
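If you actually need to read or write an int at an arbitrary byte offset, the usual way to stay within the rules is to copy the bytes with memcpy instead of dereferencing a misaligned int *. A minimal sketch, where store_int_at and load_int_at are hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write value at base + offset without ever forming a misaligned int *. */
void store_int_at(unsigned char *base, size_t offset, int value)
{
    assert(base != NULL);
    memcpy(base + offset, &value, sizeof value);
}

/* Read an int back from base + offset, again by copying bytes. */
int load_int_at(const unsigned char *base, size_t offset)
{
    int value;

    assert(base != NULL);
    memcpy(&value, base + offset, sizeof value);
    return value;
}
```

The caller is responsible for making sure offset + sizeof(int) stays inside the buffer. The bytes that end up in memory still depend on the machine's byte order, but the copy itself is well defined on any alignment.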

Upvotes: 7

David C. Rankin

Reputation: 84521

Another way to look at it is in binary (little-endian). After step 4, your array in memory is:

a[0] : 11001000-00000000-00000000-00000000
a[1] : 10010000-00000001-00000000-00000000
a[2] : 00101101-00000001-00000000-00000000
a[3] : 00101110-00000001-00000000-00000000

At this point c points to a[1]. The expression:

c = (int *) ((char *) c + 1);

adds one byte (sizeof(char) == 1), so when you assign *c = 500; you are writing four bytes starting at the 2nd byte of a[1], and since array elements are contiguous in memory, the last byte of 500 overwrites the first byte of a[2].
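The difference between stepping an int * and stepping a char * can be checked directly. This is a small sketch; byte_distance is a hypothetical helper that measures how far apart two addresses are in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes between two addresses inside the same object. */
ptrdiff_t byte_distance(const void *from, const void *to)
{
    assert(from != NULL && to != NULL);
    return (const char *)to - (const char *)from;
}
```

With int *c = a + 1;, byte_distance(c, c + 1) is sizeof(int), while byte_distance(c, (char *)c + 1) is 1. Pointer arithmetic is scaled by the size of the pointed-to type, so the cast to char * is what turns the step into a single byte.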

A visual description of what is taking place would be:

a[1] : 10010000-00000001-00000000-00000000
                ^
   c = (int *) ((char *) c + 1);
   
a[1] : 10010000-00000001-00000000-00000000
                +   
500  :          11110100-00000001-00000000-00000000
                                           ^
                                   overwrites LSB of a[2]

This leaves the state of your array after the assignment as:

a[0] : 11001000-00000000-00000000-00000000
a[1] : 10010000-11110100-00000001-00000000
a[2] : 00000000-00000001-00000000-00000000
a[3] : 00101110-00000001-00000000-00000000

Where the values of a[1] and a[2] are:

a[1] : 11111010010010000  (128,144)
a[2] : 100000000              (256)
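If you want to regenerate binary rows like the ones above for other values, a helper along these lines would do it. It is a sketch only: byte_to_bits and format_row are hypothetical names, and the four bytes are passed in the order they sit in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write the 8 bits of b into out as '0'/'1' characters, most significant bit first. */
void byte_to_bits(unsigned char b, char out[9])
{
    assert(out != NULL);
    for (int bit = 7; bit >= 0; bit--)
        out[7 - bit] = ((b >> bit) & 1u) ? '1' : '0';
    out[8] = '\0';
}

/* Format 4 bytes, in memory order, as "bbbbbbbb-bbbbbbbb-bbbbbbbb-bbbbbbbb". */
void format_row(const unsigned char bytes[4], char out[36])
{
    char bits[9];

    assert(bytes != NULL && out != NULL);
    for (size_t i = 0; i < 4; i++) {
        byte_to_bits(bytes[i], bits);
        memcpy(out + 9 * i, bits, 8);
        /* Dash between bytes, terminator after the last one. */
        out[9 * i + 8] = (i < 3) ? '-' : '\0';
    }
}
```

Passing the bytes 90 F4 01 00 produces 10010000-11110100-00000001-00000000, the a[1] row after the assignment.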

Let me know if you have any questions.

Upvotes: 1
