123454321

Reputation: 131

Need clarification about unsigned char * in C

Given the code:

...

int x = 123;

...

unsigned char * xx = (char *) & x;

...

I have xx[0] = 123, xx[1] = 0, xx[2] = 0, etc.

Can someone explain what is happening here? I don't have a great understanding of pointers in general, so the simpler the better.

Thanks

Upvotes: 3

Views: 285

Answers (4)

lurker

Reputation: 58244

I'll try to explain all the pieces in ASCII pictures.

int x = 123;

Here, x is the symbol representing a location of type int. Type int is 4 bytes on most current 32-bit and 64-bit platforms, although the C standard only requires it to hold at least 16 bits, so the exact size depends on the compiler and target. For this discussion, let's assume a 32-bit int (4 bytes).
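Since the size of int depends on the compiler and platform, it can be checked rather than assumed. A minimal sketch; the helper names int_size_in_bytes and bits_per_byte are made up for illustration:

```c
#include <limits.h>
#include <stddef.h>

/* Number of bytes the compiler uses for an int on this platform. */
size_t int_size_in_bytes(void)
{
    return sizeof(int);
}

/* Number of bits in one byte (one char); CHAR_BIT comes from <limits.h>. */
int bits_per_byte(void)
{
    return CHAR_BIT;
}
```

If int_size_in_bytes() returns 4, the 4-byte pictures in this answer match your machine.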

Memory on x86 is organized "little-endian", meaning that if a number needs more than one byte (its value does not fit in a single byte, that is, it is > 255 unsigned or > 127 signed), it is stored with the least significant byte at the lowest address. If your number were hexadecimal 0x12345678, it would be stored as:

x: 78        <-- address that `x` represents
   56        <-- x addr + 1 byte
   34        <-- x addr + 2 bytes
   12        <-- x addr + 3 bytes

Your number, decimal 123, is 7B hex, or 0000007B (all 4 bytes shown), so would look like:

x: 7B        <-- address that `x` represents
   00        <-- x addr + 1 byte
   00        <-- x addr + 2 bytes
   00        <-- x addr + 3 bytes

To make this clearer, let's make up a memory address for x, say, 0x00001000. Then the byte locations would have the following values:

    Address   Value
 x: 00001000  7B
    00001001  00
    00001002  00
    00001003  00

Now you have:

unsigned char * xx = (char *) & x;

Which defines a pointer to an unsigned char (an 8-bit, or 1-byte unsigned value, ranging 0-255) whose value is the address of your integer x. In other words, the value contained at location xx is 0x00001000.

xx:  00
     10
     00
     00

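The ampersand (&) indicates you want the address of x. Technically, the declaration isn't correct: the cast produces a char *, but xx is an unsigned char *, and those are different pointer types, so a compiler may warn about the mismatch. It should be cast as: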

unsigned char * xx = (unsigned char *) & x;

So now you have a pointer, or address, stored in the variable xx. That address points to x:

    Address   Value
 x: 00001000  7B      <-- xx points HERE (xx has the value 0x00001000)
    00001001  00
    00001002  00
    00001003  00

The value of xx[0] is what xx points to offset by 0 bytes. It's offset by bytes because the type of xx is a pointer to an unsigned char which is one byte. Therefore, each offset count from xx is by the size of that type. The value of xx[1] is just one byte higher in memory, which is the value 00. And so on. Pictorially:

    Address   Value
 x: 00001000  7B      <-- xx[0], or the value at `xx` + 0
    00001001  00      <-- xx[1], or the value at `xx` + 1
    00001002  00      <-- xx[2], or the value at `xx` + 2
    00001003  00      <-- xx[3], or the value at `xx` + 3
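The indexing described above can be sketched as a small function. This is only an illustration: byte_at is a hypothetical helper name, and which byte holds 123 depends on the byte order of the machine.

```c
#include <assert.h>
#include <stddef.h>

/* Return byte number i of the int that p points to, counting up from the
   lowest address. C allows any object to be read through unsigned char *. */
unsigned char byte_at(const int *p, size_t i)
{
    /* Same address as p, but each index step now moves by one byte. */
    const unsigned char *xx = (const unsigned char *)p;

    assert(i < sizeof *p);   /* stay inside the int */
    return xx[i];            /* identical to *(xx + i) */
}
```

With int x = 123; on a little-endian machine with a 4-byte int, byte_at(&x, 0) gives 123 and byte_at(&x, 1) through byte_at(&x, 3) give 0, matching the picture.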

Upvotes: 1

Paul Ogilvie

Reputation: 25286

unsigned char * xx = (char *) & x;

You take the address of x, tell the compiler to treat it as a pointer to characters (as if it pointed to a string), and assign it to xx, which is also a pointer to characters. The cast to (char *) just keeps the compiler happy.

Now if you print xx, or inspect it, what you see can depend on the machine: integers are stored in either the so-called little-endian or big-endian way. x86 is little-endian and stores the bytes of the integer in reverse order, least significant byte first. So storing 0x00000123 will store 0x23 0x01 0x00 0x00, which is what you see when inspecting the location xx points to as characters.
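Because the result depends on the machine, the byte order can be probed at run time with the same pointer trick. A minimal sketch; is_little_endian is a hypothetical helper name, not a standard function:

```c
#include <stdint.h>

/* Returns 1 if the lowest-addressed byte of a multi-byte integer holds its
   least significant byte (little-endian), 0 otherwise. */
int is_little_endian(void)
{
    uint32_t probe = 1;   /* 0x00000001: only the least significant byte is nonzero */
    const unsigned char *bytes = (const unsigned char *)&probe;

    return bytes[0] == 1; /* the 1 sits at the lowest address on little-endian */
}
```

On x86 this returns 1.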

Upvotes: 0

MooseBoys

Reputation: 6793

You're accessing the bytes (chars) of a little-endian int in sequence. The number 123 in an int on a little-endian system will usually be stored as {123,0,0,0}. If your number had been 783 (256 * 3 + 15), it would be stored as {15,3,0,0}.
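The arithmetic behind those byte lists can be sketched with shifts, which give the same answer on any machine. The function name to_little_endian_bytes is made up for this example, and it assumes a 32-bit value:

```c
#include <stdint.h>

/* Write the four bytes of value into out[], least significant byte first.
   This is the order a little-endian machine uses in memory; computing it
   with shifts makes the result independent of the host's byte order. */
void to_little_endian_bytes(uint32_t value, unsigned char out[4])
{
    for (int i = 0; i < 4; i++) {
        /* Shift byte i down to the bottom, then keep only the low 8 bits. */
        out[i] = (unsigned char)((value >> (8 * i)) & 0xFFu);
    }
}
```

Calling it with 783 fills out with {15, 3, 0, 0}, and with 123 it fills out with {123, 0, 0, 0}.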

Upvotes: 3

Joseph Willcoxson

Reputation: 6040

Yeah, you're doing something you shouldn't be doing...

That said... Part of the result comes from working on a little-endian processor. The int x = 123; statement allocates 4 bytes on the stack and initializes them with the value 123. Since it is little-endian, the memory looks like 123, 0, 0, 0. If it were big-endian, it would be 0, 0, 0, 123. Your char pointer is pointing to the first byte of memory where x is stored.
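To see that both layouts describe the same number, the bytes can be put back together in either order. A rough sketch assuming 4-byte values; from_little_endian and from_big_endian are hypothetical names chosen for illustration:

```c
#include <stdint.h>

/* Rebuild a 32-bit value from bytes stored least significant byte first. */
uint32_t from_little_endian(const unsigned char b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Rebuild a 32-bit value from bytes stored most significant byte first. */
uint32_t from_big_endian(const unsigned char b[4])
{
    return ((uint32_t)b[0] << 24)
         | ((uint32_t)b[1] << 16)
         | ((uint32_t)b[2] << 8)
         | (uint32_t)b[3];
}
```

Feeding {123, 0, 0, 0} to from_little_endian and {0, 0, 0, 123} to from_big_endian both give 123.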

Upvotes: 0
