user11001001

Reputation:

address-of (&) and indirection (*) in C

I have a question about the address-of (&) and indirection (*) operators in C.

If

then what would be the value of ptr?

Does the result indicate the base address of var or the whole 4 bytes (on my platform)? If it points to the base address only, then why does *ptr evaluate to the whole contents of var? Isn't it supposed to show only the contents of the base address of var?

Upvotes: 1

Views: 632

Answers (6)

Luis Colorado

Reputation: 12708

The internal value (representation) of ptr is, in general, unspecified. The specification gives that value a meaning only when you dereference it. All that can be said about it is that it holds whatever information the computer needs to find the object of the referenced type.

Along with the pointer type, the compiler keeps track of the referenced type for values of that pointer type. This is what makes pointer arithmetic possible (a powerful idea characteristic of C) and what lets the pointed-to values be used in expressions.

But the standard doesn't say anything about the internal representation of the actual pointer value.

It is specified this way to give different architectures the freedom to implement the addresses of values as best they can.

This makes it possible, for example, to implement a pointer on Intel 32-bit processors as a segment:offset data type, which as such has no valid interpretation as a single number. In a non-transparent virtual memory architecture, a pointer might have to specify the disk device, block number, and offset within the block where the pointed-to value is stored, and such a triple is difficult to interpret as a number. One can argue that a segment:offset value can be interpreted as a number (as on the old 8086 processors, in which you got the linear address by shifting the segment selector left by 4 bits and then adding the offset). But if you try that on modern virtual memory architectures, you'll find there is no easy way to compute the actual memory address, because some of the information is hidden by the operating system. The segment selector is only a descriptor for a possibly large amount of OS-hidden meta-information (where the segment is located in the linear address space, how far it extends, whether you have permission to access the pointed-to memory as data or as executable code, etc.).

Indeed, two different pointers can refer to the same final data. Consider two pointers generated by different means, composed of different segment selectors and the same or different offsets, that map, after translation, to overlapping segments and finally to the same physical memory location. In that case there may be no way even to compare the pointers, yet accessing the referenced value through either one gives you the same data. Dealing with this kind of pointer can be painful and forces you to be very careful with pointer arithmetic (read about far pointers in old Microsoft compilers), but it is no impediment to implementing a C compiler on such an architecture.

Upvotes: 0

John Bode

Reputation: 123558

It may help to remember that the type of the expression *ptr is int. That is, given the declarations

int var = 5;
int *ptr = &var;

then the following relationships are true:

 ptr == &var       // int * == int *
*ptr ==  var == 5  // int   == int   == int

Yes, the value of ptr is the address of the first byte (see note 1) of var. However, the expression *ptr refers to the whole int value stored in var.

Pointers are an abstraction of a memory address with additional type semantics such that pointer operations work the same way on all types. If ptr points to a 4-byte int, then *ptr evaluates to that 4-byte int value. If it points to an 8-byte double, then *ptr evaluates to that 8-byte double value.


  1. Where the "first byte" may be either the most significant byte (big-endian architecture) or least significant byte (little-endian architecture). C hides the distinction behind the `int` type abstraction so you don't have to worry about it.

Upvotes: 1

H.S.

Reputation: 12679

Does the result indicate the base address of var or the whole 4 bytes (on my platform)? If it points to the base address only, then why does *ptr evaluate to the whole contents of var? Isn't it supposed to show only the contents of the base address of var?

As you mentioned, int is 4 bytes on your platform.
var is a variable of type int:

    100   101   102   103
var -------------------------
    |     |     |     |     |
    -------------------------
     byte1 byte2 byte3 byte4

where 100-103 are the addresses of byte1-byte4, respectively.

ptr is a pointer to int and is pointing to var.
When you write int *ptr = &var;, it means something like this:

   ptr              100   101   102   103
   --------     var -------------------------
   | &var |-------->|     |     |     |     |
   --------         -------------------------
                     byte1 byte2 byte3 byte4

The type of ptr is int *, i.e. pointer to int, so the type of *ptr is int. That means that when you dereference ptr, you get the value at the address ptr points to, and the type of that value is the pointed-to type, which is int in your case. That's why *ptr evaluates to the whole int and not just the byte at the base address.


Note that if you do this

char *c_ptr = (char *)&var;

it changes how the memory at the address of var is interpreted when accessed through c_ptr: *c_ptr is interpreted as a char. The addresses that ptr and c_ptr point to are nevertheless numerically the same.

Check this:

#include <stdio.h>

int main() {
    int var = 50;
    int *i_ptr = &var;
    char *c_ptr = (char *)&var;

    printf ("address of var: %p\n", (void *)&var);
    printf ("i_ptr: %p\n", (void *)i_ptr);
    printf ("i_char: %p\n\n", (void *)c_ptr);
    printf ("value of var: %d\n", var);
    printf ("value of *i_ptr: %d\n", *i_ptr);

    for (size_t i = 0; i < sizeof(int); i++) {
        printf ("Address of byte[%zu]: %p, ", i, (void *)&c_ptr[i]);
        printf ("byte[%zu]: %c\n", i, c_ptr[i]);
    }
    return 0;
}

Output on little-endian architecture:

address of var: 0x7ffeea3ac9f8
i_ptr: 0x7ffeea3ac9f8          <========\
i_char: 0x7ffeea3ac9f8         <========/ the address pointing to is same

value of var: 50
value of *i_ptr: 50
Address of byte[0]: 0x7ffeea3ac9f8, byte[0]: 2     <========= character with ASCII code 50
Address of byte[1]: 0x7ffeea3ac9f9, byte[1]: 
Address of byte[2]: 0x7ffeea3ac9fa, byte[2]: 
Address of byte[3]: 0x7ffeea3ac9fb, byte[3]: 

Output on big-endian architecture:

address of var: ffbffbd0
i_ptr: ffbffbd0
i_char: ffbffbd0

value of var: 50
value of *i_ptr: 50
Address of byte[0]: ffbffbd0, byte[0]: 
Address of byte[1]: ffbffbd1, byte[1]: 
Address of byte[2]: ffbffbd2, byte[2]: 
Address of byte[3]: ffbffbd3, byte[3]: 2

Upvotes: 0

Sourav Ghosh

Reputation: 134376

A pointer is a type, like any other.

A pointer that can be dereferenced has to point to a complete type. Each type has a defined size (either predefined or user-defined), so the dereference takes into account the size of the type of object the pointer points to.

Quoting C11, chapter §6.5.3.2

The unary * operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object. If the operand has type ‘‘pointer to type’’, the result has type ‘‘type’’. If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.

The diagram below may help to understand it better.

Say,

int x = 10;
int * px = &x;

and, in that platform, sizeof(int) == 4.

Now, in memory, it would look like

          +-------------------------------------------+
          |                                           |
          |              int x  = 10;                 |
          |                                           |
          +-------------------------------------------+

      0x1000                                        0x1004


           +------------------------------------------+
           |                                          |
           |     int * px  = &x; (0x1000)             |
           |                                          |
           +------------------------------------------+
  • The whole block of memory from 0x1000 to 0x1003 (up to, but not including, address 0x1004) is allocated for the variable x of type int.
  • px is the pointer, pointing to the start of that memory block. When defining px, we told the compiler that it points to an int, so the compiler knows that the memory block whose address is stored in px holds 4 bytes, and that indirection (the * operator) must read all 4 bytes and return the result.

Thus, when you write *px, the compiler reads all 4 bytes and returns the value.

Upvotes: 0

dbush

Reputation: 224577

The pointer variable ptr will contain the address where var starts, i.e. the address of the first byte of var. If you dereference ptr as *ptr you will get the value of var.

Assuming an int is 4 bytes, *ptr "knows" to read 4 bytes starting at that address because of the type of the pointer. Since ptr has type int *, *ptr has type int, so those 4 bytes are read as an int.

For example:

#include <stdio.h>

int main(void)
{
    int var = 4;
    int *ptr = &var;                      /* ptr holds the address of var */
    printf("ptr = %p\n", (void *)ptr);
    printf("*ptr = %d\n", *ptr);          /* reads the whole int */
    printf("&var = %p\n", (void *)&var);
    printf("var = %d\n", var);
}

Output:

ptr = 0x7ffc330b4484
*ptr = 4
&var = 0x7ffc330b4484
var = 4

Upvotes: 2

zwol

Reputation: 140796

ptr, being an int *, points to the whole int, or as you put it, the whole sizeof(int) bytes.

(unsigned char *)ptr points to the "base address", as you put it.

ptr and (unsigned char *)ptr will have the same numeric value on all common CPU architectures, which demonstrates that the difference between pointing to the "whole" integer and pointing to just the "base address" is entirely a matter of what type the pointer has. It's vital that you understand that two variables with different types can still have the same numeric value.

Upvotes: 4
