I have a question about the address-of (&) and indirection (*) operators in C.
If
var
is a variable of type int
ptr
is a pointer to int
and is pointing to var
then what would be the value of ptr
?
Does that value refer to the base address of var
or to the whole 4 bytes (on my platform)? If it refers to the base address only, then why does *ptr
evaluate to the whole contents of var
? Isn't it supposed to show the contents of the base address of var
only?
Upvotes: 1
Views: 632
Reputation: 12708
The internal representation of ptr
is, in general, not specified by the standard. A pointer value only acquires a meaning when you dereference it. All you can be told about it is that it holds whatever information the implementation needs to find an object of the referenced type.
The compiler records the referenced type as part of the pointer type. This is what makes pointer arithmetic possible (a powerful idea that is characteristic of C) and lets the pointed-to values be used in expressions.
But the standard doesn't say anything about the internal representation of the actual pointer value.
This is deliberate: it leaves each architecture free to implement addresses in whatever way suits it best.
This makes it possible, for example, to implement a pointer on Intel 32-bit processors as a segment:offset
pair, which, as such, has no valid interpretation as a single number. On a non-transparent virtual memory architecture, a pointer might have to specify the disk device, the block number and the offset within the block where the pointed-to value is stored, and such a triple is difficult to interpret as a number. One can argue that a segment:offset
value can be interpreted as a number (as on the old 8086 processors, where you got the linear address by shifting the segment value left by 4 bits and then adding the offset). But on modern virtual memory architectures there is no easy way to compute the actual memory address, because part of the information is hidden by the operating system. The segment
selector is only a handle to a possibly large amount of metadata hidden by the OS (where the segment is located in the linear address space, how far it extends, whether you are permitted to dereference a pointer into it as data or as executable code, etc.).
Indeed, two different pointers can refer to the same data. Consider two pointers that were generated by different means, made up of different segment selectors and the same or different offsets, but that map, after translation, to overlapping segments and end up pointing to the same physical memory location. In that case it may not even be possible to compare the pointers meaningfully, yet when you access the referenced values you get the same data through both. Dealing with this kind of pointer can be painful and forces you to be very careful with pointer arithmetic (read about far
pointers in old Microsoft compilers), but it is no impediment to implementing a C compiler on such an architecture.
Upvotes: 0
Reputation: 123558
It may help to remember that the type of the expression *ptr
is int
. That is, given the declarations
int var = 5;
int *ptr = &var;
then the following relationships are true:
ptr == &var // int * == int *
*ptr == var // int == int, and both have the value 5
Yes, the value of ptr
is the address of the first byte of var
. However, the expression *ptr
refers to the whole int
value stored in var
.
Pointers are an abstraction of a memory address with additional type semantics such that pointer operations work the same way on all types. If ptr
points to a 4-byte int
, then *ptr
evaluates to that 4-byte int
value. If it points to an 8-byte double
, then *ptr
evaluates to that 8-byte double
value.
Upvotes: 1
Reputation: 12679
Does that value refer to the base address of var or to the whole 4 bytes (on my platform)? If it refers to the base address only, then why does *ptr evaluate to the whole contents of var? Isn't it supposed to show the contents of the base address of var only?
As you mentioned, on the platform you are using an int
is 4 bytes wide.
var
is a variable of type int
:
          100   101   102   103
var     -------------------------
        |     |     |     |     |
        -------------------------
         byte1 byte2 byte3 byte4
where 100 to 103 are the addresses of byte1 to byte4, respectively.
ptr
is a pointer to int
and is pointing to var
.
When you write int *ptr = &var;
, the result looks something like this:
  ptr              100   101   102   103
-------- var     -------------------------
| &var |-------->|     |     |     |     |
--------         -------------------------
                  byte1 byte2 byte3 byte4
The type of ptr
is int *
, i.e. pointer to int. So the type of *ptr
is int
. That means that when you dereference ptr
, you get the value stored at the address ptr
points to, and the pointer's type determines the type of that value, which is int
in your case. That's why *ptr
evaluates to the whole int
and not just the byte at the base address.
Note that if you do this
char *c_ptr = (char *)&var;
it changes how the address of var
is interpreted when accessed through c_ptr
: *c_ptr
is interpreted as a char
. The addresses that ptr
and c_ptr
point to are nevertheless numerically the same.
Check this:
#include <stdio.h>

int main(void) {
    int var = 50;
    int *i_ptr = &var;
    char *c_ptr = (char *)&var;   /* same address, viewed as bytes */

    printf ("address of var: %p\n", (void *)&var);
    printf ("i_ptr: %p\n", (void *)i_ptr);
    printf ("i_char: %p\n\n", (void *)c_ptr);
    printf ("value of var: %d\n", var);
    printf ("value of *i_ptr: %d\n", *i_ptr);

    /* Walk over the individual bytes of var and print each as a character. */
    for (size_t i = 0; i < sizeof(int); i++) {
        printf ("Address of byte[%zu]: %p, ", i, (void *)&c_ptr[i]);
        printf ("byte[%zu]: %c\n", i, c_ptr[i]);
    }
    return 0;
}
Output on little-endian architecture:
address of var: 0x7ffeea3ac9f8
i_ptr: 0x7ffeea3ac9f8 <========\
i_char: 0x7ffeea3ac9f8 <========/ the address pointing to is same
value of var: 50
value of *i_ptr: 50
Address of byte[0]: 0x7ffeea3ac9f8, byte[0]: 2 <========= the character with ASCII code 50
Address of byte[1]: 0x7ffeea3ac9f9, byte[1]:
Address of byte[2]: 0x7ffeea3ac9fa, byte[2]:
Address of byte[3]: 0x7ffeea3ac9fb, byte[3]:
Output on big-endian architecture:
address of var: ffbffbd0
i_ptr: ffbffbd0
i_char: ffbffbd0
value of var: 50
value of *i_ptr: 50
Address of byte[0]: ffbffbd0, byte[0]:
Address of byte[1]: ffbffbd1, byte[1]:
Address of byte[2]: ffbffbd2, byte[2]:
Address of byte[3]: ffbffbd3, byte[3]: 2
Upvotes: 0
Reputation: 134376
A pointer type is a type like any other.
To be dereferenced, a pointer to an object has to point to a complete type. Every complete type has a defined size (whether the type is built in or user defined), so the dereference takes into account the size of the type of object the pointer points to.
Quoting C11, chapter §6.5.3.2:
The unary * operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object. If the operand has type ‘‘pointer to type’’, the result has type ‘‘type’’. If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.
Let's see the below graphics to understand it better.
Say,
int x = 10;
int * px = &x;
and, in that platform, sizeof(int) == 4
.
Now, in memory, it would look like
+-------------------------------------------+
|                                           |
|                int x = 10;                |
|                                           |
+-------------------------------------------+
0x1000                                 0x1004
+------------------------------------------+
|                                          |
|       pointer * px = &x; (0x1000)        |
|                                          |
+------------------------------------------+
0x1000
to 0x1003
(up to, but not including, address 0x1004
) is allocated for the variable x
of type int
. px
is the pointer, pointing to the start of that memory block. Also, when defining px
, we told the compiler that it points to an int
, so the compiler knows that the memory block whose address is stored in px
holds 4 bytes, and that to get the data during indirection (using the *
operator) all 4 bytes have to be read before the result is returned. Thus, for *px
, the compiler generates code that reads all 4 bytes and returns the value.
Upvotes: 0
Reputation: 224577
The pointer variable ptr
will contain the address where var
starts, i.e. the address of the first byte of var
. If you dereference ptr
as *ptr
you will get the value of var
.
Assuming an int
is 4 bytes, using *ptr
"knows" to read the next 4 bytes because of the type of the pointer. Since ptr
has type int *
this means *ptr
has type int
, so the next 4 bytes are read as an int
.
For example:
#include <stdio.h>

int main(void) {
    int var = 4;
    int *ptr = &var;
    printf("ptr = %p\n", (void *)ptr);
    printf("*ptr = %d\n", *ptr);
    printf("&var = %p\n", (void *)&var);
    printf("var = %d\n", var);
    return 0;
}
Output:
ptr = 0x7ffc330b4484
*ptr = 4
&var = 0x7ffc330b4484
var = 4
Upvotes: 2
Reputation: 140796
ptr
, being an int *
, points to the whole int
, or as you put it, the whole sizeof(int)
bytes.
(unsigned char *)ptr
points to the "base address", as you put it.
ptr
and (unsigned char *)ptr
will have the same numeric value on all common CPU architectures, which demonstrates that the difference between pointing to the "whole" integer and pointing to just the "base address" is entirely a matter of what type the pointer has. It's vital that you understand that two variables with different types can still have the same numeric value.
Upvotes: 4