Reputation: 1449
What does the C standard say about pointer arithmetic that produces a pointer to another struct member from the address of the previous member in the same struct?
int mystery_1(void)
{
    int one = 1, two = 2;
    int *p1 = &one + 1;
    int *p2 = &two;
    unsigned long i1 = (unsigned long) p1;
    unsigned long i2 = (unsigned long) p2;
    if (i1 == i2)
        return p1 == p2;
    return 2;
}
For code 1, I know that the result is not determined, because there is no guarantee about how local variables are laid out on the stack.
What if I use struct like this (code 2)?
int mystery_2(void)
{
    struct { int one, two; } my_var = {
        .one = 1, .two = 2
    };
    int *p1 = &my_var.one + 1;
    int *p2 = &my_var.two;
    unsigned long i1 = (unsigned long) p1;
    unsigned long i2 = (unsigned long) p2;
    if (i1 == i2)
        return p1 == p2;
    return 2;
}
Godbolt link: https://godbolt.org/z/jGoKfETn7
gcc:
mystery_1:
    xorl %eax, %eax   # return 0, while clang returns 2 (fine as no guarantee)
    ret
mystery_2:
    movl $1, %eax     # return 1, as compiler must consider the memory order of struct members
    ret
clang:
mystery_1:            # @mystery_1
    movl $2, %eax     # return 2, while gcc returns 0 (fine as no guarantee)
    retq
mystery_2:            # @mystery_2
    movl $1, %eax     # return 1, as compiler must consider the memory order of struct members
    retq
My understanding is that mystery_2 returns 1, as p1 == p2 yields true, because the struct guarantees the memory layout. So the address after my_var.one is that of my_var.two, and the compiler is not allowed to assume that p1 and p2 are different because of their provenance.
Is my understanding correct?
According to C standard, does mystery_2 always return 1 as p1 == p2 yields true?
For mystery_2, is the compiler allowed to assume that p1 != p2, so the function returns 0?
I had a discussion with someone regarding the struct case (mystery_2); they said that:
p1 points to (one past) one, and p2 points to two. Those are, in C spec, counted as different "objects". The spec then goes on to define that pointers to different objects might compare as different, even though both pointers have the exact same bit pattern.
Upvotes: 1
Views: 82
Reputation: 222763
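Two basics of pointer arithmetic are, per C 2018 6.5.6 7 and 8: a pointer to an object that is not an array element behaves like a pointer to the first element of an array of length one, and pointer arithmetic may produce a pointer to just beyond the last element of an array.
Therefore int *p1 = &one + 1; has defined behavior.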
Regarding:
unsigned long i1 = (unsigned long) p1;
unsigned long i2 = (unsigned long) p2;
Since it is not the focus of this question, let’s assume the implementation-defined conversion of a pointer to an unsigned long produces a unique value that uniquely identifies the pointer value. (That is, conversion of any address to an unsigned long only ever produces one value for that address, and conversion of the value back to a pointer reproduces the address. The C standard does not guarantee this.)
Then i1 == i2 implies p1 == p2, and vice versa. Per C 2018 6.5.9 6, p1 and p2 can compare equal only if two (which p2 points to) has been laid out in memory immediately after one (which p1 points just beyond). (In general, pointers can compare equal for other reasons, but those cases involve pointers to the same object, a structure and its first member, the same function, and so on, all of which are ruled out for this particular p1 and p2.)
So the code in Code 1 will return 1 if two is laid out in memory just after one and 2 otherwise.
The same is true in Code 2. The pointer arithmetic &my_var.one + 1 is defined, and the resulting p1 compares equal to p2 if and only if the member two immediately follows the member one in memory.
However, two does not have to immediately follow one. This statement is incorrect:
… struct guarantees the memory layout.
The C standard allows implementations to put padding between structure members. Common C implementations will not do this for struct { int one, two; } because it is not needed for alignment (once one is aligned, the address immediately following it is also suitably aligned for int, so no padding is needed), but the C standard does not guarantee it.
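Whether a given implementation inserts padding here can be checked directly. The following is a minimal sketch, assuming a named struct pair (a hypothetical stand-in for the question's anonymous struct) and two illustrative helper functions that are not part of the original code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical named version of the question's anonymous struct. */
struct pair { int one, two; };

/* Members are laid out in declaration order, so `two` cannot start
   before the end of `one`; padding can only push it further away. */
static_assert(offsetof(struct pair, two) >= sizeof(int),
              "two cannot overlap one");

/* Returns 1 if `two` starts exactly where `one` ends (no padding), else 0. */
int two_follows_one(void)
{
    return offsetof(struct pair, two)
           == offsetof(struct pair, one) + sizeof(int);
}

/* Mirrors the pointer comparison in mystery_2, without the integer casts. */
int one_past_equals_two(void)
{
    struct pair v = { .one = 1, .two = 2 };
    return &v.one + 1 == &v.two;
}
```

The two functions should agree on any conforming implementation: the pointers compare equal exactly when there is no padding between the members.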
uintptr_t, declared in <stdint.h>, is a better choice for converting pointers to integers. However, the standard only guarantees that (uintptr_t) px == (uintptr_t) py implies px == py, not that px == py implies (uintptr_t) px == (uintptr_t) py. In other words, converting two pointers to the same object to uintptr_t might produce two different values, although converting them back to pointers will result in pointers that compare as equal.
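As a rough illustration of the round-trip guarantee, the sketch below converts a pointer to uintptr_t and back. The function name is made up for this example, and it assumes the implementation provides uintptr_t, which is an optional type:

```c
#include <assert.h>
#include <stdint.h>

/* Converts a pointer to uintptr_t and back, then reads through the result. */
int round_trip_preserves_pointer(void)
{
    int value = 42;
    void *original = &value;

    /* The integer value itself is implementation-defined. */
    uintptr_t as_integer = (uintptr_t) original;
    void *restored = (void *) as_integer;

    /* Guaranteed for uintptr_t: the restored pointer compares equal. */
    assert(restored == original);
    return *(int *) restored;
}
```

Note that this only demonstrates the pointer-to-integer-to-pointer direction. It says nothing about whether two equal pointers convert to equal integers.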
Upvotes: 1
Reputation: 121397
Is my understanding correct?
No.
You're correct about the local variables, but not about the struct example.
According to C standard, does mystery_2 always return 1 as p1 == p2 yields true?
No. That's not guaranteed by the C standard, because there can be padding between one and two.
Practically, there's no reason for any compiler to insert padding between them in this example, and you can nearly always expect mystery_2 to return 1. But this is not required by the C standard, and thus a pathological compiler could insert padding between one and two and that'd be perfectly valid.
With respect to padding: The only guarantee is that there can't be any padding before the first member of a struct. So a pointer to a struct and a pointer to its first member are guaranteed to be the same. No other guarantees whatsoever.
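The first-member guarantee can be sketched as follows, again assuming a hypothetical named struct pair and an illustrative helper function:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical named version of the question's anonymous struct. */
struct pair { int one, two; };

/* No padding is allowed before the first member, so its offset is 0. */
static_assert(offsetof(struct pair, one) == 0,
              "first member starts at the beginning of the struct");

/* Returns 1 if the struct and its first member share an address. */
int struct_and_first_member_coincide(void)
{
    struct pair v = { .one = 1, .two = 2 };
    return (void *) &v == (void *) &v.one;
}
```

No comparable assertion can be written portably for the offset of two, since padding after one is permitted.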
Note: you should be using uintptr_t for storing pointer values (unsigned long isn't guaranteed to be able to hold a pointer value).
Upvotes: 1