Reputation: 2594
Let's say I want to move a void* pointer by 4 bytes. Are the following equivalent?
A:
#include <stdint.h>

void* new_address(void* in_ptr) {
    intptr_t tmp = (intptr_t)in_ptr;
    intptr_t new_address = tmp + 4;
    return (void*)new_address;
}
B:
void* new_address(void* in_ptr) {
    char* tmp = (char*)in_ptr;
    char* new_address = tmp + 4;
    return (void*)new_address;
}
Are both defined behavior? Is one a more popular/accepted convention? Any other reason to use one over the other?
Let's only consider 64-bit systems. If intptr_t is not available we can use int64_t instead.
The context is a custom memory allocator which needs to move the pointer to a specific address (for alignment purposes) before allocating a new block of memory. We don't yet know what object the resulting pointer is going to point to, but we know we need to move it to a specific location, which in the examples above is 4 bytes further on.
Upvotes: 3
Views: 1103
Reputation: 8589
No 'strictly conforming program' uses A. Using the result may be Undefined Behaviour, as there is no requirement for addition on an intptr_t to be reflected in the pointer value if that intptr_t is converted back to a pointer.
It is both unspecified behaviour and implementation-defined.
If the optional type intptr_t is defined, all you are guaranteed is that you can convert void * to intptr_t and then convert that value back to void *, and the two values will compare equal (==).
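To make that guarantee concrete, here is a minimal sketch. The helper name round_trip is made up for illustration. Note that it converts back exactly the integer it got, with no arithmetic in between, which is the only case the standard covers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the question): round-trips a pointer
   through intptr_t without modifying the integer in between. */
void *round_trip(void *p)
{
    intptr_t as_int = (intptr_t)p;  /* pointer -> integer; the value is implementation-defined */
    void *back = (void *)as_int;    /* the unmodified integer -> pointer */
    assert(back == p);              /* the one thing the standard guarantees */
    return back;
}
```

Calling round_trip(&x) for some object x returns a pointer that compares equal to &x. Nothing comparable is promised once as_int has been changed.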
The strictly conforming way to perform pointer arithmetic is B. B is guaranteed to work if and only if the pointer in_ptr is valid and, within the largest enclosing object, there are 3 or more bytes beyond the byte it points to. It's 3 because it's valid to point to (but not dereference) the address that is (logically) one byte beyond the end of an object.
Here "object" includes a declared object (including an array) or a block of memory such as one returned by malloc().
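As a rough sketch of that rule, the helper below only advances when the result stays inside the object or lands exactly one past its end. The name advance_bytes and the idea of passing the object size explicitly are my assumptions, not something from the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: advance `base` by `offset` bytes, where `base`
   points to the start of an object that is `size` bytes long.
   Returns NULL if the result would fall outside the range in which
   char* arithmetic is defined. */
void *advance_bytes(void *base, size_t size, size_t offset)
{
    assert(base != NULL);
    if (offset > size)      /* offset == size is allowed: one past the end */
        return NULL;
    return (char *)base + offset;
}
```

With a char buf[8], advance_bytes(buf, sizeof buf, 8) yields the one-past-the-end pointer, which may be compared but not dereferenced, while an offset of 9 is rejected.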
Good practice is to write 'strictly conforming' programs where possible, so good practice is to prefer B over A.
According to the standard, using the resulting pointer (as a pointer) may result in Undefined Behaviour because the value may be (implementation-defined) a trap representation.
A strictly conforming program is defined as follows: "A strictly conforming program shall use only those features of the language and library specified in this International Standard. It shall not produce output dependent on any unspecified, undefined, or implementation-defined behavior, and shall not exceed any minimum implementation limit."
There's some disagreement about whether the code offered for A is unspecified or implementation-defined. The standard says both, because implementation-defined behaviour is a sub-category of unspecified behaviour. However, because the implementation may document the value as a trap representation, using it may result in Undefined Behaviour.
But I hope that is swept aside by the fact that 'strictly conforming programs' don't depend on unspecified, undefined or implementation defined behaviour. So good practice here is certainly B.
Consider a secure environment that encrypts pointer values to deliberately confound the de-referencing of arbitrary pointer values. In principle it could provide intptr_t and be conformant.
Though I still maintain that if A doesn't work then, intptr_t being an optional type, it would be better not to provide it. Whether it is defined is unspecified and implementation-dependent. That's because no 'strictly conforming program' uses it and it has no practical use other than to manipulate a pointer as an arithmetic type in a way not supported by pointer arithmetic on a compatible pointer type char *. The snippet in A falls into that category.
To store a void *, declare a void * or char[sizeof(void*)], or use malloc() or similar. To overlay a void * over an arithmetic type, declare a union and benefit from the fact that the union will be aligned for a void *.
But according to the specification, A is unspecified and implementation-defined, no 'strictly conforming program' can rely on it, and it may result in Undefined Behaviour.
A very long-winded way of saying that the answer, here, is B.
Upvotes: 2
Reputation: 140970
Is casting a pointer to intptr_t [...] defined behavior?
Converting a pointer to any integer type is defined and the result is implementation-defined, except when the result can't be represented in the integer type, in which case it's undefined behavior. See C11 6.3.2.3p6. But intptr_t has to be able to represent void*, so the behavior is defined.
, doing arithmetic on it and then casting back, defined behavior?
Any integer may be converted to any pointer type. The resulting pointer is implementation-defined: there is no guarantee that adding 4 to an intptr_t will increment the pointer value by 4. See C11 6.3.2.3p5.
Are both defined behavior?
Yes, however the result is implementation defined.
Is one more popular/accepted convention?
Subjective: I'd say using uintptr_t is more popular than intptr_t. Converting a pointer to uintptr_t or to char* to do some arithmetic both happen in real code; I can't say which is more popular.
Any other reason to use one over the other?
Not really, but I would go with char*.
When it comes to actually accessing the data behind the resulting pointer, it depends. If the resulting pointer points within the same object then you're fine (remember, the conversion is implementation-defined). If the resulting pointer does not point to the same object, I believe the best interpretation comes from reading N2263 Clarifying Pointer Provenance v4, 2.2.3 Q5, and I think it is this: the current C11 standard does not clearly specify that case, which would make the behavior not defined.
Because you tagged gcc, both code snippets should compile to equivalent code. I believe pointers are converted 1:1 to (u)intptr_t on all architectures gcc supports. The GCC documentation on implementation-defined behavior (4.7 Arrays and Pointers) states that when casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined. So you're safe as long as the resulting pointer points to the same object.
The context is a custom memory allocator
See implementations of the container_of and offsetof macros. Do not hardcode + 4 in your code, and if you do, do not depend on alignment requirements being met when accessing the resulting pointers: remember to use memcpy to safely copy the contents, or handle alignment properly. Do not reinvent the wheel. When in doubt, see other implementations like glibc malloc.c or newlib malloc.c: they both calculate on char* in the mem2chunk macro, but also happen to do calculations on uintptr_t integers.
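For the alignment case, one way to combine the two styles is to use the integer only to work out how much padding is needed and then move the pointer itself with char* arithmetic. This is a minimal sketch, not taken from glibc or newlib. The name align_up is hypothetical, and it assumes the uintptr_t value reflects the address, which is implementation-defined but holds on gcc as far as I know.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round `p` up to the next multiple of `align`,
   which must be a power of two.
   ASSUMPTION: (uintptr_t)p reflects the address (implementation-defined).
   The caller must make sure the padded pointer still lies inside the
   same block of memory that `p` points into. */
void *align_up(void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    uintptr_t addr = (uintptr_t)p;                  /* integer used for inspection only */
    size_t misalign = (size_t)(addr & (align - 1)); /* bytes past the previous boundary */
    size_t padding = misalign ? align - misalign : 0;

    return (char *)p + padding;                     /* the pointer is moved as a char* */
}
```

An allocator would call something like align_up(cursor, 16) before handing out a block, after checking that the padding still fits in its pool.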
Upvotes: 2
Reputation: 20901
Michael Kerrisk says on page 1415 that,
The C standards make one exception to the rule that pointers of different types need not have the same representation: pointers of the types char * and void * are required to have the same internal representation.
All the C standard guarantees (7.18.1.4) is that you can convert void* values to intptr_t (or uintptr_t) and back again and end up with an equal value for the pointer.
The nuance here is that standard C does not allow arithmetic on a void* itself (GCC accepts it only as an extension), although comparing void* values with == is allowed.
Upvotes: 2