Reputation: 1399
I noticed this warning from Clang:
warning: performing pointer arithmetic on a null pointer
has undefined behavior [-Wnull-pointer-arithmetic]
In detail, this is the code that triggers the warning:
int *start = ((int*)0);
int *end = ((int*)0) + count;
The integer constant zero converted to any pointer type yields a null pointer, which does not point to any contiguous area of memory but still has the pointer type needed to do pointer arithmetic.
Why would arithmetic on a null pointer be forbidden when doing the same on a non-null pointer obtained from a nonzero integer does not trigger any warning?
And more importantly, does the C standard explicitly forbid null pointer arithmetic?
Also, this code will not trigger the warning, but this is because the pointer is not evaluated at compile time:
int *start = ((int*)0);
int *end = start + count;
But a good way of avoiding the undefined behavior is to explicitly cast an integer value to the pointer:
int *end = (int *)(sizeof(int) * count);
Upvotes: 11
Views: 4140
Reputation: 81247
When the C Standard was written, the vast majority of C implementations would, for any non-void* pointer value p, uphold the invariants that p+0 and p-0 both yield p, and p-p will yield zero. More generally, operations like a size-zero memcpy or fwrite that operate on a buffer of size N would ignore the buffer address when N was zero. Such behavior would allow programmers to avoid having to write code to handle corner cases. For example, code to output a packet with an optional payload passed via address and length arguments would naturally process (NULL,0) as an empty payload.
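A minimal sketch of the packet case, assuming a hypothetical put_payload helper that is not part of the original answer. It shows the strictly portable form: the empty payload is checked explicitly, because C11 7.24.1 requires pointer arguments to memcpy to be valid even when the length is zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration): copies an optional
   payload into dst. Returns 0 on success, -1 if the payload does not fit. */
int put_payload(unsigned char *dst, size_t dst_cap,
                const void *payload, size_t len)
{
    assert(dst != NULL);
    assert(payload != NULL || len == 0);  /* (NULL, 0) means "no payload" */

    if (len > dst_cap)
        return -1;

    /* Skip the call for an empty payload so that a null pointer
       never reaches memcpy. */
    if (len != 0)
        memcpy(dst, payload, len);
    return 0;
}
```

On implementations that ignore the address of a zero-length buffer, the `len != 0` test is redundant; it is there only so the code does not depend on that behavior.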
Nothing in the published Rationale for the C Standard suggests that implementations whose target platforms would naturally behave in such fashion shouldn't continue to work as they always had. There were, however, a few platforms where it may have been expensive to uphold such behavioral guarantees in cases where p is null.
As with most situations where the vast majority of C implementations would process a construct identically, but implementations might exist where such treatment would be impractical, the Standard characterizes the addition of zero to a null pointer as Undefined Behavior. The Standard allows implementations to, as a form of "conforming language extension", define the behavior of constructs in cases where it imposes no requirements, and it allows conforming (but not strictly conforming) programs to make use of them. According to the published Rationale, the stated intention was that support for such "popular extensions" be regarded as a "quality of implementation" issue to be decided by the marketplace. Implementations that could support them at essentially zero cost would do so, but implementations where such support would be expensive would be free to support such constructs or not based upon their customers' needs.
If one is using a compiler that targets commonplace platforms, and is designed to process the widest range of useful programs reasonably efficiently, then the extended semantics surrounding pointer arithmetic may allow one to write code more efficiently than would otherwise be possible. If one is using a compiler that does not value compatibility with quality compilers, however, one should recognize that it may treat the Standard's allowance for quirky hardware as an invitation to behave nonsensically even on commonplace hardware. Of course, one should also be aware that such compilers may behave nonsensically in corner cases where adherence to the Standard would require them to forgo optimizations that are unsound but would "usually" be safe.
Upvotes: 2
Reputation: 170203
The C standard does not allow it.
6.5.6 Additive operators (emphasis mine)
8 When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i-n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
For the purposes of the above, a pointer to a single object is considered as pointing into an array of 1 element.
Now, ((uint8_t*)0) does not point at an element of an array object, simply because a pointer holding a null pointer value does not point at any object. This is stated in:
6.3.2.3 Pointers
3 If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.
So you can't do arithmetic on it. The warning is justified, because as the second highlighted sentence mentions, we are in the case of undefined behavior.
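As a sketch of how code can stay inside what 6.5.6 defines, the hypothetical helpers below (range_end and sum_ints are names assumed for illustration) form an end pointer only from a pointer into a real array, and handle the "no array" case without adding anything to a null pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns one past the last element of the range
   [start, start + count), or NULL when there is no array at all. */
int *range_end(int *start, size_t count)
{
    if (start == NULL) {
        assert(count == 0);   /* a null range must be empty */
        return NULL;          /* no arithmetic on the null pointer */
    }
    return start + count;     /* defined: at most one past the end */
}

/* Sums the range; comparing two null pointers for equality is well
   defined, so the empty case falls out of the loop condition. */
long sum_ints(int *start, size_t count)
{
    long total = 0;
    int *end = range_end(start, count);
    for (int *p = start; p != end; ++p)
        total += *p;
    return total;
}
```

This assumes that count never exceeds the number of elements actually available at start; the helper cannot check that on its own.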
Don't be fooled by the fact the offsetof macro is possibly implemented like that. The standard library is not bound by the constraints placed on user programs. It can employ deeper knowledge. But doing this in our code is not well defined.
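For user code, a hedged illustration of the alternative: use the offsetof macro from <stddef.h> rather than hand-rolling it with a null pointer. The struct packet_header type and the payload_of function are assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure used only for illustration. */
struct packet_header {
    unsigned char type;
    unsigned int  length;
    unsigned char payload[8];
};

/* The first member of a struct always sits at offset zero. */
static_assert(offsetof(struct packet_header, type) == 0,
              "first member must be at offset 0");

/* Returns the address of the payload member, computed from a pointer to
   a real object (h must not be null), so the arithmetic stays within
   the bytes of that object. */
unsigned char *payload_of(struct packet_header *h)
{
    return (unsigned char *)h + offsetof(struct packet_header, payload);
}
```

The library macro lets the implementation use whatever internal mechanism it likes, while the calling code only ever does arithmetic on a pointer to an existing object.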
Upvotes: 20