Reputation: 214780
I was having a discussion about whether using variables with indeterminate values leads to unspecified behavior rather than undefined behavior, as discussed here. This assumes that the variable has automatic storage duration, that its address is taken, and that trap representations do not apply.
In the specific case it was discussed what happens to ptr after free(ptr), in which case C17 6.2.4 applies:
The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.
I made this example:
#include <stdlib.h>
#include <stdio.h>

int main (void)
{
    int* ptr = malloc(sizeof *ptr);
    int* garbage;
    int*volatile* dummy = &garbage; // take the address
    free(ptr);

    puts("This should always print");
    fflush(stdout);

    if(ptr == garbage)
    {
        puts("Didn't see that one coming.");
    }
    else
    {
        puts("I expect this to happen");
    }

    puts("This should always print");
}
The argument I was making was that in theory, we can't know whether ptr == garbage is true or false, since both are indeterminate at that point. So the compiler need not even read those memory locations: since it can deduce that both pointers hold indeterminate values, it is free to evaluate the expression to either true or false as it pleases during optimization. (In practice most compilers probably don't do that.)
I tried the code on the x86_64 compilers gcc, icx and clang 14 with -std=c17 -pedantic-errors -Wall -Wextra -O3. In all cases I got the output:
This should always print
I expect this to happen
This should always print
However, in clang 15 specifically, I get:
This should always print
This should always print
This is followed by exit code 139 (segmentation fault).
https://godbolt.org/z/E6xTzc156
If I comment out the "This should always print"/fflush lines, clang 15 makes a dummy executable with the disassembly only consisting of a label:
main: # @main
This happens even though main() contains several side effects.
Question:
Why does clang 15 behave differently than older versions/other compilers? Does it implement trap representations for pointers on the x86_64 I was playing around with or something similar?
Assuming there are no trap representations, none of this code should contain undefined behavior.
EDIT
Regarding how indeterminate values that are not trap representations should be expected to (not) behave, this has been discussed at length in DR 260 and DR 451. The Committee wouldn't be having these long and detailed discussions if the whole thing was to be dismissed as "it is undefined behavior".
Upvotes: 10
Views: 350
Reputation: 158599
Why is clang doing this? Clang is treating the code as unreachable because it takes the read as undefined behavior. We can turn it into an explicit trap using -mllvm -trap-unreachable, and if we try that with your example, clang indeed generates a ud2 for us.
This is part of a larger discussion within the clang community, part of which you can see in the discussion of "Signed integer overflow causes program to skip the epilogue and fall into another function". That thread discusses this issue for the signed overflow case, and at the bottom there is a linked discussion about infinite loops without forward progress.
I sympathize with your frustration. WG14 has discussed the issue of indeterminate values, and there has been some discussion about softening the impact using things such as "wobbly values". The recent C++ proposal "Zero-initialize objects of automatic storage duration" has this to say:
The WG14 C Standards Committee has had extensive discussions about "wobbly values" and "wobbly bits", specifically around [DR451] and [N1793], summarized in [Seacord].
The C Standards Committee has not reached a conclusion for C23, and wobbly bits continue to wobble indeterminately.
So while this has been discussed many times, there is not yet consensus. If we read further into the article referenced in that quote, "Uninitialized Reads: Understanding the proposed revisions to the C language", it says:
According to the current WG14 Convener, David Keaton, reading an indeterminate value of any storage duration is implicit undefined behavior in C, and the description in Annex J.2 (which is non-normative) is incomplete. This revised definition of the undefined behavior might be stated as "The value of an object is read while it is indeterminate."
Unfortunately, there is no consensus in the committee or broader community concerning uninitialized reads.
So while there are a variety of ideas in this area, none has reached a conclusion yet.
There are folks working to improve the situation, but we are not there yet. There is also continued discussion within the compiler community about how aggressive we should be with various kinds of undefined behavior, but again there is no conclusion there either.
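Until that is settled, the practical way out is to never read a pointer whose value is indeterminate. Below is a minimal sketch of how the comparison from the question could be restructured so that both operands hold determinate values. The helper names free_and_clear and compare_after_free are hypothetical, made up for illustration, and this is only one possible approach under that assumption:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: frees the object *pp points to and then stores
   NULL in *pp, so the caller's pointer holds a determinate value again. */
void free_and_clear(int **pp)
{
    free(*pp);   /* free(NULL) is a no-op, so a failed malloc is fine too */
    *pp = NULL;
}

/* Hypothetical rework of the comparison in the question: neither
   operand of == is indeterminate when it is evaluated. */
bool compare_after_free(void)
{
    int *ptr = malloc(sizeof *ptr);
    int *other = NULL;      /* initialized, unlike 'garbage' in the question */

    free_and_clear(&ptr);   /* ptr is overwritten with NULL after the free */

    return ptr == other;    /* compares two null pointers */
}
```

Calling compare_after_free() from main compares two null pointers, so the result does not depend on how a compiler chooses to treat indeterminate values. It does not answer whether the original program has undefined behavior; it only sidesteps the question.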
Upvotes: 2