Reputation: 29
Taking the following piece of code
struct SharedMemory
{
std::atomic<float> value;
bool shouldReturnTrue()
{
float tmp = value.load(std::memory_order_relaxed);
float a = tmp;
float b = tmp;
return a == b;
}
};
Is it legal for the compiler to optimize it as follow and so break the thread safety ?
struct SharedMemory
{
std::atomic<float> value;
bool shouldReturnTrue()
{
float a = value.load(std::memory_order_relaxed);
float b = value.load(std::memory_order_relaxed);
return a == b;
}
};
If so, should I force it to not optimize tmp
using volatile
keyword as follow ?
struct SharedMemory
{
std::atomic<float> value;
bool shouldReturnTrue()
{
volatile float tmp = value.load(std::memory_order_relaxed);
float a = tmp;
float b = tmp;
return a == b;
}
};
Upvotes: 1
Views: 63
Reputation: 365277
No, that wouldn't be a legal optimization. As you say, it could return false for cases other than tmp
being NaN, if there was a writer. Therefore it wouldn't be valid per the as-if rule, because the C++ abstract machine does always return !isnan(tmp)
.
(If you'd picked a type like int
where self == self
was always true, it would be easier to talk about!)
The transformation you ask about is only legal for non-atomic variables, precisely because the value can be assumed not to be changing. If it is, that would be data-race undefined behaviour, and compilers are allowed to assume no UB. Thus they can invent another non-atomic non-volatile load if that's more convenient for register allocation.
But an atomic
variable does have to be assumed to change value between any two reads (in this way similar to volatile
in terms of what compilers have to assume about it), because that is well-defined behaviour that compilers have to respect.
So the compiler will never invent reads of atomic<>
objects. (That linked article is about Linux kernel code, where they use volatile
for hand-rolled atomics on compilers that are known to work the way they want, for the same reasons we use std::atomic<>
in modern C++.)
memory_order_relaxed
means that the ordering wrt. surrounding memory accesses is relaxed, not the atomicity guarantees. In this respect, relaxed
isn't different from acquire
or seq_cst
. I guess you had some misconception that relaxed
might be relevant here, but it isn't.
You're right that assigning the load result to volatile float tmp
would make it less likely for a compiler to want to invent loads, but if that was a real problem (it isn't), a better way would be to use volatile atomic<float> value;
. Again, this is not necessary. Especially now, since compilers currently don't optimize atomics at all (Why don't compilers merge redundant std::atomic writes?) but even in future with compilers that only give you the bare minimum ISO C++ standard and do optimize atomics by e.g. coalescing redundant back-to-back reads into one, they won't be allowed to do the other thing and invent extra writes, or invent extra reads and assume the value is the same.
Upvotes: 1