Reputation: 22552
I have a program which spawns multiple threads that may write the exact same value to the exact same memory location:
#include <cstddef>
#include <thread>
#include <vector>

int main() {
    std::vector<int> vec(32, 1); // initialize vec with 32 copies of 1
    std::vector<std::thread> threads;
    for (int i = 0; i < 8; ++i) {
        threads.emplace_back([&vec]() {
            for (std::size_t j = 0; j < vec.size(); ++j) {
                vec[j] = 0; // every thread writes the same value to the same slot
            }
        });
    }
    for (auto& thrd : threads) {
        thrd.join();
    }
}
In this simplified code, all the threads may try to write the exact same value to the same memory location in vec. Is this a data race likely to trigger undefined behavior, or is it safe since the values are never read before all the threads have been joined again?
If there is a potentially hazardous data race, will using a std::vector<std::atomic<int>> with std::memory_order_relaxed stores instead be enough to prevent the data races?
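Roughly, the variant I have in mind looks like this (just a sketch; only the element type and the stores change):

#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

int main() {
    // Same structure as above, but every element is a std::atomic<int>.
    std::vector<std::atomic<int>> vec(32);
    for (auto& v : vec) v.store(1, std::memory_order_relaxed);

    std::vector<std::thread> threads;
    for (int i = 0; i < 8; ++i) {
        threads.emplace_back([&vec]() {
            for (std::size_t j = 0; j < vec.size(); ++j) {
                vec[j].store(0, std::memory_order_relaxed); // relaxed atomic store
            }
        });
    }
    for (auto& thrd : threads) {
        thrd.join();
    }
}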
Upvotes: 6
Views: 1099
Reputation: 40594
Implementation detail answer:
While the language standard classifies this as undefined behavior, you can actually feel quite safe as long as you are really writing the same data.
Why? The hardware serializes accesses to the same memory cell. The only thing that can go wrong is when several memory cells are written at the same time, because the hardware gives you no guarantee that the accesses to the different cells are serialized in the same way. For example, if one thread writes 0x0000000000000000 and another writes 0xffffffffffffffff, your hardware may decide to serialize the accesses to the different bytes differently, resulting in something like 0x00000000ffffffff.
However, if the data written by both threads is the same, there is no observable difference between the possible serializations: the result is deterministic.
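A sketch of the hazardous case (my own illustration, not the answer's byte-level example; the program deliberately contains a data race, so the standard calls it UB, and it exists only to show why differing data can end up mixed):

#include <cstdint>
#include <thread>

// Two 64-bit halves written by separate stores: the per-location orderings
// need not agree when the written values differ.
struct Wide {
    std::uint64_t lo;
    std::uint64_t hi;
};

Wide shared{0, 0};

int main() {
    std::thread a([] { shared = Wide{0x0000000000000000, 0x0000000000000000}; });
    std::thread b([] { shared = Wide{0xffffffffffffffff, 0xffffffffffffffff}; });
    a.join();
    b.join();
    // shared may now hold a mix such as {lo = 0, hi = 0xffffffffffffffff},
    // because each half was serialized independently. Had both threads written
    // identical data, no such mix would be distinguishable from either outcome.
}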
Modern hardware does not handle memory accesses in a byte-by-byte fashion; instead, CPUs communicate with main memory in terms of cache lines, and cores can usually communicate with their caches in terms of 8-byte words. As such, setting a properly aligned pointer is an atomic operation which can be relied on to implement lock-free algorithms. This has been exploited in the Linux kernel before more powerful atomic operations became available. C++ formalizes this in the form of the atomic<> types, adding support for higher-level hardware features such as read-modify-write operations, atomic increments, and the like.
But, of course, if you rely on such hardware details, you really should know what you are doing before you do it. Otherwise, stick to language features like the atomic<> types to ensure proper operation and avoid UB.
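A minimal sketch of the pointer-publication idea through std::atomic<> (the names and the Node type here are only illustrative):

#include <atomic>
#include <cassert>

struct Node { int payload; };

std::atomic<Node*> head{nullptr};

// On mainstream platforms this compiles down to a single aligned store, which
// is exactly the hardware property described above; std::atomic<> just makes
// it portable and UB-free.
void publish(Node* n) {
    assert(head.is_lock_free()); // normally true for pointer-sized atomics
    head.store(n, std::memory_order_release);
}

Node* peek() {
    return head.load(std::memory_order_acquire);
}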
@Downvoters:
The question is not tagged [language-lawyer], and the answer explicitly states "Implementation detail answer". It intentionally explains what the UB in the program will look like in real life, and was written to complement the accepted answer (which has my upvote) with a different perspective on the question.
Upvotes: 1
Reputation: 9991
It is a data race, and compilers will eventually become smart enough to miscompile the code if they are not already. See How to miscompile programs with "benign" data races, section 2.4, for why even writes of the same value can break the code.
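One illustration of the general idea (my own sketch, not the paper's example, assuming an optimizer that treats vec[j] as unshared): the two functions below are equivalent for a single thread, so a compiler is allowed to rewrite the first into the second, spilling a live value into the slot it is about to overwrite anyway.

#include <cstddef>
#include <vector>

int other_work() { return 1; }  // stands in for code that needs many registers
void consume(int) {}            // stands in for a later use of the live value

void store_zero(std::vector<int>& vec, std::size_t j, int live) {
    int t = other_work();
    vec[j] = 0;
    consume(live + t);
}

// A rewrite that is legal if no other thread touches vec[j]:
void store_zero_spilled(std::vector<int>& vec, std::size_t j, int live) {
    vec[j] = live;          // slot reused as spill space: a transient value
    int t = other_work();   // ...that other threads can observe...
    live = vec[j];          // ...or overwrite before this reload
    vec[j] = 0;
    consume(live + t);
}

If a racing vec[j] = 0 from another thread lands between the spill and the reload, live is silently corrupted, even though every source-level write stores 0.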
Upvotes: 4
Reputation: 39101
Language-lawyer answer, [intro.multithread] n3485
21 The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior.
4 Two expression evaluations conflict if one of them modifies a memory location and the other one accesses or modifies the same memory location.
will using a std::vector<std::atomic<int>> with std::memory_order_relaxed stores instead be enough to prevent the data races?
Yes. Those accesses are atomic, and there's a happens-before relationship introduced via the joining of the threads. Any subsequent read from the thread spawning those workers (which is synchronized via .join()) is safe and defined.
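A minimal sketch of that reasoning, reduced to a single cell (the numbers in the comments mark the chain of guarantees):

#include <atomic>
#include <cassert>
#include <thread>

int main() {
    std::atomic<int> cell{1};

    std::thread worker([&cell] {
        cell.store(0, std::memory_order_relaxed); // (1) atomic store: no data race
    });

    worker.join(); // (2) the worker's completion synchronizes with join() returning

    // (3) hence (1) happens before this read, which must therefore observe 0,
    // even though the store used memory_order_relaxed.
    assert(cell.load(std::memory_order_relaxed) == 0);
}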
Upvotes: 5