Reputation: 3262
I am trying to simulate a video card (producer thread) and a monitor (consumer thread) for educational purposes, to figure out what is going on. Here is the technical task description:
The producer thread produces frame pixel data at 1000 fps. The consumer thread runs at 60 fps, and for every frame it must have access to the last produced frame for at least 1/60th of a second. Each frame is represented by some int*, for simplicity.
My solution is to have an array of 2 pointers: one for the producer, one for the consumer. In addition there is a free, unused pointer, which is not owned by the consumer or the producer at any given moment.
#define Producer 0
#define Consumer 1
int* usedPointers[2];
std::atomic<int*> freePointer;
The producer always writes frame pixels to usedPointers[Producer], then does usedPointers[Producer] = freePointer.exchange(usedPointers[Producer], memorySemanticsProducer); so that the last completely produced frame is now pointed to by freePointer, and the producer is free to write a new frame without destroying the last complete frame.
The consumer does usedPointers[Consumer] = freePointer.exchange(usedPointers[Consumer], memorySemanticsConsumer); so that it owns the latest frame data, and is then free to access usedPointers[Consumer] for as long as it wants.
Correct me if I am wrong.
What I am missing is what memorySemanticsXXX should be. There are descriptions, but I cannot figure out which one exactly I should use in each thread, and why. So I am asking for some hints on that.
Upvotes: 2
Views: 2015
Reputation: 3262
Here is the answer I would have given myself back then, if I had a time machine:
The choice of memory order depends on your specific use case and requirements. You should use the weakest memory order that still ensures the correctness and consistency of your algorithm. Typical choices include:
std::memory_order_seq_cst if you want to avoid any surprises or complications, but you may pay a performance penalty for it.
std::memory_order_relaxed if you don't care about ordering or synchronization at all, but you may introduce subtle bugs or race conditions if it turns out that you do.
std::memory_order_acquire and std::memory_order_release if you want to implement a producer-consumer pattern, where one thread writes data and releases it, and another thread acquires it and reads the data.
std::memory_order_acq_rel if you want to implement a read-modify-write operation, where one thread reads data, modifies it, and writes it back atomically, and that step needs both acquire and release ordering.
Based on your description, you want to implement a producer-consumer pattern, where the producer thread writes frame pixels to a buffer and the consumer thread reads them. You also want to avoid the performance penalty of sequential consistency by using weaker memory orders. In that case, you can use std::memory_order_release for memorySemanticsProducer and std::memory_order_acquire for memorySemanticsConsumer. This way, the producer's writes to the buffer cannot be reordered after the exchange that releases the pointer, and the consumer's reads from the buffer cannot be reordered before the exchange that acquires it, so the consumer sees a completely written frame. One caveat: each exchange() also hands a buffer back in the other direction (the consumer returns the frame it has finished reading, and the producer later overwrites it), and release on one side with acquire on the other does not order that hand-back. Using std::memory_order_acq_rel for both exchanges covers both directions and is the safer choice.
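To make the hand-over concrete, here is a minimal sketch of the three-pointer scheme with both exchanges using std::memory_order_acq_rel. The names run_triple_buffer_demo and kPixels, the tiny frame size, and the "all pixels equal" consistency check are assumptions made for illustration only; they are not part of the question's code.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

constexpr int kPixels = 64;  // hypothetical, deliberately tiny frame size

// Returns true if the consumer never observed a torn (half-written) frame.
bool run_triple_buffer_demo(int frames) {
    std::vector<int> a(kPixels, 0), b(kPixels, 0), c(kPixels, 0);
    std::atomic<int*> freePointer{c.data()};
    std::atomic<bool> done{false};
    bool consistent = true;  // written only by the consumer, read after join

    std::thread producer([&] {
        int* buf = a.data();
        for (int n = 1; n <= frames; ++n) {
            for (int i = 0; i < kPixels; ++i) buf[i] = n;  // "render" frame n
            // release: publish the finished frame.
            // acquire: order our next writes after the consumer's last reads
            // of whatever buffer we get back.
            buf = freePointer.exchange(buf, std::memory_order_acq_rel);
        }
        done.store(true, std::memory_order_release);
    });

    std::thread consumer([&] {
        int* buf = b.data();
        while (!done.load(std::memory_order_acquire)) {
            // acquire: see the complete frame; release: hand back the old one.
            buf = freePointer.exchange(buf, std::memory_order_acq_rel);
            for (int i = 1; i < kPixels; ++i)
                if (buf[i] != buf[0]) consistent = false;  // torn frame
        }
    });

    producer.join();
    consumer.join();
    return consistent;
}
```

Note that the consumer can receive a stale frame if it swaps twice without the producer publishing in between; the sketch only checks that every frame it reads is complete.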
Here is a possible demo snippet of your code using these memory orders:
#include <atomic>
#include <thread>

#define Producer 0
#define Consumer 1
int* usedPointers[2];
std::atomic<int*> freePointer;

// Placeholder allocator for one frame; the frame size is an arbitrary assumption.
int* init_buf() { return new int[640 * 480](); }

void threadProducer() {
    while (true) {
        // write frame pixels to usedPointers[Producer]
        // ...
        // release the pointer and exchange it with freePointer
        usedPointers[Producer] = freePointer.exchange(usedPointers[Producer], std::memory_order_release);
    }
}

void threadConsumer() {
    while (true) {
        // acquire the pointer and exchange it with freePointer
        usedPointers[Consumer] = freePointer.exchange(usedPointers[Consumer], std::memory_order_acquire);
        // read frame pixels from usedPointers[Consumer]
        // ...
    }
}

int main() {
    // initialize usedPointers and freePointer
    usedPointers[Producer] = init_buf();
    usedPointers[Consumer] = init_buf();
    freePointer.store(init_buf());
    // create producer and consumer threads
    std::thread t1(threadProducer);
    std::thread t2(threadConsumer);
    // join threads (both loops run forever, so this blocks)
    t1.join();
    t2.join();
    return 0;
}
Moreover, instead of using a global array of pointers, it may be better (it may or may not be easier for the compiler and caching mechanisms) to define the producer's and the consumer's buffer as local variables right inside the corresponding thread functions. That may make the buffer initialization code a bit less tidy, since it ends up scattered across functions, which may be unwanted if you plan to place the functions in different files:
void threadConsumer() {
auto buffer = init_buf();
while(true) {
// acquire the consumer buffer pointer and exchange it with freePointer
buffer = freePointer.exchange(buffer, std::memory_order_acquire);
...
That leads to a better understanding: the shared, global state of a triple buffer consists of only one atomic pointer (the shared buffer) plus a mechanism that knows how to initialize it. The other two buffers can be fully thread-local.
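As a rough sketch of that idea, the shared state can be wrapped in a small class that holds nothing but the atomic pointer. FrameSlot and its swap() method are hypothetical names introduced here, and the use of std::memory_order_acq_rel for every swap is a conservative assumption of this sketch rather than something the snippets above prescribe.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical wrapper: the only state shared between threads is one pointer.
class FrameSlot {
public:
    explicit FrameSlot(int* initial) : free_(initial) {}

    // Hand over the caller's buffer and receive the one stored in the slot.
    // Each thread keeps the returned pointer in a local variable.
    int* swap(int* mine) {
        return free_.exchange(mine, std::memory_order_acq_rel);
    }

private:
    std::atomic<int*> free_;
};
```

A thread function would then start with its own local buffer and repeatedly call something like buffer = slot.swap(buffer); so no global array of pointers is needed.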
Upvotes: 2
Reputation: 1993
The memorySemanticsXXX you're mentioning are about the rest of your code surrounding the exchange() lines. The default behavior for std::atomic::exchange() is that memory_order_seq_cst is used (this is the second parameter of exchange(), which you're not using).
This means three things at the same time:
All of the code you wrote before the exchange() call is guaranteed to execute before that call (otherwise compiler optimizations can reorder your code), and the results of that execution will be visible in all other threads (CPU cache propagation) before the exchange() call is made.
The same as the previous point, but for the code you wrote after your exchange() line.
All of the code before and after the exchange() call is executed in the exact order you wrote it (including other atomic operations).
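The first two guarantees can be illustrated with a small message-passing sketch in which the writer asks only for the "everything before" guarantee (release) and the reader only for the "everything after" guarantee (acquire). The function name message_passing_demo and the variables payload, ready and seen are made up for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Returns the value the reader thread observed in the plain variable.
int message_passing_demo() {
    int payload = 0;                  // ordinary, non-atomic data
    std::atomic<bool> ready{false};   // the flag that publishes it
    int seen = -1;

    std::thread writer([&] {
        payload = 42;                                  // written before the flag
        ready.store(true, std::memory_order_release);  // cannot move above this
    });

    std::thread reader([&] {
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();                 // wait for the flag
        }
        seen = payload;  // cannot move before the acquire load, so it sees 42
    });

    writer.join();
    reader.join();
    return seen;
}
```

With std::memory_order_relaxed on both sides the flag would still be delivered eventually, but nothing would tie the write to payload to it, and reading payload would be a data race.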
So, the whole point is that you may choose to drop one, two or all three of these restrictions, which can bring you speed improvements. You shouldn't bother with this unless you have a performance bottleneck. If there's no bottleneck, then just use std::atomic without the second parameter (it will take the default value).
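As a small illustration, the two helpers below do the same thing; the first relies on the default argument and the second spells it out. The function names swap_default and swap_explicit are invented for this sketch.

```cpp
#include <atomic>
#include <cassert>

// Uses the default memory order of exchange().
int* swap_default(std::atomic<int*>& slot, int* mine) {
    return slot.exchange(mine);
}

// Equivalent call with the default written out explicitly.
int* swap_explicit(std::atomic<int*>& slot, int* mine) {
    return slot.exchange(mine, std::memory_order_seq_cst);
}
```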
In case you don't use all three restrictions you have to be really careful writing your code otherwise it can unpredictably crash.
Read more about it here: Memory order
Upvotes: 2