Reputation: 253
I want to share a data struct between threads (gcc, Linux, x86). Let's say I have the following code in thread A:
shared_struct->a = 1;
shared_struct->b = 1;
shared_struct->enable = true;
Thread B is a periodic task that checks that struct first for the enable flag.
I think that the compiler can reorder the writes in thread A, so thread B can see inconsistent data. I am familiar with memory barriers on ARM, but how do I ensure write ordering on x86? Is there a better way than volatile?
I just want to set a consistent state in the struct, "flush" everything to memory and set an enable flag at the end.
Upvotes: 3
Views: 546
Reputation: 364039
If you only need to be able to set enable = true, then stdatomic.h with release / acquire ordering gives you exactly what you're asking for. (In x86 asm, normal stores/loads have release/acquire semantics, so yes, blocking compile-time reordering is sort of sufficient. But the right way to do that is with atomic, not volatile.)
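For example, a minimal sketch of that pattern (the struct layout and function names are just illustrative; only the flag is atomic, which is only safe because the writer never touches a and b again after setting enable):

#include <stdatomic.h>
#include <stdbool.h>

struct shared {
    int a;
    int b;
    atomic_bool enable;   /* only the flag needs to be atomic for this pattern */
};

/* Writer (thread A): write the plain members, then release-store the flag. */
void publish(struct shared *s)
{
    s->a = 1;
    s->b = 1;
    atomic_store_explicit(&s->enable, true, memory_order_release);
}

/* Reader (thread B): acquire-load the flag; if it is set, the earlier
   writes to a and b are guaranteed to be visible. */
bool try_read(struct shared *s, int *a, int *b)
{
    if (!atomic_load_explicit(&s->enable, memory_order_acquire))
        return false;
    *a = s->a;
    *b = s->b;
    return true;
}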
But if you want to be able to set enable = false to "lock out" readers again while you modify it, then you need a more complicated update pattern. Either re-invent a mutex manually with atomics (bad idea; use a standard library mutex instead), or do something that allows wait-free read-only access by multiple readers when no writer is in the middle of an update.
Either RCU or a seqlock could be good here.
For a seqlock, instead of an enable = true/false flag, you have a sequence number. A reader can detect a "torn" write by checking the sequence number before and then again after reading the other members. But then all the members have to be atomic, using at least mo_relaxed, otherwise it's data-race undefined behaviour just to read them in C, even if you discard the value. You also need sufficient ordering on the loads that check the counter: e.g. probably acquire on the first one, then acquire on the shared_struct->b load to make sure the 2nd load of the sequence number is ordered after it. (acquire is only a one-way barrier: an acquire load after a relaxed load wouldn't give you what you need.)
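A rough sketch of that reader/writer pattern, using fences instead of an acquire load on each data member (one of several valid formulations; it assumes a single writer, and the member names are made up):

#include <stdatomic.h>

struct seq_shared {
    atomic_uint seq;      /* even = stable, odd = writer in progress */
    atomic_int a;
    atomic_int b;
};

/* Single writer: mark the sequence odd, update, mark it even again. */
void seq_write(struct seq_shared *s, int a, int b)
{
    unsigned q = atomic_load_explicit(&s->seq, memory_order_relaxed);
    atomic_store_explicit(&s->seq, q + 1, memory_order_relaxed);  /* now odd */
    atomic_thread_fence(memory_order_release);    /* order the mark before the data */
    atomic_store_explicit(&s->a, a, memory_order_relaxed);
    atomic_store_explicit(&s->b, b, memory_order_relaxed);
    atomic_store_explicit(&s->seq, q + 2, memory_order_release);  /* even again */
}

/* Readers retry if the sequence number changed or was odd. */
void seq_read(struct seq_shared *s, int *a, int *b)
{
    unsigned q1, q2;
    do {
        q1 = atomic_load_explicit(&s->seq, memory_order_acquire);
        *a = atomic_load_explicit(&s->a, memory_order_relaxed);
        *b = atomic_load_explicit(&s->b, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);  /* data loads before the re-check */
        q2 = atomic_load_explicit(&s->seq, memory_order_relaxed);
    } while (q1 != q2 || (q1 & 1));                 /* torn or in-progress: retry */
}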
RCU makes the readers always completely wait-free; they just dereference a pointer to the currently-valid struct. Updates are as simple as atomically replacing a pointer. Recycling old structs is where it gets complicated: you have to be sure every reader thread is done reading a block of memory before you reuse it.
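A minimal sketch of just the pointer-swap part (reclamation deliberately omitted, since that is the hard part a real RCU library such as liburcu handles for you; the names here are illustrative):

#include <stdatomic.h>
#include <stdlib.h>

struct data { int a, b; };

static _Atomic(struct data *) current;   /* the currently-valid struct, or NULL */

/* Writer: build a fresh struct, then publish it with a release store. */
void writer_update(int a, int b)
{
    struct data *d = malloc(sizeof *d);
    if (!d)
        return;
    d->a = a;
    d->b = b;
    struct data *old = atomic_exchange_explicit(&current, d, memory_order_release);
    /* 'old' cannot simply be free()d here: a reader may still be using it.
       Deferring that reclamation safely is exactly what RCU is about. */
    (void)old;
}

/* Reader: wait-free; just dereference the current pointer. */
void reader_use(int *a, int *b)
{
    struct data *d = atomic_load_explicit(&current, memory_order_acquire);
    if (d) {
        *a = d->a;
        *b = d->b;
    }
}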
Simply setting enable = false before changing the other struct members doesn't stop a reader from seeing enable == true and then seeing inconsistent / partially-updated values for the other members while a writer is modifying them. If you don't need to do that, but only ever release new objects for access by other threads, then the sequence you describe is fine with atomic_store_explicit(&foo->enable, true, memory_order_release).
Upvotes: 2
Reputation: 1
You really should use a mutex (since you mention Pthreads), so add a pthread_mutex_t mtx; field inside shared_struct (don't forget to initialize it with pthread_mutex_init), then
pthread_mutex_lock(&shared_struct->mtx);
shared_struct->a = 1;
shared_struct->b = 1;
shared_struct->enable = true;
pthread_mutex_unlock(&shared_struct->mtx);
and similarly in any other code accessing that shared data.
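For example, the periodic reader (thread B) would do something along these lines (a sketch, assuming the same shared_struct):
pthread_mutex_lock(&shared_struct->mtx);
if (shared_struct->enable) {
    /* read shared_struct->a and shared_struct->b while holding the lock */
}
pthread_mutex_unlock(&shared_struct->mtx);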
You might also look into atomic operations, but in your case you are better off using a mutex as shown above.
Read a pthreads tutorial.
Avoid race conditions and undefined behavior.
how do I ensure write ordering
You don't, unless you are implementing a thread library yourself (and some parts of it should then be coded in assembler and use futex(7)), like the nptl(7) implementation of pthreads(7) in GNU glibc (or in musl-libc). You should use mutexes, and you don't want to waste your time implementing a thread library (so use the existing ones).
Notice that most C standard libraries on Linux (including glibc & musl-libc) are free software, so you can study their source code (if you are curious to understand how Pthread mutexes are implemented, etc).
the compiler can reorder the writes
It is not only (and not even mostly) the compiler, but also the hardware. Read about cache coherence. And the OS may also be involved (futex(2) is sometimes called by the pthread mutex routines).
Upvotes: 3