Reputation: 2590
I have two atomic ints and some other data. The code must execute in exactly this order: store begin_id (step 1), then write the other data (step 2), then store end_id (step 3). Step 2 must never be reordered (e.g. to before step 1).
Is the following code correct? If not, why not?
#include <stdatomic.h>
#include <stdint.h>

_Atomic uint32_t begin_id;
_Atomic uint32_t end_id;
int some_other_data = 0;

void bla()
{
    const uint32_t id = ...;
    atomic_store_explicit(&begin_id, id, memory_order_release);
    // ...writing some other data...
    some_other_data = f(x);
    atomic_store_explicit(&end_id, id, memory_order_release);
}
I'm not sure about the memory orderings.
Thanks
--- edit ---
Finally I changed the code to implement a seqlock (https://github.com/spotify/linux/blob/master/include/linux/seqlock.h). The main differences are that I don't use a loop when reading (so I only try once) and that I use atomic_thread_fence(memory_order_acquire) to ensure the data is only written after the sequence has been increased the first time. Also, no spinlock is needed.
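For reference, here is a minimal sketch of the kind of thing I mean, with the payload reduced to two relaxed-atomic words. The names and the exact fence placement follow the usual C11 seqlock recipe and are illustrative, not necessarily my final code:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static _Atomic uint32_t seq;          /* even = stable, odd = write in progress */
static _Atomic uint32_t word0, word1; /* placeholder payload, accessed with relaxed atomics */

/* Single writer: bump seq to odd, store the payload, bump seq back to even. */
void publish(uint32_t a, uint32_t b)
{
    uint32_t s = atomic_load_explicit(&seq, memory_order_relaxed);
    atomic_store_explicit(&seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);       /* pairs with the reader's acquire fence */
    atomic_store_explicit(&word0, a, memory_order_relaxed);
    atomic_store_explicit(&word1, b, memory_order_relaxed);
    atomic_store_explicit(&seq, s + 2, memory_order_release);
}

/* Single-attempt reader: false means "a write was in progress, try again later". */
bool try_read(uint32_t *a, uint32_t *b)
{
    uint32_t s0 = atomic_load_explicit(&seq, memory_order_acquire);
    if (s0 & 1u)
        return false;                                /* writer currently active */
    *a = atomic_load_explicit(&word0, memory_order_relaxed);
    *b = atomic_load_explicit(&word1, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);       /* order the payload loads before the re-check */
    return atomic_load_explicit(&seq, memory_order_relaxed) == s0;
}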
Upvotes: 2
Views: 129
Reputation: 180201
Expanding on my comments, your scheme appears to be to have the writer store begin_id, write the non-atomic data, and then store end_id, while the reader loads end_id and begin_id, reads the non-atomic data if they match, and re-checks begin_id afterward.
But this does not protect you from undefined behavior as a result of data races. Example:
- the reader checks end_id and begin_id, finds that they match, and begins reading the non-atomic data
- the writer updates begin_id, with this landing between the reader's first and second loads of begin_id, and begins updating the non-atomic data. There are now data races for all the items written by the writer and also read by the reader. The program's behavior is undefined, and that is not limited to (say) which values are / were observed by the reader.
- eventually, the reader checks begin_id again, but that's meaningless. The behavior of this check is undefined. The behavior of any conditional branching based on the result is undefined. All of this is true even with memory_order_seq_cst.
That is, your scheme does not work, in the sense that it does not prevent data races, and if a data race occurs then the behavior of the whole program is undefined. This is not merely academic. In actual practice, clever compilers sometimes optimize in ways that rely on all bets being off in the event of UB, for example by ignoring your reader's second read of begin_id, on the basis that it could differ from the first only if the program execution has UB.
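For concreteness, the reader side that the scenario above assumes might look something like this. The function name is made up, and the point is to show where the race lands, not to propose this code:

/* Reproduces the racy pattern under discussion -- this is NOT a fix. */
_Bool try_read_data(int *out)
{
    uint32_t e = atomic_load_explicit(&end_id, memory_order_acquire);
    uint32_t b = atomic_load_explicit(&begin_id, memory_order_acquire);

    if (b != e)
        return 0;              /* a write appears to be in progress: give up */

    *out = some_other_data;    /* data race (hence UB) if the writer starts now */

    /* re-checking begin_id cannot undo UB that has already occurred */
    return atomic_load_explicit(&begin_id, memory_order_acquire) == b;
}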
Since you seem to expect that it will be rare for reader and writer to conflict, you could consider implementing an atomic spinlock as a light(er)weight alternative to a mutex. That might look something like this:
#include <stdatomic.h>

static atomic_flag spinlock = ATOMIC_FLAG_INIT;

/*
 * Spin until the spinlock is successfully acquired.
 */
void spinlock_lock(void) {
    while (atomic_flag_test_and_set_explicit(&spinlock, memory_order_acq_rel)) {
        /* empty */
    }
}

/*
 * Try once to acquire the spinlock. Return 1 if successful or 0 if not.
 */
_Bool spinlock_trylock(void) {
    return !atomic_flag_test_and_set_explicit(&spinlock, memory_order_acq_rel);
}

/*
 * Unlock the spinlock.
 */
void spinlock_unlock(void) {
    atomic_flag_clear_explicit(&spinlock, memory_order_release);
}
The reader could use spinlock_trylock() to attempt to acquire the lock, but avoid waiting to do so. The writer would, presumably, use spinlock_lock() to wait for the reader to clear the critical region before writing. Both would release the lock when no longer needed via spinlock_unlock().
The memory ordering associated with the locking operations would have the additional effect of preventing data races on anything read or written while the lock was held, and the reader's efficiency would be improved by allowing it to skip the read altogether when the writer was in the process of updating the data.
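For illustration, usage might look something like this, assuming the spinlock functions above; the shared fields are placeholders, not names from your code:

static uint32_t shared_a;
static int      shared_b;

/* Writer: wait (briefly) for the reader to leave, then update in place. */
void writer_update(uint32_t a, int b)
{
    spinlock_lock();
    shared_a = a;
    shared_b = b;
    spinlock_unlock();
}

/* Reader: take one shot at a consistent snapshot; skip if the writer holds the lock. */
_Bool reader_try_snapshot(uint32_t *a, int *b)
{
    if (!spinlock_trylock())
        return 0;
    *a = shared_a;
    *b = shared_b;
    spinlock_unlock();
    return 1;
}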
If you need to avoid the writer blocking while the reader reads, then you can do so by double- (or triple-, or quadruple-, ...) buffering the data. Since there seem to be several items, you would define a structure type with a member for each one. You would then set up an atomic pointer that the writer keeps updated to point at the last-written version of the data. Reader and writer could then access the data concurrently, each working with its own copy. The writer would store the pointer with memory_order_release; the reader would load it with memory_order_acquire.
This is not enough by itself to avoid data races, but combining it with a spinlock (see above) associated with each copy of the data will do it. The advantage over a single spinlock alone is that you can minimize the chance that the writer ever has to wait to acquire a lock, and reduce the time it is likely to wait if it ever does.
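Sketching that out with two buffers and a per-copy flag inlined instead of the global spinlock above (the struct fields, names, and buffer count are illustrative):

struct snapshot {
    uint32_t    a;
    int         b;
    atomic_flag lock;                 /* per-copy spinlock */
};

static struct snapshot buffers[2] = {
    { .lock = ATOMIC_FLAG_INIT },
    { .lock = ATOMIC_FLAG_INIT },
};
static struct snapshot *_Atomic current = &buffers[0];

void writer_publish(uint32_t a, int b)
{
    /* Write into the copy that current does not point at. */
    struct snapshot *cur  = atomic_load_explicit(&current, memory_order_relaxed);
    struct snapshot *next = (cur == &buffers[0]) ? &buffers[1] : &buffers[0];

    while (atomic_flag_test_and_set_explicit(&next->lock, memory_order_acquire)) {
        /* a reader may still be using this copy; more buffers make this wait rarer */
    }
    next->a = a;
    next->b = b;
    atomic_flag_clear_explicit(&next->lock, memory_order_release);

    atomic_store_explicit(&current, next, memory_order_release);
}

_Bool reader_try_read(uint32_t *a, int *b)
{
    struct snapshot *snap = atomic_load_explicit(&current, memory_order_acquire);

    if (atomic_flag_test_and_set_explicit(&snap->lock, memory_order_acquire))
        return 0;                     /* this copy is busy: skip this attempt */
    *a = snap->a;
    *b = snap->b;
    atomic_flag_clear_explicit(&snap->lock, memory_order_release);
    return 1;
}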
Upvotes: 2