knivil

Reputation: 805

Which memory barriers do I need to make the writes to image in thread A visible in thread B?

Where do I need to put memory barriers so that the writes to image in thread A are visible in thread B? The candidate spots are marked in the pseudo code example and are derived from this question/answer.

There is currently a discussion in our team about changing the blocking wait into a lock-free busy wait. Therefore the mutex should be ignored as a memory barrier.

Global:

_Atomic int request = 0;
_Atomic int reply = 0;
char image[640 * 640];

mtx_t mu;
cnd_t cv;

Thread A:

int last_req = 0;
int curr_req = request;

if (curr_req != last_req)
{
    last_req = curr_req;
    char* buff = GetBufferFromCamera(.....);
    memcpy(image, buff, sizeof(image));
    atomic_thread_fence(memory_order_release);  // Okay?

    mtx_lock(&mu);
      reply = curr_req;
      cnd_signal(&cv);
    mtx_unlock(&mu);
}


Thread B:

int ticket = atomic_fetch_add(&request, 1);
ticket++;
mtx_lock(&mu);
  while (ticket != reply)
     cnd_wait(&cv, &mu);
mtx_unlock(&mu);

atomic_thread_fence(memory_order_acquire); // Okay?

// I want to use image

As a side note: I am on ARMv8.

Upvotes: 0

Views: 84

Answers (1)

John Bollinger

Reputation: 180048

STL;DR: probably none.

TL;DR: unlocking a mutex orders all of a thread's previous actions relative to subsequent locking of the same mutex, not just those actions performed while the mutex was locked.

There is currently a discussion in our team about changing the blocking wait into a lock-free busy wait. Therefore the mutex should be ignored as a memory barrier.

All conflicting accesses to shared memory need to be ordered relative to each other. If there are in fact any such accesses by different threads, then that requires some form of mutual exclusion. Fences are not usually sufficient, and separate fences are not usually necessary. Generally, the appropriate memory-ordering can and should be integrated into the mutual-exclusion mechanism.
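If the team does move to a busy wait, one way to fold the ordering into the handoff itself, rather than adding standalone fences, is a release store to reply paired with an acquire load of reply. The following is only a minimal sketch under assumptions that are not in the question: a single producer and a single consumer, a hypothetical fill_image() standing in for GetBufferFromCamera() plus memcpy(), a fixed number of rounds, and a hypothetical run_spin_handoff() driver that plays the role of thread B.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>
#include <threads.h>

#define IMAGE_SIZE (640 * 640)
#define ROUNDS 3                 /* assumption: fixed number of requests */

static _Atomic int request = 0;
static _Atomic int reply = 0;
static char image[IMAGE_SIZE];

/* Hypothetical stand-in for GetBufferFromCamera() + memcpy(). */
static void fill_image(int req)
{
    memset(image, req, sizeof image);
}

/* Thread A: spins until request changes, then publishes a new image. */
static int thread_a(void *arg)
{
    (void)arg;
    int last_req = 0;
    while (last_req < ROUNDS) {
        int curr_req = atomic_load(&request);   /* seq_cst load */
        if (curr_req == last_req) {
            thrd_yield();
            continue;
        }
        last_req = curr_req;
        fill_image(curr_req);                    /* plain writes to image */
        /* Release store: the image writes above become visible to any
           thread whose acquire load observes this value of reply. */
        atomic_store_explicit(&reply, curr_req, memory_order_release);
    }
    return 0;
}

/* Thread B (run on the calling thread): returns 1 if every image it
   read matched its ticket, 0 otherwise. Intended to be called once. */
int run_spin_handoff(void)
{
    thrd_t a;
    int rc = thrd_create(&a, thread_a, NULL);
    assert(rc == thrd_success);
    (void)rc;

    int ok = 1;
    for (int i = 0; i < ROUNDS; i++) {
        int ticket = atomic_fetch_add(&request, 1) + 1;
        /* Acquire load pairs with the release store in thread A. */
        while (atomic_load_explicit(&reply, memory_order_acquire) != ticket)
            thrd_yield();
        if (image[0] != (char)ticket || image[IMAGE_SIZE - 1] != (char)ticket)
            ok = 0;
    }
    thrd_join(a, NULL);
    return ok;
}
```

In this sketch thread B only issues its next request after it has finished reading image, so the sequentially consistent accesses to request also order B's reads before A's next overwrite. With several consumers, or a consumer that keeps reading while a new request is in flight, that assumption no longer holds.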

The code presented in the question seems a bit incomplete. In particular, I don't see the "busy wait" part of your "lock free busy wait". I suppose the idea is that some or all of the Thread A code presented would spin inside a loop, and when it observes request to have been incremented, it fetches a new image and writes that into image.

I guess your concern is about image being updated by thread A without the mutex locked. That concern is probably unwarranted:

  1. The read of _Atomic variable request has sequentially consistent memory semantics. Supposing the combination of that with the if condition is effective for providing the needed mutual exclusion for access to image, that also orders thread A's writes to image relative to other threads' previous accesses to image.

  2. Unlocking the mutex has release (or stronger) memory ordering semantics with respect to the mutex, so it orders A's writes to image (among other things) relative to accesses by other threads that subsequently lock the mutex (which operation has acquire semantics or stronger).

  3. That includes Thread B's proposed accesses following the code presented. Note here in particular that cnd_wait() releases the mutex before blocking, and re-locks it before returning. Thus, either the initial mutex lock or the final re-lock will get you the memory ordering you need for image.

Thus, for the purposes you're asking about, and as far as I can determine from only the code you have presented, no explicit fences are needed in either thread.
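For illustration, here is a minimal, self-contained sketch of the mutex-based handoff with both fences removed. Several pieces are assumptions added only to make it runnable: fill_image() is a hypothetical stand-in for GetBufferFromCamera() plus memcpy(), thread A polls request in a loop for a fixed number of rounds, and the hypothetical run_mutex_handoff() driver plays the role of thread B.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>
#include <threads.h>

#define IMAGE_SIZE (640 * 640)
#define ROUNDS 3                 /* assumption: fixed number of requests */

static _Atomic int request = 0;
static _Atomic int reply = 0;    /* kept _Atomic to mirror the question */
static char image[IMAGE_SIZE];
static mtx_t mu;
static cnd_t cv;

/* Hypothetical stand-in for GetBufferFromCamera() + memcpy(). */
static void fill_image(int req)
{
    memset(image, req, sizeof image);
}

/* Thread A: writes image without holding mu and without a fence. */
static int thread_a(void *arg)
{
    (void)arg;
    int last_req = 0;
    while (last_req < ROUNDS) {
        int curr_req = request;          /* seq_cst atomic load */
        if (curr_req == last_req) {
            thrd_yield();
            continue;
        }
        last_req = curr_req;
        fill_image(curr_req);

        mtx_lock(&mu);
        reply = curr_req;
        cnd_signal(&cv);
        mtx_unlock(&mu);   /* release: also covers the image writes above */
    }
    return 0;
}

/* Thread B (run on the calling thread): returns 1 if every image it
   read matched its ticket, 0 otherwise. Intended to be called once. */
int run_mutex_handoff(void)
{
    int rc = mtx_init(&mu, mtx_plain);
    assert(rc == thrd_success);
    rc = cnd_init(&cv);
    assert(rc == thrd_success);
    thrd_t a;
    rc = thrd_create(&a, thread_a, NULL);
    assert(rc == thrd_success);
    (void)rc;

    int ok = 1;
    for (int i = 0; i < ROUNDS; i++) {
        int ticket = atomic_fetch_add(&request, 1) + 1;
        mtx_lock(&mu);     /* acquire, as is the re-lock inside cnd_wait */
        while (ticket != reply)
            cnd_wait(&cv, &mu);
        mtx_unlock(&mu);
        /* No fence here: image is read after the lock acquired above. */
        if (image[0] != (char)ticket || image[IMAGE_SIZE - 1] != (char)ticket)
            ok = 0;
    }
    thrd_join(a, NULL);
    cnd_destroy(&cv);
    mtx_destroy(&mu);
    return ok;
}
```

The sketch relies on the same reasoning as the numbered points: thread A's unlock follows its writes to image, and thread B only reads image after a lock of the same mutex that observed the matching reply.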

Upvotes: 0
