Yuxai

Reputation: 79

Can I replace the atomic with a volatile in a one-reader/one-writer queue?

Let us consider the following one-reader/one-writer queue implemented with a linked list.

// `item` is assumed to have an `item *next` member (definition omitted).
struct queue {
  queue() {
    tail = head = &reserved;
    n = 0;
  }
  void push(item *it) {
     tail->next = it;
     tail = it;
     n++;
  }
  item* pop() {
     while (n == used);
     ++used;
     head = head->next;
     return head;
  }

  item reserved;
  item *tail, *head;

  int used = 0;
  std::atomic<int> n;
};

Now I find that using volatile int n makes my writer run faster, but I am not sure whether it still guarantees that head = head->next always reads the correct value.

UPDATED: What if I add an atomic operation between tail->next = it and n++, i.e.,

void push(item *it) {
   tail->next = it;
   tail = it;
   a.store(0, std::memory_order_release);
   a.load(std::memory_order_acquire);
   n++;
}

where a is never accessed by the reader? Will this guarantee that tail->next = it is ordered before the reader's head = head->next? (It still runs faster than using an atomic n.)

Upvotes: 1

Views: 283

Answers (2)

mpoeter

Reputation: 2949

As already pointed out in several other comments, volatile is unrelated to multithreading, so it should not be used here. However, the reason volatile performs better than an atomic is simple: with a volatile, ++n translates to plain load, inc, and store instructions, while with an atomic it translates to the more expensive lock xadd (assuming you compile for x86).

But since this is only a single-reader single-writer queue, you don't need expensive read-modify-write operations:

struct queue {
  queue() {
    tail = head = &reserved;
    n = 0;
  }
  void push(item *it) {
     tail->next = it;
     tail = it;
     auto new_n = n.load(std::memory_order_relaxed) + 1;
     n.store(new_n, std::memory_order_release);
  }
  item* pop() {
     while (n.load(std::memory_order_acquire) == used);
     ++used;
     head = head->next;
     return head;
  }

  item reserved;
  item *tail, *head;

  int used = 0;
  std::atomic<int> n;
};

This should perform just as well as the volatile version. If the acquire-load in pop "sees" the value written by the store-release in push, the two operations synchronize-with each other, thereby establishing the required happens-before relation.

Upvotes: 0

findall

Reputation: 2193

The volatile keyword in C++ does not guarantee that reads and writes are ordered as they appear in the code in a multi-threaded environment. So if you strip the atomic wrapper from the counter and make it merely volatile, an increment of the counter observed by the consumer thread does not guarantee that item::next has also been updated.

To get the guarantee with maximum performance, I think you at least have to insert a write barrier between updating tail->next and the increment of the counter, e.g. via n.fetch_add(1, std::memory_order_release), and a read barrier just before the reader follows head->next, e.g. n.load(std::memory_order_acquire). I don't know the CPU-architecture-specific details, though.

Upvotes: 2
