Does gcc treat relaxed atomic operation as a Compiler-fence?

Question

I have the following code, compiled with GCC 8.3 on x86-64 Linux:

// file: inc.cc
#include <atomic>

int inc_value(int* x) {
  (*x)++;
  //std::atomic<int> ww;
  //ww.load(std::memory_order_relaxed);
  (*x)++;
  return *x;
}

With -O3, this generates the following assembly for the two increments:

 g++ -S inc.cc -o inc.s -O3
 addl  $2, %eax

After I enable the atomic variable (by uncommenting the two lines), I get:

  addl  $1, (%rdi)  
  movl  -4(%rsp), %eax
  movl  (%rdi), %eax
  addl  $1, %eax
  movl  %eax, (%rdi)

It seems that the relaxed atomic load works like a compiler fence (just like asm volatile("":::"memory");), so GCC cannot reorder or merge the instructions around it.
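To make the comparison concrete, here is a minimal sketch that puts the three variants side by side so their assembly can be compared with g++ -O3 -S. The function names and the global g_flag are hypothetical (they are not in the code above), and the asm statement is GNU-specific syntax. What each function compiles to depends on the compiler version, so no particular output is claimed here.

```cpp
#include <atomic>

// Hypothetical global, used only for this illustration.
std::atomic<int> g_flag{0};

// Variant 1: nothing between the two increments.
int inc_plain(int* x) {
  (*x)++;
  (*x)++;
  return *x;
}

// Variant 2: explicit compiler barrier (GNU extension) between the increments.
int inc_with_barrier(int* x) {
  (*x)++;
  asm volatile("" ::: "memory");
  (*x)++;
  return *x;
}

// Variant 3: relaxed atomic load between the increments.
int inc_with_relaxed_load(int* x) {
  (*x)++;
  (void)g_flag.load(std::memory_order_relaxed);  // result intentionally unused
  (*x)++;
  return *x;
}
```

All three functions return the same value for the same input; only the generated code may differ.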

What I already know:

cppreference says a relaxed operation imposes no memory-ordering guarantees on the surrounding operations, so the compiler is allowed to reorder around it.
x86-64 has a strong TSO memory model; atomic loads and stores don't generate lock/xchg/mfence CPU-fence instructions and only act as a compiler fence (except for seq_cst).
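The second point can be checked by compiling one store per memory order and reading the output. This is a sketch with hypothetical names (g_val, store_*, add_relaxed). On x86-64, GCC typically compiles the relaxed and release stores to a plain mov and the seq_cst store to an xchg (or a mov followed by mfence), but the exact instructions vary by compiler version. Atomic read-modify-write operations such as fetch_add use a lock-prefixed instruction on x86-64 whatever order is requested.

```cpp
#include <atomic>

// Hypothetical global, used only for this illustration.
std::atomic<int> g_val{0};

// One store per memory order, so the assembly of each can be compared.
void store_relaxed(int v) { g_val.store(v, std::memory_order_relaxed); }
void store_release(int v) { g_val.store(v, std::memory_order_release); }
void store_seq_cst(int v) { g_val.store(v, std::memory_order_seq_cst); }

// A relaxed read-modify-write; returns the value held before the addition.
int add_relaxed(int v) { return g_val.fetch_add(v, std::memory_order_relaxed); }
```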

Given the above, GCC should be able to reorder and optimize the code down to add $2, %eax, as if the relaxed operation had no effect. But the result shows that GCC treats the relaxed load as a compiler fence and stops any reordering. So I have the following questions:

On x86-64, does GCC always generate a full compiler fence for an atomic operation, even a relaxed one? Also, GCC generates a mov to memory instead of a mov to a register (it doesn't cache the temporary value in a register). Does this 'atomic compiler fence' also imply memory side effects to GCC, so that it has to store and reload values from memory around the fence?
If so, on x86-64, is it enough to use only two orders, relaxed and seq_cst? Since x86-64 provides TSO guarantees and a relaxed operation is treated as a full compiler fence, could relaxed replace release/acquire/consume?
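For reference, the pattern that release/acquire exists for looks like the sketch below. The names payload, ready, publish and try_consume are hypothetical. The C++ memory model only guarantees that the reader sees the payload when the flag store is a release and the flag load is an acquire. With relaxed on both sides the standard gives no such guarantee, whatever a particular GCC version happens to emit on x86-64.

```cpp
#include <atomic>

// Hypothetical shared state for a one-shot message.
int payload = 0;
std::atomic<bool> ready{false};

// Writer side: fill in the payload, then publish it with a release store.
void publish(int v) {
  payload = v;
  ready.store(true, std::memory_order_release);
}

// Reader side: an acquire load that observes true makes the payload write visible.
bool try_consume(int& out) {
  if (!ready.load(std::memory_order_acquire)) return false;
  out = payload;
  return true;
}
```

In real use, one thread would call publish once and another would poll try_consume until it returns true.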
