peterh

Reputation: 1

gcc, __atomic_exchange seems to produce non-atomic asm, why?

I am working on a tool that requires an atomic swap of two different 64-bit values. On the amd64 architecture this is possible with the XCHGQ instruction (see the documentation; warning: it is a long PDF).

Correspondingly, gcc has some atomic builtins that should ideally do the same, as described in the gcc documentation.

Using these two documents, I wrote the following simple C function for atomically swapping two 64-bit values:

void theExchange(u64* a, u64* b) {
  __atomic_exchange(a, b, b, __ATOMIC_SEQ_CST);
}

(By the way, it wasn't really clear to me why an "atomic exchange" needs three operands.)

It seemed a little fishy to me that the gcc __atomic_exchange builtin takes three operands, so I checked its asm output. I compiled this with gcc -O6 -masm=intel -S and got the following output:

.LHOTB0:
        .p2align 4,,15
        .globl  theExchange
        .type   theExchange, @function
theExchange:
.LFB16:
        .cfi_startproc
        mov     rax, QWORD PTR [rsi]
        xchg    rax, QWORD PTR [rdi] /* WTF? */
        mov     QWORD PTR [rsi], rax
        ret
        .cfi_endproc
.LFE16:
        .size   theExchange, .-theExchange
        .section        .text.unlikely

As we can see, the resulting function contains not a single data move but three separate ones. So, as far as I understand this asm code, the function is not really atomic.

How is this possible? Did I misunderstand the docs? I admit that the gcc builtin documentation wasn't really clear to me.

Upvotes: 0

Views: 2123

Answers (1)

Jester

Reputation: 58762

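You are using __atomic_exchange (type *ptr, type *val, type *ret, int memorder), which is the generic version of __atomic_exchange_n (type *ptr, type val, int memorder). In both, only the exchange operation on ptr is atomic; the reading of val is not. In the generic version, val (and the result, ret) is accessed via a pointer, but the atomicity still does not apply to it. That accounts for the three operands and for the assembly you see: the two plain mov instructions load *val and store *ret, while the xchg with a memory operand is the atomic part. Pointers are used so that the builtin works with multiple sizes, including when the compiler has to call an external helper: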

The four non-arithmetic functions (load, store, exchange, and compare_exchange) all have a generic version as well. This generic version works on any data type. It uses the lock-free built-in function if the specific data type size makes that possible; otherwise, an external call is left to be resolved at run time. This external call is the same format with the addition of a ‘size_t’ parameter inserted as the first parameter indicating the size of the object being pointed to. All objects must be the same size.
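To make the difference between the two forms concrete, here is a minimal sketch. It assumes u64 is a typedef for uint64_t (the question does not show its definition), and the helper names exchange_generic and exchange_n are made up for illustration. Both helpers atomically replace *ptr and hand back the previous value; they differ only in how the new and old values are passed.

```c
#include <stdint.h>

typedef uint64_t u64; /* assumed definition; not shown in the question */

/* Generic form: the new value and the result are passed by pointer.
   Only the access to *ptr is atomic; new_value and old_value are
   ordinary local objects that are read and written non-atomically. */
u64 exchange_generic(u64 *ptr, u64 new_value) {
  u64 old_value;
  __atomic_exchange(ptr, &new_value, &old_value, __ATOMIC_SEQ_CST);
  return old_value;
}

/* _n form: the new value is passed by value and the previous
   contents of *ptr are returned directly. */
u64 exchange_n(u64 *ptr, u64 new_value) {
  return __atomic_exchange_n(ptr, new_value, __ATOMIC_SEQ_CST);
}
```

Seen this way, theExchange(a, b) from the question reads *b non-atomically, atomically exchanges that value with *a, and then writes the old contents of *a back to *b non-atomically. It does not swap two memory locations in one atomic step, and a single xchg instruction cannot do that either, since it exchanges a register with one memory operand.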

Upvotes: 2
