Reputation: 3166
I don't see any reason for additional overhead for "native" CPU integral types, but I may be wrong, so I'd like to hear the community's opinion.
My real problem concerns a kind of linked list that changes relatively rarely but is read often (similar to the typical RCU use case). The idea is to provide two access modes for read-only operations: the first mode is used while the structure is being modified (a full-blown lock-free algorithm), and the second, lightweight mode is for the "calm" case (with non-atomic list traversal). For the second (lightweight) case I'm going to use atomic loads with memory_order_relaxed, but if that turns out to be too expensive, I'll need some workaround (cache the atomic value in a non-atomic variable, emulate the proposed memory_order_nonatomic http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1446.htm in some way, etc.)
I understand that the answer depends on the atomics implementation (and the CPU), but I hope the implementation behaves reasonably :)
Upvotes: 6
Views: 1501
Reputation: 9089
The memory_order_relaxed model only relaxes store/load ordering; it still enforces atomicity of reads and writes. On some CPU architectures this can lead to bus locking, cache flushing, etc. So the general answer is yes: atomic access with memory_order_relaxed should be considered more expensive than non-atomic access.
Upvotes: 5