cong

Reputation: 1187

Are memory barriers needed because of CPU out-of-order execution or because of cache consistency problems?

I'm wondering why memory barriers are needed, and I have read some articles about this topic.
Some say it's because of CPU out-of-order execution, while others say it's because of cache consistency problems caused by the store buffer and invalidate queue.
So what's the real reason memory barriers are needed: CPU out-of-order execution, cache consistency problems, or both? Does CPU out-of-order execution have anything to do with cache consistency? And what's the difference between x86 and ARM?

Upvotes: 2

Views: 1109

Answers (1)

Peter Cordes

Reputation: 364210

You need barriers to order this core / thread's accesses to globally-visible coherent cache when the ISA's memory ordering rules are weaker than the semantics you need for your algorithm.

Cache is always coherent, but that's a separate thing from consistency (ordering between multiple operations).
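
To make "ordering between multiple operations" concrete, here is a minimal C++ message-passing sketch. The names `payload`, `ready` and `run_message_passing` are made up for illustration. Coherence alone only says every core agrees on the value of each individual location; it's the release/acquire pair that orders the `payload` write relative to the `ready` write as seen by the other thread.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
int payload = 0;                 // plain (non-atomic) data
std::atomic<bool> ready{false};  // flag that publishes the data

int run_message_passing() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;

    std::thread producer([] {
        payload = 42;                                  // 1. write the data
        ready.store(true, std::memory_order_release);  // 2. publish: store 1 can't become visible after this
    });
    std::thread consumer([&seen] {
        // 3. acquire load: later loads in this thread can't be satisfied before it
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;  // 4. reads 42: the release store synchronizes-with the acquire load
    });

    producer.join();
    consumer.join();
    assert(seen == 42);  // holds because of the release/acquire pairing
    return seen;
}
```

With `memory_order_relaxed` on both sides of the flag, this would be a data race on `payload` in the C++ model, and on a weakly-ordered ISA the consumer could really observe `ready == true` with a stale `payload`.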

You can have memory reordering even on an in-order CPU. The Q&A "How is load->store reordering possible with in-order commit?" goes into more detail: it shows how memory reordering can happen on a pipeline that starts executing instructions in program order, but with a cache that allows hit-under-miss and/or a store buffer that allows out-of-order commit.
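
A store buffer by itself is already enough to produce one kind of reordering, StoreLoad: a core's own store can still be sitting in its store buffer while its later load reads from cache. The classic store-buffer litmus test shows this; below is a minimal C++ sketch of it (the helper names `both_loads_saw_zero` and `check_seq_cst_forbids_both_zero` are hypothetical). With `seq_cst` on all four accesses the C++ memory model forbids both loads seeing 0. With anything weaker, that outcome is allowed, even on x86, although you may need many runs and well-aligned timing to actually observe it.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs the store-buffer litmus test once and reports
// whether both threads' loads saw the initial value 0.
bool both_loads_saw_zero(std::memory_order store_mo, std::memory_order load_mo) {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;

    std::thread t1([&] {
        x.store(1, store_mo);   // store to x ...
        r1 = y.load(load_mo);   // ... then load from y
    });
    std::thread t2([&] {
        y.store(1, store_mo);   // store to y ...
        r2 = x.load(load_mo);   // ... then load from x
    });
    t1.join();
    t2.join();
    return r1 == 0 && r2 == 0;
}

// With seq_cst everywhere there is a single total order of the four accesses,
// so at least one load must see the other thread's store.
void check_seq_cst_forbids_both_zero(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        assert(!both_loads_saw_zero(std::memory_order_seq_cst,
                                    std::memory_order_seq_cst));
    }
}
```

Calling `both_loads_saw_zero(std::memory_order_release, std::memory_order_acquire)` in a loop is the interesting experiment: a `true` result is legal there, because release/acquire doesn't block StoreLoad reordering.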

See also https://preshing.com/20120710/memory-barriers-are-like-source-control-operations/ and https://preshing.com/20120930/weak-vs-strong-memory-models for some more basics.

x86 has a "strong" memory ordering model: program order plus a store buffer with store-forwarding. C++ acquire and release are "free"; only atomic RMWs and seq_cst stores need barriers.

ARM has a "weak" memory ordering model: only C++ memory_order_consume (data dependency ordering) is "free"; acquire and release require special instructions (like ldar / stlr) or barriers.
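
To see the x86 vs. ARM difference yourself, you can compile a few small wrappers and look at the asm (e.g. on a compiler explorer). The sketch below uses a hypothetical global `shared` and made-up function names. The comments describe what mainstream compilers typically emit at -O2 for x86-64 and AArch64; the exact instructions vary with compiler version and target options, so treat them as a rough guide rather than a guarantee.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> shared{0};  // hypothetical shared variable

// x86-64: plain mov.  AArch64: ldar (or ldapr when RCpc is available).
int load_acquire() { return shared.load(std::memory_order_acquire); }

// x86-64: plain mov.  AArch64: stlr.
void store_release(int v) { shared.store(v, std::memory_order_release); }

// x86-64: xchg (implicit lock), or mov + mfence.  AArch64: stlr.
void store_seq_cst(int v) { shared.store(v, std::memory_order_seq_cst); }

// x86-64: lock xadd (a full barrier).
// AArch64: ldaxr/stlxr retry loop, or ldaddal with LSE atomics.
int fetch_add_seq_cst(int v) { return shared.fetch_add(v, std::memory_order_seq_cst); }

// Single-threaded sanity check of the wrappers; it doesn't exercise ordering.
void sanity_check() {
    store_release(5);
    assert(load_acquire() == 5);
    store_seq_cst(7);
    assert(fetch_add_seq_cst(1) == 7);  // returns the old value
}
```

On x86-64 the acquire load and release store should look identical to non-atomic accesses in the asm; the difference is only in what compile-time reordering is allowed.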

Upvotes: 5
