antonpuz

Reputation: 3316

threads accessing same cache line

I came across a suggestion that threads should not access the same cache lines, and I really can't understand why. While searching on the topic I also came across this question: Multiple threads and CPU cache, where one of the answers suggested:

you just want to avoid two threads from simultaneously trying to access data that is located on the same cache line

The way I see it, the cache stores pages of memory for quick access by the process. And as it says here: http://en.wikipedia.org/wiki/Thread_%28computing%29#How_threads_differ_from_processes

threads share their address space

So it shouldn't be a problem for two threads to access the same cache line: if a page is in the cache, a thread accessing that memory will get a cache hit regardless of the other thread.

I have heard the argument about keeping threads away from the same cache line on several different occasions, so it can't be a myth. What am I missing here?

Upvotes: 6

Views: 4742

Answers (3)

rosepark222

Reputation: 89

This YouTube clip might be helpful. The issue arises when two processors write to the same cache line, because the two caches then have to maintain cache coherency. Imagine core 1 writes data to the cache line, placing it in the M state (in the MESI protocol), while core 2's copy of the line is in the I state. If core 2 then writes to the same cache line, the line in core 2's cache moves to the M state, forcing core 1's line to the I state. In the worst case, the line keeps ping-ponging between the M and I states in the two caches. Every time a cache line transitions between M and I, it must be read from the other cache (I->M) or written back to external memory (flushing; M->I). This hurts performance because of the data exchange between caches and the external memory accesses.

https://www.youtube.com/watch?v=S3kg_zCz_PA

The following code example was helpful for understanding the situation where multiple threads access the same cache line.

https://www.geeksforgeeks.org/sum-array-using-pthreads/
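To make the ping-ponging concrete, here is a minimal sketch in C with pthreads (my own illustration, not code from the linked page; the struct layout, names, and iteration count are made up). The two counters are adjacent fields of one small struct, so they land on the same cache line, and each thread's write invalidates the other core's copy of that line:

```c
/* Sketch of false sharing: a and b share one cache line, so the two
 * writer threads keep invalidating each other's copy of the line
 * (the M <-> I ping-pong described above). */
#include <pthread.h>
#include <stdio.h>

struct {
    long a;   /* written only by thread 1 */
    long b;   /* written only by thread 2, but on the same line as a */
} shared;

static void *bump_a(void *arg) {
    (void)arg;
    for (long i = 0; i < 100000000; i++)
        shared.a++;   /* pulls the line into M state in this core's cache */
    return NULL;
}

static void *bump_b(void *arg) {
    (void)arg;
    for (long i = 0; i < 100000000; i++)
        shared.b++;   /* forces the other core's copy to I state */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, bump_a, NULL);
    pthread_create(&t2, NULL, bump_b, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("a=%ld b=%ld\n", shared.a, shared.b);
    return 0;
}
```

Note that the program is perfectly correct: no data is shared, only the cache line. On a typical multi-core machine, moving b onto its own cache line (see the padding sketch in the next answer) typically makes the loops run several times faster, though the exact numbers depend on the hardware.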

Upvotes: 0

xmojmr

Reputation: 8145

The "why not" recommendation is about a speed optimization of the readers-writers problem when running on a multi-core CPU.

In that case it may be faster to avoid the cache lock (LOCK# signal) and suppress the cache line bouncing needed to maintain cache coherence by running the readers and the writer on different cache lines.
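A minimal sketch of that layout trick, assuming 64-byte cache lines (the usual size on x86, but an assumption here; the struct and names are hypothetical): pad each thread's slot out to a full line so that no two slots ever share one.

```c
/* Sketch: keep per-thread counters on separate cache lines by padding
 * each slot to LINE bytes. LINE = 64 is an assumption (typical on x86);
 * on Linux the real size can be queried with
 * sysconf(_SC_LEVEL1_DCACHE_LINESIZE). */
#include <stdalign.h>

#define LINE 64

struct padded_counter {
    alignas(LINE) long value;       /* each counter starts a new line */
    char pad[LINE - sizeof(long)];  /* ...and fills the rest of it */
};

struct padded_counter counters[4]; /* one slot per thread */

/* Each thread touches only its own slot; with the padding, the slots
 * sit on different cache lines, so the M/I bouncing goes away. */
void bump(int tid) { counters[tid].value++; }
```

The cost is wasted memory (one line per slot instead of one long), which is why this is a targeted optimization rather than a default coding style.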

You are right that it is not a problem that must be avoided because something would fail to work. It is just one suggested speed optimization.

Thinking about internal processor caches is an extremely low-level speed optimization. For most typical programming tasks the speed bottleneck lies outside the hardware circuits, and following the Intel Guide for Developing Multithreaded Applications is enough.


See also

Some illustrations of the cache lines are available in the Intel® 64 and IA-32 Architectures Software Developer's Manual.


Upvotes: 5

kuroi neko

Reputation: 8671

In most (probably all, but I don't have exhaustive hardware knowledge) multi-core CPUs, the cache will lock the currently accessed line when one core tries to write to the corresponding memory, so other cores trying to access the same cache line are held waiting.

You can share the same data among threads as long as it is read-only (or infrequently updated), but if you keep writing to it, the hidden access serialization will yield performance equivalent to running all the threads on the same core (actually a bit worse, due to cache locking delays).
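A common way to keep the sustained writes off the shared line, sketched below under my own assumptions (the job struct, slot layout, and N are illustrative): each thread accumulates into a local variable, which lives in a register or on its own stack, and touches shared memory only once at the very end.

```c
/* Sketch: read-only sharing is cheap; each thread writes shared memory
 * exactly once, so there is no sustained cache line contention. */
#include <pthread.h>
#include <stdio.h>

#define N 1000000

static int  data[N];     /* read-only while the threads run: safe to share */
static long results[2];  /* each slot written once, at the very end */

struct job { int lo, hi, slot; };

static void *partial_sum(void *arg) {
    struct job *j = arg;
    long local = 0;                  /* private accumulator, no sharing */
    for (int i = j->lo; i < j->hi; i++)
        local += data[i];            /* reads only: lines stay shared */
    results[j->slot] = local;        /* the single shared write */
    return NULL;
}

int main(void) {
    for (int i = 0; i < N; i++)
        data[i] = 1;                 /* fill before the threads start */

    struct job jobs[2] = { {0, N / 2, 0}, {N / 2, N, 1} };
    pthread_t t[2];
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, partial_sum, &jobs[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);

    printf("total = %ld\n", results[0] + results[1]);  /* prints 1000000 */
    return 0;
}
```

results[0] and results[1] do sit on the same cache line here, but since each is written a single time, the contention is negligible compared to writing a shared counter inside the loop.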

Upvotes: 5
