Reputation: 65
Suppose that a cache line containing variable X is loaded into both CPU0's L1d and CPU1's L1d at the same time. After CPU0 changes the value of X and CPU1's copy of the line in L1d is invalidated, is it impossible for CPU1 to copy X from CPU0's L1d cache, given that CPU0 still has the cache line with X? And even if this is not the case, I want to know if there are cases where CPU0 brings in CPU1'
Upvotes: 1
Views: 869
Reputation: 2236
The case described is not allowed. When a processor core executes a store to an address, the data is written to a "store buffer" for transfer to the cache at a later time. Before transferring data from the store buffer, the cache requires Exclusive access to the line -- a state that can exist in only one cache at a time.
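As a rough illustration of the store-buffer step described above, here is a toy Python model. It is only a sketch under simplifying assumptions: it tracks a single cache line, the `Core` class and `commit_oldest_store` function are hypothetical names invented for this example, and real hardware also has to handle writebacks, ordering, and other details that are ignored here.

```python
from dataclasses import dataclass, field

# MESI states for the single cache line tracked by this toy model.
INVALID, SHARED, EXCLUSIVE, MODIFIED = "I", "S", "E", "M"

@dataclass
class Core:
    name: str
    state: str = INVALID      # MESI state of this core's copy of the line
    value: int = 0            # this core's cached copy of X
    store_buffer: list = field(default_factory=list)

    def store(self, v):
        """Executing a store only queues the value; the cache is untouched."""
        self.store_buffer.append(v)

def commit_oldest_store(core, all_cores):
    """Drain one store-buffer entry into the cache (toy model)."""
    # Gaining exclusive ownership invalidates every other copy first.
    for other in all_cores:
        if other is not core:
            other.state = INVALID
    core.state = MODIFIED
    core.value = core.store_buffer.pop(0)

cpu0 = Core("CPU0", SHARED, 0)
cpu1 = Core("CPU1", SHARED, 0)
cores = [cpu0, cpu1]

cpu0.store(42)
print(cpu0.state, cpu0.value, cpu1.state)   # S 0 S  (store still buffered)
commit_oldest_store(cpu0, cores)
print(cpu0.state, cpu0.value, cpu1.state)   # M 42 I (CPU1's copy is gone)
```

The point of the sketch is that the line is never writable in two caches at once: CPU1's copy becomes Invalid before CPU0's value is written.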
Three easy cases:
If two cores execute store instructions "at the same time", the details of the implementation will result in one of the two cores obtaining exclusive access. The other core will have its request "rejected" (NACK'd), and it will retry the request until the first core+cache has completed its upgrade of the cache line state and update of the data. This mechanism forces all stores to a single address to be processed sequentially, even if they are issued concurrently.
In general it is not possible for a user to reliably make something happen "at the same time" in two cores (or to detect whether it happened at the same time), but the implementations have to account for it by the serialization process described above.
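The serialization of "simultaneous" stores can be sketched in a few lines of Python. This is a toy model, not a description of any real arbiter: `serialize_stores` is a hypothetical function, and the tie-break rule (first request in the list wins) is purely an assumption for illustration, since the actual winner depends on implementation details.

```python
def serialize_stores(requests):
    """Toy arbiter for stores to one address issued in the same cycle.

    requests: list of (core_name, value) pairs.
    Returns (log, final_value).
    """
    pending = list(requests)
    log = []
    value = None
    while pending:
        # Assumed tie-break: the first pending request wins this round.
        name, v = pending.pop(0)
        # Everyone still waiting is rejected and must retry later.
        for loser, _ in pending:
            log.append(f"{loser}: NACK, retry")
        value = v
        log.append(f"{name}: exclusive, wrote {v}")
    return log, value

log, final = serialize_stores([("CPU0", 1), ("CPU1", 2)])
print("\n".join(log))
print("final X =", final)
```

With this tie-break CPU0 writes first and CPU1's retry lands second, so the final value is 2. Swapping the order of the requests swaps the outcome, but in either case the two stores are applied one after the other.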
Upvotes: 3
Reputation: 363999
How would you copy from an L1 whose copy has been invalidated? It no longer has a copy of the line.
But anyway, pretty sure the first place that gets checked after an L1d miss is the local L2, then shared L3.
On a Skylake-server or later (thus non-inclusive L3), I think an L3 miss would get reloaded from DRAM, unless the line was in Modified state in another core.
Otherwise, on client chips and earlier Xeons, an L3 miss is impossible if any core has a valid copy, because it's inclusive. (Really old chips, before Nehalem, also didn't have inclusive last-level cache, e.g. Core2's L2)
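The lookup order described above can be written down as a small decision function. This only restates the tentative description in this answer (note the "pretty sure" and "I think"); it is not confirmed hardware behavior, and `serve_l1d_miss` and its parameters are hypothetical names made up for this sketch.

```python
def serve_l1d_miss(in_l2, in_l3, l3_inclusive, other_core_state):
    """Where an L1d miss is served from, per the description above.

    other_core_state: MESI letter ("M", "E", "S" or "I") for the line
    in another core's private caches.
    """
    if in_l2:
        return "local L2"
    if in_l3:
        return "shared L3"
    # L3 miss from here on.
    if l3_inclusive and other_core_state != "I":
        # With an inclusive L3, a valid copy in any core implies an L3 hit.
        raise ValueError("inclusive L3 cannot miss while a core holds a valid copy")
    if other_core_state == "M":
        return "other core (Modified line)"
    return "DRAM"

print(serve_l1d_miss(False, True, True, "S"))    # shared L3
print(serve_l1d_miss(False, False, False, "M"))  # other core (Modified line)
print(serve_l1d_miss(False, False, False, "S"))  # DRAM
```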
See also "Which cache mapping technique is used in intel core i7 processor?"
When you say "If we invalidate the cache in one core", I'm not sure if you just mean having it evicted from that cache, e.g. to make room for something else, or if you mean running an instruction like clflush. Or if you mean that core did a store and thus had to do a Read For Ownership (RFO) to get MESI exclusive ownership of the line (i.e. invalidate all other copies) so it can commit a store from the store buffer to L1d.
Upvotes: 1