whitebear

Reputation: 65

Invalidation of the cache from L1 cache

Suppose that a cache line containing variable X is loaded into both CPU0's L1d and CPU1's L1d at the same time. After CPU0 changes the value of X and CPU1's L1d cache line is invalidated, is it impossible for CPU1 to copy the variable X from CPU0's L1d cache, given that CPU0 still holds a cache line with X? And even if this is not the case, I want to know if there are cases where CPU0 brings in CPU1'

Upvotes: 1

Views: 869

Answers (2)

John D McCalpin

Reputation: 2236

The case described is not allowed. When a processor core executes a store to an address, the data is written to a "store buffer" for transfer to the cache at a later time. Before transferring data from the store buffer, the cache requires Exclusive access to the line -- a state that can exist in only one cache at a time.

Three easy cases:

  1. If the core's cache already has exclusive access (i.e., the line is in the Exclusive or Modified state), then the store buffer can write the data to the cache at any time.
  2. If the core's cache has a valid copy of the line without exclusive access (such as the "Shared" state), the presence of new data in the store buffer will cause the cache to generate an "upgrade" request for the line. The upgrade to E or M state will not be granted until all other caches (or directories) acknowledge that they have invalidated any copies of that address.
  3. If the core's cache does not have a valid copy of the line (either no address match or an address match in the Invalid state), the cache will issue a "Read With Intent To Modify" request. This will result in the transfer of the current data for the cache line (whether in memory or from a modified copy in another core's cache) to the requesting core's cache, AND the invalidation of the cache line in every other cache in the system.
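The three cases above can be sketched as a toy model. This is only an illustration under simplifying assumptions, not how any real cache controller is built: the function `commit_store`, the dictionary layout of `caches`, and the request labels are hypothetical names chosen for this example, and the model tracks a single cache line holding two variables.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def commit_store(caches, memory, core, var, value):
    """Drain one store from `core`'s store buffer into its copy of the line.

    `caches` maps core id -> {"state": State, "data": dict or None} and
    `memory` is the DRAM copy of the line (hypothetical toy layout).
    Returns the coherence request that was needed before the write.
    """
    line = caches[core]
    if line["state"] in (State.MODIFIED, State.EXCLUSIVE):
        request = "none"  # case 1: already has exclusive access
    else:
        if line["state"] is State.SHARED:
            request = "upgrade"  # case 2: valid copy, no exclusive access
        else:
            request = "RWITM"  # case 3: no valid copy, fetch current data
            dirty = [c for c in caches.values() if c["state"] is State.MODIFIED]
            source = dirty[0]["data"] if dirty else memory
            line["data"] = dict(source)
        # Cases 2 and 3: every other copy is invalidated before the write.
        for other_id, other in caches.items():
            if other_id != core:
                other["state"] = State.INVALID
                other["data"] = None
    line["data"][var] = value
    line["state"] = State.MODIFIED
    return request


# Both cores start with the line in Shared state, as in the question.
memory = {"X": 0, "Y": 7}
caches = {
    0: {"state": State.SHARED, "data": dict(memory)},
    1: {"state": State.SHARED, "data": dict(memory)},
}
first = commit_store(caches, memory, 0, "X", 1)   # CPU0 must upgrade
second = commit_store(caches, memory, 1, "X", 2)  # CPU1 was invalidated
third = commit_store(caches, memory, 1, "X", 3)   # CPU1 now owns the line
```

In this run, CPU1's second store finds its line Invalid, so it pulls the current data (including the untouched Y) from CPU0's modified copy, and CPU0's copy is invalidated in turn.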

If two cores execute store instructions "at the same time", the details of the implementation will result in one of the two cores obtaining exclusive access. The other core will have its request "rejected" (NACK'd), and it will retry the request until the first core+cache has completed its upgrade of the cache line state and update of the data. This mechanism forces all stores to a single address to be processed sequentially, even if they are issued concurrently.

In general it is not possible for a user to reliably make something happen "at the same time" in two cores (or to detect whether it happened at the same time), but the implementations have to account for it by the serialization process described above.
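The serialization described above can be illustrated with a small simulation. It is a sketch only: the function `serialize_stores` is a hypothetical name, and the rule "lowest core id wins" is an assumption made for the example, since real arbitration is implementation-specific.

```python
def serialize_stores(requests):
    """Process stores to one address that were issued in the same cycle.

    `requests` is a list of (core_id, value) pairs. Each round, one
    requester is granted exclusive access; every other pending requester
    is NACK'd and retries in a later round.
    Returns (commit order, final line value, total NACK count).
    """
    pending = list(requests)
    order = []
    nacks = 0
    line_value = None
    while pending:
        # Assumed arbitration policy: lowest core id wins the round.
        winner = min(pending, key=lambda req: req[0])
        pending.remove(winner)
        nacks += len(pending)   # the losers are rejected and must retry
        line_value = winner[1]  # the winner commits its store
        order.append(winner)
    return order, line_value, nacks


order, final_value, nacks = serialize_stores([(1, "b"), (0, "a")])
```

Whichever policy picks the winner, the point is the same: the two stores end up in some total order, and the line ends up holding the value of the last one.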

Upvotes: 3

Peter Cordes

Reputation: 363999

How would you copy from an L1 whose copy has been invalidated? It no longer has a copy of the line.

But anyway, pretty sure the first place that gets checked after an L1d miss is the local L2, then shared L3.

On a Skylake-server or later (thus non-inclusive L3), I think an L3 miss would get reloaded from DRAM, unless the line was in Modified state in another core.

Otherwise, on client chips and earlier Xeons, an L3 miss is impossible if any core has a valid copy, because it's inclusive. (Really old chips, before Nehalem, also didn't have inclusive last-level cache, e.g. Core2's L2)

See also "Which cache mapping technique is used in intel core i7 processor?"


When you say "If we invalidate the cache in one core", I'm not sure if you just mean having it evicted from that cache, e.g. to make room for something else, or if you mean running an instruction like clflush, or if you mean that core did a store and thus had to do a Read For Ownership (RFO) to get MESI exclusive ownership of the line (i.e. invalidate all other copies) so it can commit a store from the store buffer to L1d.

Upvotes: 1
