Reputation: 41
The following was an interview question:
Why might a fully associative cache have higher miss rates than a direct-mapped cache?
I thought this was not possible at all. Can anyone please share some insight on this?
Upvotes: 2
Views: 2997
Reputation: 365842
Are you supposed to assume they're the same size? If not, then a smaller fully associative cache can still miss more if most of the misses are "capacity" misses, not conflict misses.
If they are the same size, then it's a lot less likely, so you have to start thinking of pathological cases. An unlucky case for the replacement policy can lead to unrelated misses evicting a line that will be useful later. A miss in a fully associative cache can evict any of the current lines, and has to pick one. (And with a high associativity, LRU would take a lot of bits, so it's probably not going to be true LRU. Even if true LRU, you can certainly construct a sequence that still evicts more lines.)
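To put a rough number on the LRU-state cost: encoding a full recency ordering of W ways takes at least ⌈log2(W!)⌉ bits per set (and practical encodings like an age matrix use even more), versus W-1 bits for a tree-PLRU approximation. A back-of-the-envelope count, not any specific CPU's scheme:

```python
import math

# Minimum bits per set to track replacement state:
#  - true LRU must encode a full recency ordering: ceil(log2(W!)) bits at least
#  - tree-PLRU keeps 1 bit per internal node of a binary tree: W - 1 bits
for ways in (4, 8, 16, 64):
    true_lru = math.ceil(math.log2(math.factorial(ways)))
    print(f"{ways:2d} ways: true LRU >= {true_lru:3d} bits, tree-PLRU {ways - 1:2d} bits")
```

At 64 ways (one big set, as in a small fully associative cache), true LRU needs roughly 296 bits of ordering state updated on every access, which is why real designs approximate it.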
Consider the following sequence of events, for two caches of 4 lines each (direct-mapped: 4 sets of 1 way; fully associative: 1 set of 4 ways, with true LRU):

- Access lines A, B, C, D: all four miss in both caches, and in the direct-mapped cache they happen to map to four different sets.
- Access line E, which aliases B's set in the direct-mapped cache: another miss in both. The direct-mapped cache evicts B; the fully associative cache evicts A, the least-recently-used line.
- Access A again: the direct-mapped cache hits, because nothing touched A's set. The fully associative cache misses, because any miss can evict any line, and LRU happened to pick exactly the one we'd need next.

Keep repeating A B C D E and the fully associative cache misses on every access forever, while the direct-mapped cache settles into 3 hits per 5 accesses (only B and E keep evicting each other). A simulation of this is sketched below.
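A minimal simulation of that pattern (the line addresses and the modulo set-indexing are made up for illustration; a real cache indexes on address bits, but the effect is the same):

```python
from collections import OrderedDict

def run(accesses, lookup):
    return sum(lookup(addr) for addr in accesses)   # count hits

def make_direct_mapped(num_sets):
    sets = [None] * num_sets
    def lookup(addr):
        idx = addr % num_sets          # set index = low bits of line address
        hit = sets[idx] == addr
        sets[idx] = addr               # on a miss, the new line replaces the old
        return hit
    return lookup

def make_fully_assoc_lru(ways):
    lines = OrderedDict()              # front = least recently used
    def lookup(addr):
        hit = addr in lines
        if hit:
            lines.move_to_end(addr)    # promote to most recently used
        else:
            if len(lines) == ways:
                lines.popitem(last=False)   # evict the LRU line
            lines[addr] = True
        return hit
    return lookup

# A=0, B=1, C=2, D=3 map to four distinct sets; E=5 aliases B's set (5 % 4 == 1).
A, B, C, D, E = 0, 1, 2, 3, 5
accesses = [A, B, C, D, E] * 100

print("direct-mapped hits:  ", run(accesses, make_direct_mapped(4)))    # 297
print("fully assoc LRU hits:", run(accesses, make_fully_assoc_lru(4)))  # 0
```

Over 100 trips around the loop, the direct-mapped cache gets 297 hits and the fully associative LRU cache gets 0: the restrictive placement accidentally protects A, C, and D from E's misses.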
There might be more-reasonable examples where the average hit rate isn't so low, e.g. ones that also loop over an array and thus benefit from the cache when reading each byte or word of a line sequentially.
An associative cache has much more room for other accesses to alias the same set because there are fewer sets; in the fully associative case there's only one, so every miss competes for the same pool of ways. Usually the replacement policy makes useful choices for your workload, so this behaviour only shows up with an access pattern that defeats it.
Related: https://en.wikipedia.org/wiki/Adaptive_replacement_cache and https://blog.stuffedcow.net/2013/01/ivb-cache-replacement/ - adaptive replacement can help keep some lines around when a cache-blasting loop over a huge array runs.
An adaptive replacement policy can make an associative cache more resistant to this kind of worst case, where a direct-mapped cache could otherwise beat it. Of course, looping over a huge array will usually completely wipe out a direct-mapped cache too; what's special in the example above is that we're avoiding anything that would alias the one line that will hit. So chasing a linked list is a more plausible way to get a sequence of accesses that touches a lot of lines but happens to skip one.
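As a sketch of the scan-resistance idea, here's the simple LIP insertion policy (insert new lines at the LRU position instead of MRU) rather than the full adaptive schemes from the links above; the hot-set/scan workload is made up for illustration:

```python
from collections import OrderedDict

def make_cache(ways, insert_at_mru):
    lines = OrderedDict()              # front = LRU end, back = MRU end
    def lookup(addr):
        hit = addr in lines
        if hit:
            lines.move_to_end(addr)    # hits always promote to MRU
        else:
            if len(lines) == ways:
                lines.popitem(last=False)            # evict the LRU line
            lines[addr] = True
            if not insert_at_mru:
                lines.move_to_end(addr, last=False)  # LIP: park new lines at LRU
        return hit
    return lookup

hot = [0, 1, 2]                        # small working set we'd like to keep
scan = list(range(100, 200))           # single-use lines blasting through
workload = (hot * 5 + scan) * 20

for name, at_mru in (("LRU", True), ("LIP", False)):
    cache = make_cache(4, at_mru)
    print(name, "hits:", sum(cache(a) for a in workload))
```

With plain LRU the scan evicts the hot lines every time, so they re-miss on each pass; with LRU-position insertion the scanned lines evict each other and the hot set survives.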
Also related re: pathological worst-cases for replacement policies: Bélády's anomaly for virtual-memory page replacement, where one can construct reference strings for which FIFO gets more page faults with more page-frames. (LRU can't suffer this anomaly: it's a stack algorithm, so the pages resident with N frames are always a subset of those resident with N+1.) The analogous case for a CPU cache would be getting worse with more ways per set, for the same sequence of accesses to a set.
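The classic reference string demonstrating the anomaly is easy to check with a few lines (1,2,3,4,1,2,5,1,2,3,4,5 is the textbook example; the simulator is just a sketch):

```python
from collections import deque

def fifo_faults(refs, frames):
    resident = set()
    order = deque()                    # eviction order: oldest arrival first
    faults = 0
    for page in refs:
        if page not in resident:
            faults += 1
            if len(order) == frames:   # all frames full: evict the oldest page
                resident.discard(order.popleft())
            order.append(page)
            resident.add(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))   # 9 faults
print(fifo_faults(refs, 4))   # 10 faults: one more frame, one more fault
```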
Of course nobody said anything about the fully-associative cache being true LRU, and as I said earlier that's unlikely in practice.
Upvotes: 1