Leo Heinsaar

Reputation: 4047

What's the theory and measurements behind cache line sizes?

Cache lines are often 64 bytes, though other sizes also exist.

My very simple question is: is there any theory behind this number, or is it just the result of the vast number of tests and measurements that the engineers behind it undoubtedly run?

Either way, I was wondering what those (the theory, if there is one, and kinds of tests behind the decision) are.

Upvotes: 7

Views: 2026

Answers (2)

jkang

Reputation: 549

The history of cache line sizes is a bit convoluted (as with many microarchitectural parameters). Originally, the cache line size was made to match the bus size of the processor. The thinking was that if a read or write was done on the bus, it might as well fill the data bus.

As caches got bigger, the sizes of cache lines increased for a few reasons:

  1. Take advantage of locality in certain cases.
  2. Indexing overhead can be kept low <--- this one is actually pretty important.

The larger the cache line size, the fewer lines you need to keep track of inside a cache of the same capacity. For larger caches (multi-MB) this can reduce the lookup/compare times.
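To make the bookkeeping argument concrete, here is a rough sketch that counts lines and tag storage for one capacity at several line sizes. The 8 MiB capacity, 16-way associativity, and 48-bit address width are assumptions picked for illustration, not figures from any particular processor, and `cache_geometry` is a hypothetical helper name.

```python
def cache_geometry(cache_bytes, line_bytes, ways, addr_bits=48):
    """Return (lines, sets, offset_bits, index_bits, tag_bits).

    Assumes a set-associative cache where cache_bytes, line_bytes and
    ways are all powers of two (an assumption of this sketch).
    """
    lines = cache_bytes // line_bytes
    sets = lines // ways
    offset_bits = line_bytes.bit_length() - 1   # log2(line_bytes)
    index_bits = sets.bit_length() - 1          # log2(sets)
    tag_bits = addr_bits - index_bits - offset_bits
    return lines, sets, offset_bits, index_bits, tag_bits


CAPACITY = 8 * 1024 * 1024  # 8 MiB, illustrative only

for line_bytes in (32, 64, 128):
    lines, sets, off, idx, tag = cache_geometry(CAPACITY, line_bytes, ways=16)
    # Each line needs its own tag, so total tag storage scales with the line count.
    print(f"{line_bytes:>3} B lines: {lines:>6} lines, {sets:>5} sets, "
          f"tag={tag} bits, total tag storage={lines * tag} bits")
```

Under these assumptions the tag width per line stays the same, but doubling the line size halves the number of lines, and therefore the number of tags (and valid/state bits) that have to be stored and compared.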

There are also some performance advantages (depending on the workload) to a larger cache line size. But it's not clear that it's always a win (take SPEC CPU 2017, for example). Sometimes a larger cache line size just introduces more waste, because the program has low spatial locality.

Note that you don't need to have a single cache line size for all levels of cache. You can have 32B cache lines for the L1, 64B for the L2, and 128B for the L3/LLC if you wanted to. It's more work to keep track of partial lines, but it lets you utilize each level of cache effectively.

Upvotes: 0

Gabriel Southern

Reputation: 10063

In general, microarchitectural parameters tend to be tuned via performance modeling rather than some sort of theoretical model. That is to say, there isn't anything like the "big O" notation that is used to characterize the performance of algorithms. Instead, benchmarks are run on performance simulators, and the results are used to guide the choice of the best parameters.

That having been said, there are a few reasons why cache line size is going to be fairly stable in an established architecture:

  • Size is a power of 2: The line size should be a power of 2 in order to simplify addressing, so this limits the number of possible choices for cache line size.

  • Software is optimized based on cache parameters: Many microarchitectural parameters are completely hidden from the programmer. But the cache line size is one that is visible, and can have a significant impact on performance for some applications. Once programmers have optimized their code for a 64-byte cache line size then the processor architects have an incentive to keep this same cache line size in future processors, even if the underlying technology changed in a way that made a different size cache line easier to implement in hardware.

  • Cache coherence interacts with cache line: The verification of cache coherence protocols is extremely difficult, and cache coherence is a source of many bugs in processors. Coherence is tracked at the cache line level, so changing the cache line size would require redoing all of the validation steps for a coherence protocol. So there would need to be a strong motivation for changing this parameter.

  • Changing cache line size could introduce false sharing: This is a special case of software being optimized based on cache parameters, but I think it is worth mentioning. Parallel programs are difficult to write in a way that actually provides performance benefits. Since data is tracked at the cache line granularity it is important to avoid false sharing. If the cache line size changed from one processor generation to another this could cause false sharing in the new processor that did not exist in the old one.
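As a small illustration of the power-of-2 and false-sharing points above, the sketch below splits an address into line number and offset with a shift and a mask, then checks whether two addresses land on the same line. The addresses and helper names are hypothetical, and sharing a line is only a necessary condition for false sharing (two cores also have to be writing to it).

```python
def line_number(addr, line_bytes):
    """Index of the cache line containing addr (line_bytes must be a power of two)."""
    assert line_bytes > 0 and line_bytes & (line_bytes - 1) == 0
    # With a power-of-two line size, division reduces to a right shift.
    return addr >> (line_bytes.bit_length() - 1)


def line_offset(addr, line_bytes):
    """Byte offset of addr within its cache line, extracted with a mask."""
    return addr & (line_bytes - 1)


def share_line(addr_a, addr_b, line_bytes):
    """True if both addresses fall in the same line, so false sharing is possible."""
    return line_number(addr_a, line_bytes) == line_number(addr_b, line_bytes)


# Two per-thread counters that a programmer padded 64 bytes apart
# (made-up addresses for illustration).
counter_a = 0x1000
counter_b = 0x1040

for line_bytes in (64, 128):
    print(f"{line_bytes:>3} B lines: same line? "
          f"{share_line(counter_a, counter_b, line_bytes)}")
```

With 64-byte lines the two counters sit on separate lines, but with 128-byte lines they end up on the same one, so padding tuned for the old line size no longer keeps them apart.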

Although 64 bytes is the line size used for x86 and most ARM processors, there are other line sizes in use. For instance, MIPS has many processors with a 32-byte line size, and some with a 16-byte line size.

The line size is tuned to some degree to give the best performance for the workloads that the architecture is expected to run. However, once a line size is selected, and significant amounts of software have been written for the architecture, then the line size is unlikely to change in the future, for the reasons that I listed above.

Upvotes: 6
