xeon111

Reputation: 1045

Questions about cache

I have always wondered how I can control what is cached in memory.

I always thought it was not possible to do, at least with C++.

Until one day someone told me not to include Lua scripts in a C++ application because Lua "...is notorious for completely ruining your cache...".

That got me thinking: is there any way in C++, or any other compiled language, to control what your program caches in memory? Because if Lua can affect my cache performance, then why can't I?

If so,

i. Is it architecture dependent or OS dependent?

ii. Can you access what is in the cache, or what is cached?

Just to be clear, I am talking about the CPU cache.

Upvotes: 1

Views: 437

Answers (3)

Shane Powell

Reputation: 14158

The term "CPU cache" normally refers to multiple independent caches. On most modern CPUs there are three:

  • instruction cache
  • data cache
  • translation lookaside buffer (TLB)

As yi_H says: you don't have direct control over it, but you do have indirect control.

There are multiple possible reasons for poor cache performance. Common ones are:

  • Instruction working set is too large to fit into the instruction cache.
  • Data working set is too large to fit into the data cache.
  • Combination of the above.

This normally results in thrashing, where the CPU sits mostly idle waiting for data to process.

If you want to improve your CPU cache performance, you need to make your instruction and data working sets as small as possible in each of your performance-critical areas, no matter what OS or language your application is written in.
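One simple way to shrink a data working set is to store values in the smallest type that fits them. A hypothetical sketch: if the values fit in 16 bits, an `int16_t` array packs four times as many elements into each cache line as an `int64_t` array, so the same scan touches a quarter of the cache lines.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Wide storage: 8 bytes per element, 8 elements per 64-byte cache line.
std::int64_t total_wide(const std::vector<std::int64_t>& v) {
    return std::accumulate(v.begin(), v.end(), std::int64_t{0});
}

// Narrow storage: 2 bytes per element, 32 elements per 64-byte cache line.
// Same result, but a scan over large data loads 4x fewer cache lines.
std::int64_t total_narrow(const std::vector<std::int16_t>& v) {
    return std::accumulate(v.begin(), v.end(), std::int64_t{0});
}
```

The arithmetic is identical; only the memory footprint changes, which is exactly the kind of indirect control the answer describes.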

As to your questions:

i. Is it architecture dependent or OS dependent?

Yes

ii. Can you access what is in the cache or what is cached?

No

Upvotes: 1

Karoly Horvath

Reputation: 96306

The CPU will cache all the data it needs, and because the cache's size is limited, when it has to load something new it drops whatever was least recently used.

Basically you don't have direct control over it, but indirectly you have some:

What you have to know is that CPUs use cache lines. Each cache line is a small block of memory (commonly 64 bytes on modern hardware).

If the CPU needs some data, it fetches the whole line. So if you have data that is used very frequently but would normally be scattered in memory, you can put it inside a struct so that the effective usage of the CPU cache is better (you cache fewer things that aren't really needed). Note: 99.99% of the time you don't need this kind of optimization.
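A minimal sketch of that idea, using a hypothetical particle system: separating the frequently-read fields from rarely-used metadata means each cache line fetched by a hot loop carries mostly useful bytes.

```cpp
#include <cstdint>
#include <vector>

// "Cold" layout: hot position data is interleaved with rarely-used
// metadata, so a 64-byte cache line holds less than one particle's
// worth of the data the update loop actually reads.
struct ParticleCold {
    double        x, y, z;   // hot: read every frame
    char          name[64];  // cold: only used for debugging
    std::uint64_t id;        // cold
};

// "Hot" layout: only the frequently-accessed fields, so each cache
// line holds the hot data of roughly 2-3 particles.
struct ParticleHot {
    double x, y, z;
};

double sum_x(const std::vector<ParticleHot>& ps) {
    double s = 0.0;
    for (const auto& p : ps)
        s += p.x;  // sequential walk over densely packed hot data
    return s;
}
```

Splitting hot and cold fields like this (sometimes called hot/cold splitting) is an illustration of the same principle, not something the answer prescribes verbatim.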

A more useful example is walking through a 2D array that doesn't fit into cache. If you walk it linearly, you will load each cache line once, process it, and at some point later the CPU will drop it. If you use the indexes the wrong way, each cache line will be loaded multiple times, and because main memory access is slow, your code will be a lot slower. The CPU can also prefetch better if you walk linearly (direction doesn't matter).
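The two traversal orders can be sketched like this (assuming a row-major array flattened into a single vector; both functions compute the same sum, but the second strides through memory and may reload each cache line many times when the array exceeds the cache):

```cpp
#include <cstddef>
#include <vector>

// Row-by-row: consecutive addresses, one cache-line load per ~8 doubles.
double sum_row_major(const std::vector<double>& a,
                     std::size_t rows, std::size_t cols) {
    double s = 0.0;
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            s += a[i * cols + j];  // fast: sequential access
    return s;
}

// Column-by-column: jumps `cols` elements per step, so for large
// arrays each cache line may be evicted and reloaded repeatedly.
double sum_col_major(const std::vector<double>& a,
                     std::size_t rows, std::size_t cols) {
    double s = 0.0;
    for (std::size_t j = 0; j < cols; ++j)
        for (std::size_t i = 0; i < rows; ++i)
            s += a[i * cols + j];  // slow: strided access
    return s;
}
```

On arrays that fit in cache the two run at similar speed; the gap appears once the working set exceeds the cache, which is exactly the scenario the answer describes.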

Cache performance can also be ruined by calling an external library that needs a lot of data and/or code: your main program and its data get dropped from the caches, and when the call finishes the CPU has to load them again.

If you do heavy optimization and want to know how you utilize the L1/L2/... caches, you can run simulations. Valgrind has an excellent tool called Cachegrind which does exactly that.
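A typical invocation looks something like this (the program name is a placeholder; exact output file names vary by process ID):

```shell
# Run the program under the cache simulator; prints hit/miss
# summaries for the simulated L1 and last-level caches.
valgrind --tool=cachegrind ./myprog

# Annotate per-function/per-line miss counts from the output file.
cg_annotate cachegrind.out.<pid>
```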

Upvotes: 3

Oliver Charlesworth

Reputation: 272687

On most platforms, no, you cannot directly control what gets cached. In general, whenever you read from some memory address, the content of that memory will get copied into the cache, unless the content you need is already in cache.

When they talk about "ruining your cache", what they really mean is "ruining your performance". Reading off-chip memory is slow (high latency); reading cache is fast (low latency). If you access memory in a stupid pattern, you will be constantly overwriting the contents of cache (i.e. "cache misses"), rather than re-using what's already in cache (i.e. "cache hits") and minimising reads from off-chip memory.

Upvotes: 0
