Reputation: 571
This question has been bugging me for a while.
From what I understand, there are various levels of storage. They are
With "fastest access time / fewest number of" at the top and "slowest access time / most number of" towards the bottom?
In C/C++ how do you control whether variables are put into (and stay in) Lower Level Cache? I'm assuming there is not a way to control which variables stay in CPU registers since there are a very limited number.
I want to say that the C/C++ static keyword plays some part in it, but wanted to get clarification on this.
I understand how the static works in theory. Namely that
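#include <stdio.h>
void increment(void){
    /* static: initialized once, keeps its value between calls */
    static int iSum = 0;
    printf(" iSum = %d\n", ++iSum);
}
/* main must return int in standard C and C++ */
int main(int argc, char* argv[]){
    int iInc = 0;
    for(iInc = 0; iInc < 5; iInc++)
        increment();
    return 0;
}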
Would print
iSum = 1
iSum = 2
iSum = 3
iSum = 4
iSum = 5
But I am not certain how the different levels of storage play a part. Does where a variable lies depend more on the optimization level, such as through invoking the -O2 and -O3 flags on GCC?
Any insight would be greatly appreciated.
Thanks, Jeff
Upvotes: 1
Views: 252
Reputation: 21
RAM is split into different segments with different functionality.
When iSum is marked static, it is placed into static memory. This segment's initial contents are fixed at compile time, and hence the same location in memory is accessed on every call to the increment function.
Had iSum not been marked static, a new variable would be placed on the stack on every call to the function.
This, however, has nothing to do with RAM vs. Cache vs. Registers etc. Static changes where in RAM iSum is placed. You can verify this by creating two static locals and two non-static locals and printing their addresses. You’ll note that the static locals are stored next to each other and so are the non-static locals. However the two pairs are stored miles apart.
Hope this helps.
Upvotes: 1
Reputation: 3917
I think there may be a few things that need clarification. CPU cache (L1, L2, L3, etc.) is a mechanism the CPU uses to avoid having to read and write directly to memory for values that are accessed frequently. It isn't distinct from RAM; it could be thought of as a narrow window onto it.
Using cache effectively is extremely complex, and it requires nuanced knowledge of code memory access patterns, as well as the underlying architecture. You generally don't have any direct control over the cache mechanism, and an enormous amount of research has gone into compilers and CPUs to use CPU cache effectively. There are storage class specifiers, but these aren't meant to perform cache preload or support streaming.
It is also worth noting that simply because something takes fewer cycles to use (register, L1, L2, etc.) doesn't mean using it will necessarily make code faster. For example, if something is only written to memory once, loading it into L1 may cause a cache eviction, which could move data needed for a tight loop into slower memory. Since the data that's accessed more frequently now takes more cycles to access, the cumulative impact would be lower (not higher) performance.
Upvotes: 0
Reputation: 764
Essentially you need to start looking at writing applications and code that are cache coherent. This is a quick intro to cache coherence:
http://supercomputingblog.com/optimization/taking-advantage-of-cache-coherence-in-your-programs/
It's a long and complicated subject and essentially boils down to the actual implementation of algorithms along with the platform that they are targeting. There is a similar discussion in the following thread:
Can I force cache coherency on a multicore x86 CPU?
Upvotes: 1
Reputation: 66371
Caches are intermediate storage areas between main memory and registers.
They are used because accessing memory today is very expensive, measured in clock ticks, compared to how things used to be (memory access hasn't increased in speed anywhere near what's happened to CPUs).
So they are a way to "simulate" faster memory access while letting you write exactly the same code as without them.
Variables are never "stored" in the cache as such — their values are only held there temporarily in case the CPU needs them. Once modified, they are written out to their proper place in main memory (if they reside there and not in a register).
And static has nothing to do with any of this.
If a program is small enough, the compiler can decide to use a register for a static variable, too, or inline it to make it disappear completely.
Upvotes: 1
Reputation: 43662
The static keyword has nothing to do with cache hinting, and the compiler is free to allocate registers as it sees fit. You might have thought of that because the list of storage class specifiers includes a deprecated register specifier.
There's no way to precisely control, via standard-conformant C++ (or C) language features, how caching and/or register allocation work, because you would have to interface deeply with your underlying hardware (writing your own register allocator, or hinting at how to store/spill/cache things). Register allocation is usually the job of the compiler's back end, while caching is the processor's work (along with instruction pipelining, branch prediction and other low-level tasks).
It is true that changing the compiler's optimization level might deeply affect how variables are accessed/loaded into registers. Ideally you would keep everything in registers (they're fast), but since you can't (their size and number are limited) the compiler has to make some predictions and guess what should be spilled (i.e. taken out of a register and reloaded later) and what should not (or can even be optimized out). Register allocation is an NP-complete problem. In CUDA C you usually can't deal with such issues either, but you do have a chance of specifying the caching mechanism you intend to use by using different types of memory. However, this is not standard C++, as extensions are in place.
Upvotes: 1
Reputation: 528
To answer this question:
In C/C++ how do you control whether variables are put into (and stay in) Lower Level Cache?
You can't. You can do some things to help the data stay in cache, but you can't pin anything there. That's not what those caches are for: they are mainly fed from main memory to speed up access, or to allow for some advanced techniques like branch prediction and pipelining.
Upvotes: 0
Reputation: 27577
Declaring a function-local variable as static makes its lifetime the duration of the program. That's all C/C++ says about it; it says nothing about storage/memory.
Upvotes: 0