Reputation: 6425
I have a memory access pattern in my program like ...
b1->c1 (b and c are addresses.)
//.... do something else ....
b2->c2
//.... do something else ....
b3->c3
....
Is the compiler/cache/CPU smart enough to recognize that when I load b, it should (prepare to) load the corresponding c?
More specifically : Can it somehow predict my access pattern and optimize it in some ways?
Roughly how large would the advantage be?
I created a test case. The result shows that it can't learn at run time.
(In real cases, B has a lot of fields but tends to dereference (->) only c.)
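#include <cstdlib>   // rand
#include <ctime>     // clock, clock_t
#include <iostream>
#include <vector>

class C {
public:
    int data = 0;
};

class B {
public:
    C* c;          // separately heap-allocated object
    int accu = 0;
    B() { c = new C(); }
    void doSomething() {
        accu += c->data;  // load b, then follow b->c
    }
};

int main() {
    const int NUM = 1000000;
    // Heap storage: a stack array of 1,000,000 pointers can overflow the stack.
    std::vector<B*> bs(NUM);
    for (int n = 0; n < NUM; n++) {
        bs[n] = new B();
    }
    for (int loop = 0; loop < 20; loop++) {
        double accumulator = 0;
        for (int n = 0; n < NUM; n++) {
            int iSecret = rand() % NUM;  // random index, so accesses are not sequential
            // Note: clock() has coarse resolution, so timing a single call is noisy.
            clock_t begin = clock();
            bs[iSecret]->doSomething();
            clock_t end = clock();
            accumulator += double(end - begin);
        }
        std::cout << accumulator << std::endl;  // total clock ticks for this loop
    }
}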
Output (time per loop, in clock ticks):
If it can learn, later loops should use less time than previous ones.
298749
306951
332946
...
337232
I don't think it can utilize spatial locality, because c's address is far away.
Upvotes: 0
Views: 32
Reputation: 2259
In your case, bs[iSecret] is one address, which accesses some other address c via doSomething().
This is user-level logic that only the user can optimize, by placing the data pointed to by your b and c appropriately so as to take advantage of spatial locality.
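One way to place the data so that b and c share locality is to store C by value inside B and keep the B objects in one contiguous array. This is only a minimal sketch, assuming each C is owned by exactly one B; the names BInline and sumAll are hypothetical and do not come from the question.

```cpp
#include <cstddef>
#include <vector>

struct C {
    int data = 0;
};

// Hypothetical variant of B: C is a member instead of a pointer, so loading
// a BInline usually brings c.data in on the same cache line.
struct BInline {
    C c;
    int accu = 0;
    void doSomething() { accu += c.data; }
};

// The objects live by value in one contiguous buffer, so a sequential pass
// touches memory in increasing address order.
long long sumAll(std::vector<BInline>& bs) {
    long long total = 0;
    for (BInline& b : bs) {
        b.doSomething();
        total += b.accu;
    }
    return total;
}
```

With random indexing, as in the question's test, this layout still removes the second dependent load (b, then c), but it does not make the access to b itself predictable.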
As a simple example, would you expect the compiler to optimize this code?
int a[100][100];
// The inner loop varies the first index, so consecutive reads are 100 ints apart.
for(int i = 0; i < 100; ++i)
    for(int j = 0; j < 100; ++j)
        cout << a[j][i] << endl;
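For comparison, here is a sketch of the same traversal written both ways, using a sum instead of printing so that the visiting order does not change the result. Whether a given compiler interchanges such loops on its own is not guaranteed, so the cache-friendly order is often written by hand. The names sumColumnMajor and sumRowMajor are illustrative only.

```cpp
#include <cstddef>

constexpr std::size_t N = 100;

// Strided traversal: the inner loop varies the row index, so consecutive
// reads are N ints apart in memory.
long long sumColumnMajor(const int (&a)[N][N]) {
    long long sum = 0;
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            sum += a[j][i];
    return sum;
}

// Contiguous traversal: the inner loop varies the column index, so
// consecutive reads are adjacent in memory.
long long sumRowMajor(const int (&a)[N][N]) {
    long long sum = 0;
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            sum += a[i][j];
    return sum;
}
```

Both functions return the same value; only the order in which memory is touched differs.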
However, consider a conditional construct such as:
address X: if(condition)
{
address Y: //dosomething_A
}
else
{
address Z: //dosomething_B
}
Here, the if condition is at address X, dosomething_A is at address Y, and dosomething_B is at address Z.
In such conditional constructs, the compiler can generate code that minimizes the stall-cycle penalty (due to the branch) on pipelined processors.
Pipelined processors can also learn about your branches at run time using a branch predictor.
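A common way to observe run-time branch prediction is to run the same data-dependent branch over shuffled and over sorted input. The sketch below only sets up that comparison; countAtLeast and sortedCopy are hypothetical helper names, and any timing difference depends on the CPU and on whether the compiler turns the branch into branchless code.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Counts elements >= threshold. The if is a data-dependent branch whose
// outcome the predictor has to guess for every element.
std::size_t countAtLeast(const std::vector<int>& values, int threshold) {
    std::size_t count = 0;
    for (int v : values) {
        if (v >= threshold) {
            ++count;
        }
    }
    return count;
}

// Returns a sorted copy. On sorted input the branch is false for a long run
// and then true for a long run, which is an easy pattern to predict.
std::vector<int> sortedCopy(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    return values;
}
```

Calling countAtLeast on the original vector and on sortedCopy of it gives the same count, so timing the two calls over a large input isolates the effect of branch predictability.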
Upvotes: 1