FortCpp

Reputation: 946

Improve the performance of a sum (C version)

I am working with a scientific calculation code and want to improve its performance a little if possible. I profiled the code with Amplifier. The most time-consuming (heavily used) code is this:

double a = 0.0;
for(j = 0; j < n; j++) a += w[j]*fi[((index[j] + i)<<ldf) + k];

To me it is just a dot product of w and fi. I am wondering:

  1. Does the Intel compiler do this automatically? (I mean, does it treat the loop as the dot product of two vectorized arrays?)
  2. Is there a way to improve the code? (I mean, maybe define another array a1 the same size as w, store all the products in a1 (unrolled loop?), and do the summation at the end.)
  3. Other suggestions?
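
Idea 2 might look roughly like the sketch below. The function name sum_two_pass and the parameter types (int for index, i, k and ldf, double for w and fi) are assumptions for illustration, and a1 is a caller-provided scratch array of at least n doubles. It does not claim to be faster: it adds one store and one load per element, so it would need to be measured.

```c
/* Hypothetical sketch of idea 2: store the products first, sum afterwards.
 * Assumes index values are non-negative and the shifts do not overflow. */
double sum_two_pass(const double *w, const double *fi, const int *index,
                    double *a1, int n, int i, int k, int ldf)
{
    /* Pass 1: compute every product into the scratch array a1. */
    for (int j = 0; j < n; j++)
        a1[j] = w[j] * fi[((index[j] + i) << ldf) + k];

    /* Pass 2: add the products in the same order as the original loop. */
    double a = 0.0;
    for (int j = 0; j < n; j++)
        a += a1[j];
    return a;
}
```

Because the products are added in the same order as in the single loop, the result should match the original exactly.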

I am using Parallel Composer 2013 with Visual Studio. Any ideas will be appreciated! :)

Upvotes: 2

Views: 146

Answers (1)

paddy

Reputation: 63471

You could start by noticing that you always offset by a fixed amount k in your fi array... I'm assuming it's of type double*. So why not just offset by k once before you loop?

double *fik = fi + k;

In fact, you do the same with i. The value (index[j] + i) << ldf is equivalent to (index[j] << ldf) + (i << ldf). So, you get:

double *fik = fi + k + (i << ldf);
double a = 0.0;
for(j = 0; j < n; j++) a += w[j] * fik[ index[j]<<ldf ];

This should be a little faster, unless the compiler has already decided to do that for you.
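
To check that the rewrite gives the same result, the two forms can be put side by side. This is a minimal sketch: the function names sum_original and sum_hoisted are hypothetical, and the types (int for index, i, k and ldf, double for w and fi) are assumptions, since the question does not show the declarations. It also assumes index values are non-negative and that none of the shifts overflow.

```c
/* Original form: (index[j] + i) << ldf and + k are recomputed each iteration. */
double sum_original(const double *w, const double *fi, const int *index,
                    int n, int i, int k, int ldf)
{
    double a = 0.0;
    for (int j = 0; j < n; j++)
        a += w[j] * fi[((index[j] + i) << ldf) + k];
    return a;
}

/* Hoisted form: the loop-invariant offset k + (i << ldf) is applied once. */
double sum_hoisted(const double *w, const double *fi, const int *index,
                   int n, int i, int k, int ldf)
{
    const double *fik = fi + k + (i << ldf);  /* same base as in the answer */
    double a = 0.0;
    for (int j = 0; j < n; j++)
        a += w[j] * fik[index[j] << ldf];
    return a;
}
```

Both functions read the same elements of fi in the same order, so they should return identical values. Any speed difference has to be measured with the actual compiler and flags.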

Upvotes: 2
