Why do I see such a huge performance difference between these functions?

Question

This function compiles fine with g++, but it runs pretty slowly.

void rota(double psir,double thetar,double phir,double xi,double yi,double zi,double *xf,double *yf,double *zf) {
  *xf = xi*cos(phir)*cos(psir)+yi*(-sin(phir)*cos(thetar)*cos(psir)+sin(thetar)*sin(psir))+zi*(sin(phir)*sin(thetar)*cos(psir)+cos(thetar)*sin(psir));
  *yf = xi*sin(phir)+yi*cos(phir)*cos(thetar)-zi*cos(phir)*sin(thetar);
  *zf = -xi*cos(phir)*sin(psir)+yi*(sin(thetar)*cos(psir)+cos(thetar)*sin(phir)*sin(psir))+zi*(cos(thetar)*cos(psir)-sin(thetar)*sin(phir)*sin(psir));
  return;
}

If I calculate the intermediate sine and cosine values once and then reuse them, my simulation runs much faster.

void rota(double psir,double thetar,double phir,double xi,double yi,double zi,double *xf,double *yf,double *zf) {
  double cosf = cos(phir);
  double sinf = sin(phir);
  double cosp = cos(psir);
  double sinp = sin(psir);
  double cost = cos(thetar);
  double sint = sin(thetar);
  *xf = xi*cosf*cosp+yi*(-sinf*cost*cosp+sint*sinp)+zi*(sinf*sint*cosp+cost*sinp);
  *yf = xi*sinf+yi*cosf*cost-zi*cosf*sint;
  *zf = -xi*cosf*sinp+yi*(sint*cosp+cost*sinf*sinp)+zi*(cost*cosp-sint*sinf*sinp);
  return;
}

Why doesn't g++ do this optimization for me? Is there a way for me to do this more efficiently?

Thanks!

NPE · Accepted Answer

I've compiled your code using gcc 4.7.2 with -O3. The generated x86_64 assembly was nearly identical in the two cases.

I then benchmarked each function by calling it 100,000,000 times.

The first version took:

real    0m0.216s
user    0m0.213s
sys     0m0.002s

while the second took:

real    0m0.216s
user    0m0.212s
sys     0m0.002s

Draw your own conclusions.
