Anycorn

Reputation: 51465

Why does this optimization not happen?

I have C/C++ code that looks like this:

static int function(double *I) {
    int n = 0;
    // more instructions, loops,
    for (int i = 0; ...; ++i)
        n += fabs(I[i] > tolerance);
    return n;
}

function(I); // return value is not used.

The compiler inlines the function, but it does not optimize away the manipulations of n. I would expect the compiler to recognize that the value is never used, since the call's result is discarded. Is there some side effect that prevents the optimization?

The compiler does not seem to matter; I tried Intel and GCC, with aggressive optimization (-O3).
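For reference, here is a compilable restatement of the snippet above (the bound and tolerance are made up; the test follows the fuller code below, where it is written as (fabs(I[i]) >= tol)):

#include <math.h>

static const double tolerance = 1e-10;  // illustrative value

static int function(double *I) {
    int n = 0;
    for (int i = 0; i < 5; ++i)         // illustrative bound
        n += (fabs(I[i]) >= tolerance);
    return n;
}

void caller(double *I) {
    function(I);  // n is computed and discarded; I would expect the
                  // inlined loop to be eliminated as dead code
}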

Thanks

Fuller code (the full code is a repetition of such blocks):

        // function registers
        double q0 = 0.0;
        double q1 = 0.0;
        double q2 = 0.0;

#if defined (__INTEL_COMPILER)
#pragma vector aligned
#endif // alignment attribute
        for (int a = 0; a < int(N); ++a) {
            q0 += Ix(a,1,0)*Iy(a,0,0)*Iz(a,0,0);
            q1 += Ix(a,0,0)*Iy(a,1,0)*Iz(a,0,0);
            q2 += Ix(a,0,0)*Iy(a,0,0)*Iz(a,1,0);
        }
#endif // not SSE

        // contraction coefficients
        qK0 += q0*C[k+0];
        qK1 += q1*C[k+0];
        qK2 += q2*C[k+0];

        Ix += 3*dim2d;
        Iy += 3*dim2d;
        Iz += 3*dim2d;

    }
    Ix = Ix - 3*dim2d*K;
    Iy = Iy - 3*dim2d*K;
    Iz = Iz - 3*dim2d*K;

    // normalization, scaling, and storage
    if (normalize) {
        I[0] = scale*NORMALIZE[1]*NORMALIZE[0]*(qK0 + I[0]);
        num += (fabs(I[0]) >= tol);
        I[1] = scale*NORMALIZE[2]*NORMALIZE[0]*(qK1 + I[1]);
        num += (fabs(I[1]) >= tol);
        I[2] = scale*NORMALIZE[3]*NORMALIZE[0]*(qK2 + I[2]);
        num += (fabs(I[2]) >= tol);
    }
    else {
        I[0] = scale*(qK0 + I[0]);
        num += (fabs(I[0]) >= tol);
        I[1] = scale*(qK1 + I[1]);
        num += (fabs(I[1]) >= tol);
        I[2] = scale*(qK2 + I[2]);
        num += (fabs(I[2]) >= tol);
    }

    return num;

My only guess is that floating-point exceptions potentially introduce side effects.
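A minimal sketch of that guess, assuming IEEE 754 semantics (values are made up): an ordered comparison such as >= raises the invalid-operation flag when an operand is NaN, so the comparisons in the loop are not entirely free of observable effects.

#include <cfenv>
#include <cmath>
#include <cstdio>

// Sketch only: compile without -ffast-math; observing fenv flags
// without FENV_ACCESS is compiler-dependent, but works in practice.
int main() {
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double x = NAN;                 // volatile so the compare is kept
    int dead = (std::fabs(x) >= 1e-10);      // result intentionally unused
    (void)dead;
    std::printf("FE_INVALID raised: %d\n",
                std::fetestexcept(FE_INVALID) != 0);
    return 0;
}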

Upvotes: 2

Views: 592

Answers (4)

old_timer

Reputation: 71536

Despite arguments I have had in other threads where people insist that all compilers are perfect and never miss an optimization: compilers are not perfect, and they often miss optimizations.

Fun ones like this:

int fun ( int a )
{
   switch(a&3)
   {
      case 0: return(a+4); 
      case 1: return(a+2);
      case 2: return(a);
      case 3: return(0);
   }
   return(1);
}

For the longest time, if you left out that return at the end, you would get an error that the function defined a return type but failed to return a value. Some compilers would complain with the return() at the end of the function, and others would complain without it.

From what I can tell comparing gcc with, say, llvm: gcc optimizes within a function within a file, whereas llvm optimizes across all of what it is fed. You can join the bytecode for the entire project into one file and optimize the whole thing in one shot. At the moment gcc output outperforms llvm by a dozen or more percent, which is interesting. Give it time.

Perhaps in your case you are computing using inputs that are not declared as static (const), so the result n could change. If optimized on a per-function basis, the compiler cannot reduce it further. So my guess is that it is optimizing per function, and the calling function doesn't know what effect the dynamic input I has on the system; even if the return value were not used, it would still need to compute function(I) to resolve whatever depends on I. I assume this is not an infinite loop, and the ... means some limit is imposed? If not, here again it is dynamic, not static: function(I) could be a terminating infinite-loop function, or it could be there waiting for an interrupt service routine to modify I and kick it out of the infinite loop. A sketch of the static/const case follows.
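A minimal sketch of that point (values and bounds are made up, not from the question): when everything the loop reads is a compile-time constant, the whole call is foldable and hence removable; with a runtime pointer the compiler has to assume the reads matter.

#include <math.h>

static const double data[3] = { 0.5, 1.5, 2.5 };  // static, known at compile time

static int function(const double *p) {
    int n = 0;
    for (int i = 0; i < 3; ++i)     // fixed bound: provably terminates
        n += (fabs(p[i]) > 1.0);
    return n;
}

int main() {
    function(data);   // everything is constant, so the unused call can fold away
    return 0;
}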

Upvotes: 0

Peter Ruderman

Reputation: 12485

I think the short answer to this question is: just because a compiler can make some optimization in theory doesn't mean that it will. Nothing comes for free. If the compiler is going to optimize away n, then someone has to write the code to do it.

That sounds like a lot of work for something that is both a bizarre corner case and a trivial space savings. I mean, how often do people write functions that perform complex calculations only to discard the result? Is it worth writing complex optimizations to recover 8 bytes worth of stack space in such cases?

Upvotes: 2

Evan Teran

Reputation: 90432

I can't say for certain that it will have an effect, but you may want to look into GCC's pure and const attributes (http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html). They basically tell the compiler that the function only operates on its input and has no side effects.

Given this extra information, it may be able to determine that the call is unnecessary.
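For instance, a minimal sketch (GCC syntax; the loop bound and tolerance are made up for illustration). pure is the right choice here rather than const, since the function reads memory through its pointer argument:

#include <math.h>

// Declaring the function pure promises it has no effects other than
// its return value, so a call whose result is discarded is dead code.
static int function(double *I) __attribute__((pure));

static int function(double *I) {
    int n = 0;
    for (int i = 0; i < 5; ++i)
        n += (fabs(I[i]) > 1e-10);
    return n;
}

int main() {
    double I[] = { 1, 2, 3, 4, 5 };
    function(I);  // with the attribute, this call may be removed entirely
    return 0;
}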

Upvotes: 1

Blindy

Reputation: 67380

The code does use n: first when it initializes it to 0, and then inside the loop, on the left-hand side of an assignment whose right-hand side calls a function with possible side effects (fabs).

Whether or not you actually use the return value of the function is irrelevant; n itself is used.

Update: I tried this code in MSVC10 and it optimized the whole function away. Give me a full example I could try.

#include <iostream>
#include <math.h>

const int tolerance=10;

static int function(double *I) {
    int n = 0;
    // more instructions, loops,
    for (int i=0; i<5; ++i)
        n += fabs((double)(I[i] > tolerance));
    return n;
}


int main()
{
    double I[]={1,2,3,4,5};

    function(I); // return value is not used
}

Upvotes: 7
