Reputation: 51465
I have C/C++ code that looks like this:
static int function(double *I) {
    int n = 0;
    // more instructions, loops,
    for (int i; ...; ++i)
        n += fabs(I[i] > tolerance);
    return n;
}

function(I); // return value is not used.
The compiler inlines the function; however, it does not optimize out the manipulations of n.
I would expect the compiler to be able to recognize that the value is never used: n only accumulates into itself, and the return value is discarded.
Is there some side effect that prevents the optimization?
The compiler does not seem to matter; I tried Intel and gcc with aggressive optimization, -O3.
Thanks
Fuller code (the full code is a repetition of such blocks):
// function registers
double q0 = 0.0;
double q1 = 0.0;
double q2 = 0.0;

#if defined (__INTEL_COMPILER)
#pragma vector aligned
#endif // alignment attribute
for (int a = 0; a < int(N); ++a) {
    q0 += Ix(a,1,0)*Iy(a,0,0)*Iz(a,0,0);
    q1 += Ix(a,0,0)*Iy(a,1,0)*Iz(a,0,0);
    q2 += Ix(a,0,0)*Iy(a,0,0)*Iz(a,1,0);
}
#endif // not SSE

// contraction coefficients
qK0 += q0*C[k+0];
qK1 += q1*C[k+0];
qK2 += q2*C[k+0];

Ix += 3*dim2d;
Iy += 3*dim2d;
Iz += 3*dim2d;

}
Ix = Ix - 3*dim2d*K;
Iy = Iy - 3*dim2d*K;
Iz = Iz - 3*dim2d*K;

// normalization, scaling, and storage
if (normalize) {
    I[0] = scale*NORMALIZE[1]*NORMALIZE[0]*(qK0 + I[0]);
    num += (fabs(I[0]) >= tol);
    I[1] = scale*NORMALIZE[2]*NORMALIZE[0]*(qK1 + I[1]);
    num += (fabs(I[1]) >= tol);
    I[2] = scale*NORMALIZE[3]*NORMALIZE[0]*(qK2 + I[2]);
    num += (fabs(I[2]) >= tol);
}
else {
    I[0] = scale*(qK0 + I[0]);
    num += (fabs(I[0]) >= tol);
    I[1] = scale*(qK1 + I[1]);
    num += (fabs(I[1]) >= tol);
    I[2] = scale*(qK2 + I[2]);
    num += (fabs(I[2]) >= tol);
}

return num;
My only guess is that floating-point exceptions could potentially introduce side effects.
Upvotes: 2
Views: 592
Reputation: 71536
Despite arguments I have had in other threads where supposedly all compilers are perfect and never miss an optimization: compilers are not perfect, and they quite often fail to catch optimizations.
Fun ones like this:
int fun(int a)
{
    switch (a & 3) {
    case 0: return (a + 4);
    case 1: return (a + 2);
    case 2: return (a);
    case 3: return (0);
    }
    return (1);
}
For the longest time, if you left that return at the end out, you would get an error that the function declared a return type but failed to return a value. Some compilers would complain with the return() at the end of the function, and others would complain without it.
From what I can tell comparing gcc and, say, llvm: gcc optimizes within a function within a file, whereas llvm optimizes across all of what it is fed, and you can join the bytecode for the entire project into one file and optimize the whole thing in one shot. At the moment gcc output outperforms llvm by a dozen or more percent, which is interesting; give it time.
Perhaps in your case you are computing using two inputs that are not declared static (const), so the result n could change. If it optimizes on a per-function basis, it cannot reduce this further. So my guess is that it is optimizing per function, and the calling function does not know what effect the dynamic input I has on the system; even if the return value were not used, it would still need to compute function(I) to resolve whatever depends on I. I assume this is not an infinite loop and that the ... means some limit is imposed? If not, here again it is dynamic, not static: function(I) could be a terminating infinite-loop function, or it could be sitting there waiting for an interrupt service routine to modify I and kick it out of the infinite loop.
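As a rough sketch of that per-function point (the file split and names here are hypothetical, not from the question): when the caller only sees a declaration, it cannot prove that function(I) is free of side effects, so the call has to stay even though the result is discarded.

// caller.c (hypothetical) -- function() is defined in some other translation unit
int function(double *I);      // only this declaration is visible here

void do_work(double *I)
{
    // The return value is discarded, but without seeing the body the
    // compiler must assume function() may have side effects (it takes a
    // non-const pointer), so it cannot remove the call.
    function(I);
}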
Upvotes: 0
Reputation: 12485
I think the short answer to this question is that just because a compiler can make some optimization in theory doesn't mean it will. Nothing comes for free. If the compiler is going to optimize away n, then someone has to write the code to do it.
That sounds like a lot of work for something that is both a bizarre corner case and a trivial space savings. I mean, how often do people write functions that perform complex calculations only to discard the result? Is it worth writing complex optimizations to recover 8 bytes worth of stack space in such cases?
Upvotes: 2
Reputation: 90432
I can't say for certain that it will have an effect, but you may want to look into GCC's pure and const attributes (http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html). They basically tell the compiler that the function only operates on its input and has no side effects.
Given this extra information, it may be able to determine that the call is unnecessary.
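For example, a minimal sketch of marking the function pure (the explicit count and tolerance parameters are invented for the example; whether the call actually gets dropped depends on the compiler):

#include <math.h>

// "pure" promises that the result depends only on the arguments and on the
// memory they point to, and that the call has no other side effects, so a
// call whose result is unused becomes a candidate for removal.
__attribute__((pure))
static int function(const double *I, int count, double tolerance)
{
    int n = 0;
    for (int i = 0; i < count; ++i)
        n += (fabs(I[i]) > tolerance);
    return n;
}

int main(void)
{
    double I[] = {1, 2, 3, 4, 5};
    function(I, 5, 10.0);   // result unused; the compiler may now drop the call
    return 0;
}

The attribute matters most on the declaration the caller sees; when the definition is static and visible in the same file, GCC can often deduce pureness on its own.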
Upvotes: 1
Reputation: 67380
The code does use n: first when it initializes it to 0, and then inside the loop, on the left-hand side of an expression that calls a function with possible side effects (fabs).
Whether or not you actually use the return value of the function is irrelevant; n itself is used.
Update: I tried this code in MSVC10 and it optimized the whole function away. Give me a full example I could try.
#include <iostream>
#include <math.h>

const int tolerance = 10;

static int function(double *I) {
    int n = 0;
    // more instructions, loops,
    for (int i = 0; i < 5; ++i)
        n += fabs((double)(I[i] > tolerance));
    return n;
}

int main()
{
    double I[] = {1, 2, 3, 4, 5};
    function(I); // return value is not used
}
Upvotes: 7