Reputation: 5856
I've heard so many times that an optimizer may reorder your code that I'm starting to believe it. Are there any examples or typical cases where this might happen and how can I Avoid such a thing (eg I want a benchmark to be impervious to this)?
Upvotes: 1
Views: 1654
Reputation: 129524
There are LOTS of different kinds of "code-motion" (moving code around), and it's caused by lots of different parts of the optimisation process:
x = sin(y)
once or 1000 times without changing y, x
will have the same value, so no point in doing that inside a loop. So compiler moves it out. I'm sure I've missed some cases in the above list, but this is certainly some of the most common.
The compiler is perfectly within its rights to do this, as long as it doesn't have any "observable difference" (other than the time it takes to run and the number of instructions used - those "don't count" in observable differences when it comes to compilers)
There is very little you can do to avoid compiler from reordering your code - you can write code that ensures the order to some degree. So for example, we can have code like this:
{
int sum = 0;
for(i = 0; i < large_number; i++)
sum += i;
}
Now, since sum
isn't being used, the compiler can remove it. Adding some code that checks prints the sum would ensure that it's "used" according to the compiler.
Likewise:
for(i = 0; i < large_number; i++)
{
do_stuff();
}
if the compiler can figure out that do_stuff
doesn't actually change any global value, or similar, it will move code around to form this:
do_stuff();
for(i = 0; i < large_number; i++)
{
}
The compiler may also remove - in fact almost certainly will - the, now, empty loop so that it doesn't exist at all. [As mentioned in the comments: If do_stuff
doesn't actually change anything outside itself, it may also be removed, but the example I had in mind is where do_stuff
produces a result, but the result is the same each time]
(The above happens if you remove the printout of results in the Dhrystone benchmark for example, since some of the loops calculate values that are never used other than in the printout - this can lead to benchmark results that exceed the highest theoretical throughput of the processor by a factor of 10 or so - because the benchmark assumes the instructions necessary for the loop were actually there, and says it took X nominal operations to execute each iteration)
There is no easy way to ensure this doesn't happen, aside from ensuring that do_stuff
either updates some variable outside the function, or returns a value that is "used" (e.g. summing up, or something).
Another example of removing/omitting code is where you store values repeatedly to the same variable multiple times:
int x;
for(i = 0; i < large_number; i++)
x = i * i;
can be replaced with:
x = (large_number-1) * (large_number-1);
Sometimes, you can use volatile
to ensure that something REALLY happens, but in a benchmark, that CAN be detrimental, since the compiler also can't optimise code that it SHOULD optimise (if you are not careful with the how you use volatile
).
If you have some SPECIFIC code that you care particularly about, it would be best to post it (and compile it with several state of the art compilers, and see what they actually do with it).
[Note that moving code around is definitely not a BAD thing in general - I do want my compiler (whether it is the one I'm writing myself, or one that I'm using that was written by someone else) to make optimisation by moving code, because, as long as it does so correctly, it will produce faster/better code by doing so!]
Upvotes: 5
Reputation: 308530
Most of the time, reordering is only allowed in situations where the observable effects of the program are the same - this means you shouldn't be able to tell.
Counterexamples do exist, for example the order of operands is unspecified and an optimizer is free to rearrange things. You can't predict the order of these two function calls for example:
int a = foo() + bar();
Read up on sequence points to see what guarantees are made.
Upvotes: 2