Reputation: 5760
In a function that updates all particles I have the following code:
for (int i = 0; i < _maxParticles; i++)
{
// check if active
if (_particles[i].lifeTime > 0.0f)
{
_particles[i].lifeTime -= _decayRate * deltaTime;
}
}
This decreases the lifetime of the particle based on the time that passed. It gets calculated every loop, so if I've 10000 particles, that wouldn't be very efficient because it doesn't need to(it doesn't get changed anyways).
So I came up with this:
float lifeMin = _decayRate * deltaTime;
for (int i = 0; i < _maxParticles; i++)
{
// check if active
if (_particles[i].lifeTime > 0.0f)
{
_particles[i].lifeTime -= lifeMin;
}
}
This calculates it once and sets it to a variable that gets called every loop, so the CPU doesn't have to calculate it every loop, which would theoretically increase performance.
Would it run faster than the old code? Or does the release compiler do optimizations like this?
I wrote a program that compares both methods:
#include <time.h>
#include <iostream>
const unsigned int MAX = 1000000000;
int main()
{
float deltaTime = 20;
float decayRate = 200;
float foo = 2041.234f;
unsigned int start = clock();
for (unsigned int i = 0; i < MAX; i++)
{
foo -= decayRate * deltaTime;
}
std::cout << "Method 1 took " << clock() - start << "ms\n";
start = clock();
float calced = decayRate * deltaTime;
for (unsigned int i = 0; i < MAX; i++)
{
foo -= calced;
}
std::cout << "Method 2 took " << clock() - start << "ms\n";
int n;
std::cin >> n;
return 0;
}
Result in debug mode:
Method 1 took 2470ms
Method 2 took 2410ms
Result in release mode:
Method 1 took 0ms
Method 2 took 0ms
But that doesn't work. I know it doesn't do exactly the same, but it gives an idea. In debug mode, they take roughly the same time. Sometimes Method 1 is faster than Method 2(especially at fewer numbers), sometimes Method 2 is faster. In release mode, it takes 0 ms. A little weird.
I tried measuring it in the game itself, but there aren't enough particles to get a clear result.
EDIT
I tried to disable optimizations, and let the variables be user inputs using std::cin
.
Here are the results:
Method 1 took 2430ms
Method 2 took 2410ms
Upvotes: 1
Views: 192
Reputation: 153909
It will almost certainly make no difference what so ever, at least if you compile with optimization (and of course, if you're concerned with performance, you are compiling with optimization). The opimization in question is called loop invariant code motion, and is universally implemented (and has been for about 40 years).
On the other hand, it may make sense to use the separate variable anyway, to make the code clearer. This depends on the application, but in many cases, giving a name to the results of an expression can make code clearer. (In other cases, of course, throwing in a lot of extra variables can make it less clear. It's all depends on the application.) In any case, for such things, write the code as clearly as possible first, and then, if (and only if) there is a performance problem, profile to see where it is, and fix that.
EDIT:
Just to be perfectly clear: I'm talking about this sort of code optimization in general. In the exact case you show, since you don't use foo
, the compiler will probably remove it (and the loops) completely.
Upvotes: 4
Reputation: 3567
In theory, yes. But your loop is extremely simple and thus likeley to be heavily optimized.
Try the -O0
option to disable all compiler optimizations.
The release runtime might be caused by the compiler statically computing the result.
I am pretty confident that any decent compiler will replace your loops with the following code:
foo -= MAX * decayRate * deltaTime;
and
foo -= MAX * calced ;
You can make the MAX size depending on some kind of input (e.g. command line parameter) to avoid that.
Upvotes: 1