Reputation: 31108
I was wondering: is there a performance difference between the following two practices, assuming all classes and functions are written correctly?
ClassA a = function1();
ClassB b = function2(a);
function3(b);
function3(function2(function1()));
I know there isn't a big difference for a single run, but suppose we run this many times in a loop. I created a test:
#include <iostream>
#include <ctime>
#include <math.h>
using namespace std;

int main()
{
    clock_t start = clock();
    clock_t ends = clock();

    // Case 1.
    start = clock();
    for (int i=0; i<10000000; i++)
    {
        double a = cos(1);
        double b = pow(a, 2);
        sqrt(b);
    }
    ends = clock();
    cout << (double) (ends - start) / CLOCKS_PER_SEC << endl;

    // Case 2.
    start = clock();
    for (int i=0; i<10000000; i++)
        sqrt(pow(cos(1),2));
    ends = clock();
    cout << (double) (ends - start) / CLOCKS_PER_SEC << endl;

    return 0;
}
Why is the first one much slower, and if the second one is faster, why don't we always write code that way? Also, does the second practice have a name?
I also wondered what happens if I create the variables outside the for loop in the first case, but the result was the same. Why?
Upvotes: 10
Views: 1656
Reputation: 66234
Defeat the throw-this-all-away optimization if you want to measure the actual computation, and your numbers become much more consistent. To ensure the code that produces the value is actually run rather than thrown out entirely, I've assigned the results in both tests to a volatile local (which isn't exactly proper usage of volatile, but does a decent job of ensuring that the value creation is the only significant delta).
#include <iostream>
#include <ctime>
#include <cmath>
using namespace std;

int main()
{
    clock_t start;
    volatile double val;

    for (int j=1;j<=10;j++)
    {
        // Case 1.
        start = clock();
        for (int i=0; i<2000000; i++)
        {
            double a = cos(1);
            double b = pow(a, 2);
            val = sqrt(b);
        }
        cout << j << ':' << (double) (clock() - start) / CLOCKS_PER_SEC << endl;

        // Case 2.
        start = clock();
        for (int i=0; i<2000000; i++)
            val = sqrt(pow(cos(1),2));
        cout << j << ':' << (double) (clock() - start) / CLOCKS_PER_SEC << endl << endl;
    }
    return 0;
}
This produces the following release-compiled output on my MacBook Air (which is no speed demon by any stretch):
1:0.001465
1:0.001305
2:0.001292
2:0.001424
3:0.001297
3:0.001351
4:0.001366
4:0.001342
5:0.001196
5:0.001376
6:0.001341
6:0.001303
7:0.001396
7:0.001422
8:0.001429
8:0.001427
9:0.001408
9:0.001398
10:0.001317
10:0.001353
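Since the volatile sink above is admittedly a workaround, here is a rough sketch of another way to keep the work alive: make the input depend on the loop index and return an accumulated sum, so the compiler can neither fold `cos(1)` into a constant nor discard the result. The names `sum_with_locals`, `sum_nested`, and `seconds_for` are made up for illustration, and using the loop index as input is my assumption, not something from the question.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>

// Case 1: intermediate results are stored in named locals.
double sum_with_locals(std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        double a = std::cos(static_cast<double>(i));
        double b = std::pow(a, 2);
        sum += std::sqrt(b);
    }
    return sum;
}

// Case 2: the same computation written as one nested expression.
double sum_nested(std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += std::sqrt(std::pow(std::cos(static_cast<double>(i)), 2));
    return sum;
}

// Hypothetical helper: runs f(n) once, stores its value in `result`,
// and returns the elapsed wall-clock time in seconds.
template <typename F>
double seconds_for(F f, std::size_t n, double& result) {
    auto start = std::chrono::steady_clock::now();
    result = f(n);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

To use it, call `seconds_for(sum_with_locals, n, r1)` and `seconds_for(sum_nested, n, r2)` from `main` and print both the times and the sums; printing the sums is what makes the results observable. I would expect an optimizing build to generate essentially the same code for both functions, but that is worth checking on your own compiler.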
Upvotes: 4
Reputation: 275585
A proper and legal full optimization of both loops above is "do not even do the loop". You could easily be seeing a case where you have confused the compiler by using an uninitialized variable in the first case, or maybe your use of variables confuses it, or maybe your optimization level forces named variables to actually exist.
Now there is a difference between the two in C++11 involving implicit moves of temporary variables, but you can fix this with use of std::move. (I am not sure, but the last use of a local variable that is going out of scope may qualify for implicit move.) For a double this is not a difference, but for more complex types it can be.
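To make the move-versus-copy point concrete, here is a small sketch with a hypothetical `Tracked` type that counts its copy and move constructions. `make` stands in for function1 and `consume` for function2; I am assuming function2 takes its argument by value, which the question does not say. As far as I know, a named local passed as an ordinary function argument is not moved implicitly (implicit move applies to return statements), so the named-variable version copies unless you write std::move yourself.

```cpp
#include <utility>

// Hypothetical type that counts how often it is copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
    static void reset() { copies = 0; moves = 0; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

Tracked make() { return Tracked{}; }  // stands in for function1
void consume(Tracked) {}              // stands in for function2 (by value, an assumption)

// Nested call: the temporary is an rvalue, so it is moved or elided, never copied.
int copies_nested() {
    Tracked::reset();
    consume(make());
    return Tracked::copies;
}

// Named local passed as an lvalue: the parameter is copy-constructed.
int copies_named() {
    Tracked::reset();
    Tracked t = make();
    consume(t);
    return Tracked::copies;
}

// Named local passed through std::move: the parameter is move-constructed.
int copies_moved() {
    Tracked::reset();
    Tracked t = make();
    consume(std::move(t));
    return Tracked::copies;
}
```

If `consume` took a `const Tracked&` instead, none of the three variants would copy, so the difference only shows up for by-value (or rvalue-overloaded) parameters of types that are expensive to copy.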
Upvotes: 0