Reputation: 29844
I wrote the following benchmark:
#include <iostream> // cout
#include <math.h> // pow
#include <chrono> // high_resolution_clock
using namespace std;
using namespace std::chrono;
int64_t calculate(int);
int main()
{
high_resolution_clock::time_point t1, t2;
// Test 1
t1 = high_resolution_clock::now();
calculate(200);
t2 = high_resolution_clock::now();
cout << "RUNTIME = " << duration_cast<nanoseconds>(t2 - t1).count() << " nano seconds" << endl;
// Test 2
t1 = high_resolution_clock::now();
calculate(200000);
t2 = high_resolution_clock::now();
cout << "RUNTIME = " << duration_cast<nanoseconds>(t2 - t1).count() << " nano seconds" << endl;
}
int64_t calculate(const int max_exponent)
{
int64_t num = 0;
for(int i = 0; i < max_exponent; i++)
{
num += pow(2, i);
}
return num;
}
Running this benchmark on the Odroid XU3 produces the following output (9 runs):
RUNTIME TEST 1 = 1250 nano seconds
RUNTIME TEST 2 = 1041 nano seconds
RUNTIME TEST 1 = 1292 nano seconds
RUNTIME TEST 2 = 1042 nano seconds
RUNTIME TEST 1 = 1250 nano seconds
RUNTIME TEST 2 = 1083 nano seconds
RUNTIME TEST 1 = 1292 nano seconds
RUNTIME TEST 2 = 1083 nano seconds
RUNTIME TEST 1 = 1209 nano seconds
RUNTIME TEST 2 = 1084 nano seconds
RUNTIME TEST 1 = 1166 nano seconds
RUNTIME TEST 2 = 1083 nano seconds
RUNTIME TEST 1 = 1292 nano seconds
RUNTIME TEST 2 = 1042 nano seconds
RUNTIME TEST 1 = 1166 nano seconds
RUNTIME TEST 2 = 1250 nano seconds
RUNTIME TEST 1 = 1250 nano seconds
RUNTIME TEST 2 = 1250 nano seconds
The second exponent is 1000 times greater than the first one. Why does the second call sometimes finish faster?
I used GCC (4.8) as the compiler, with the -Ofast flag.
Update: I could reproduce similar behaviour on my i7 4770k.
Upvotes: 1
Views: 145
Reputation: 490058
The short answer is "dead code elimination". The compiler sees that you never use the result from calling the function (and the function has no side effects), so it just eliminates calling the function.
Print out the result from the function, and things change a bit. E.g.:
Ignore: -9223372036854775808 RUNTIME = 0 nano seconds
Ignore: -9223372036854775808 RUNTIME = 23001300 nano seconds
Modified code, in case you care:
#include <iostream> // cout
#include <math.h> // pow
#include <chrono> // high_resolution_clock
using namespace std;
using namespace std::chrono;
int64_t calculate(int);
int main() {
high_resolution_clock::time_point t1, t2;
// Test 1
t1 = high_resolution_clock::now();
auto a = calculate(200);
t2 = high_resolution_clock::now();
std::cout << "Ignore: " << a << "\t";
cout << "RUNTIME = " << duration_cast<nanoseconds>(t2 - t1).count() << " nano seconds" << endl;
// Test 2
t1 = high_resolution_clock::now();
auto b = calculate(200000);
t2 = high_resolution_clock::now();
std::cout << "Ignore: " << b << "\t";
cout << "RUNTIME = " << duration_cast<nanoseconds>(t2 - t1).count() << " nano seconds" << endl;
}
int64_t calculate(const int max_exponent) {
int64_t num = 0;
for (int i = 0; i < max_exponent; i++) {
num += pow(2, i);
}
return num;
}
From there you have the minor detail that you're overflowing the range of an int64_t (many times over), giving undefined behavior, but at least with this there's reasonable hope that the times printed out reflect the time to carry out the specified calculations.
Upvotes: 6
Reputation: 474
It may be helped by your CPU's cache or, more probably, it is a compiler optimization. Try disabling optimization with -O0 and compare the results. I repeated it on my machine with and without -O0 and got very different results.
Upvotes: -2