Reputation:
I wrote the following program to test how much virtual functions cost on my machine:
#include <iostream>
#include <ctime>

#define NUM_ITER 10000000000
// 5 seconds = 1000000000

static volatile int global_a;

void spin()
{
    int a = global_a;
    int b = a*a;
    int c = a+5;
    int d = a^b^c;
    global_a = b*d;
}

struct A {
    virtual void a() = 0;
};

struct B : A {
    virtual void a() { spin(); }
};

struct C : A {
    virtual void a() { spin(); }
};

void run_A1(A* a)
{
    a->a();
}

void run_A(A* a)
{
    for (long long i = 0; i < NUM_ITER; i++) {
        run_A1(a);
    }
}

void run()
{
    for (long long i = 0; i < NUM_ITER; i++) {
        spin();
    }
}

int main()
{
    global_a = 2;

    A* a1 = new B;
    A* a2 = new C;

    std::clock_t c_begin, c_end;

    c_begin = std::clock();
    run_A(a1);
    c_end = std::clock();
    std::cout << "Virtual | CPU time used: "
              << 1000.0 * (c_end-c_begin) / CLOCKS_PER_SEC
              << " ms\n";

    c_begin = std::clock();
    run_A(a2);
    c_end = std::clock();
    std::cout << "Virtual | CPU time used: "
              << 1000.0 * (c_end-c_begin) / CLOCKS_PER_SEC
              << " ms\n";

    c_begin = std::clock();
    run();
    c_end = std::clock();
    std::cout << "Normal | CPU time used: "
              << 1000.0 * (c_end-c_begin) / CLOCKS_PER_SEC
              << " ms\n";

    delete a1;
    delete a2;
}
The results were the opposite of what I expected: the virtual function calls were consistently faster. For example, this is one of the outputs I got with NUM_ITER = 10000000000:
Virtual | CPU time used: 49600 ms
Virtual | CPU time used: 50270 ms
Normal | CPU time used: 52890 ms
From inspecting the generated assembly, I can confirm that the compiler hasn't optimized out anything important. I used GCC 4.7 with the following options:
g++ -O3 -std=c++11 -save-temps -masm=intel -g0 -fno-exceptions -fno-inline test.cc -o test
Why are the virtual function calls faster? Or why are the non-virtual function calls slower? Have branch predictors become that good? Or maybe it's just my machine. Could someone else run the test and post their timings?
Upvotes: 4
Views: 681
Reputation: 16333
The compiler might be smart enough to see that the virtual functions call a global function spin() and devirtualize them. The calls probably get inlined too.
Check this.
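One way to take compile-time devirtualization out of the picture is to pick the dynamic type from a value that is only known at run time. The following is a minimal sketch of that idea, not a drop-in replacement for the benchmark: make_a, dispatch and sink are hypothetical names that do not appear in the question, and the bodies of a() are simplified so the effect of the dispatch is visible.

```cpp
#include <cassert>
#include <memory>

// Hypothetical observable side effect, standing in for spin() in the question.
static volatile int sink;

struct A {
    virtual ~A() {}          // virtual destructor so deleting through A* is well defined
    virtual void a() = 0;
};

struct B : A {
    void a() override { sink = 1; }
};

struct C : A {
    void a() override { sink = 2; }
};

// Hypothetical factory: the concrete type depends on a runtime value,
// so it cannot be deduced from this function alone at compile time.
std::unique_ptr<A> make_a(bool pick_b)
{
    if (pick_b) {
        return std::unique_ptr<A>(new B);
    }
    return std::unique_ptr<A>(new C);
}

// Calls a() through the base pointer and reports which override ran.
int dispatch(A* obj)
{
    assert(obj != nullptr);
    obj->a();
    return sink;
}
```

In main, something like make_a(argc > 1) would tie the choice to the command line. Whether the compiler still manages to devirtualize the call has to be confirmed in the generated assembly.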
Upvotes: 2
Reputation: 55553
Try resetting global_a at the start of run() and run_A():
void run()
{
    global_a = 2;
    ...
}

void run_A(A *a)
{
    global_a = 2;
    ...
}
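I'm not sure whether this has any impact, but not all arithmetic operations take the same amount of time, and because global_a carries over from one timed loop to the next, each loop currently operates on different values.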
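The reset can also be folded into the timing code so that no measurement can forget it. This is only a sketch: time_ms is a hypothetical helper that is not in the question, and global_a is declared unsigned here (my assumption) so that the multiplications wrap around instead of overflowing a signed int.

```cpp
#include <cassert>
#include <ctime>

// Same role as global_a in the question; unsigned in this sketch so that
// overflow wraps around instead of being undefined behavior.
static volatile unsigned global_a;

void spin()
{
    unsigned a = global_a;
    unsigned b = a * a;
    unsigned c = a + 5;
    unsigned d = a ^ b ^ c;
    global_a = b * d;
}

// Hypothetical helper: resets the shared state, then times `iters` calls of `f`
// and returns the CPU time used in milliseconds.
template <typename F>
double time_ms(F f, long long iters)
{
    assert(iters >= 0);
    global_a = 2;                       // every measurement starts from the same value
    std::clock_t begin = std::clock();
    for (long long i = 0; i < iters; i++) {
        f();
    }
    std::clock_t end = std::clock();
    return 1000.0 * (end - begin) / CLOCKS_PER_SEC;
}
```

The non-virtual case would then be timed with time_ms(spin, NUM_ITER), and the virtual case by passing a callable that invokes a->a(). With this in place, all three loops see the same sequence of values in global_a.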
Upvotes: 4