Is the benchmark I present below a fair way to compare inheritance-based vs std::function-based approach to polymorphism?
If one needs different objects that implement the same interface in different ways, and also needs to be able to put them in a container and swap one for another at run-time, the most popular solution is to use inheritance:
struct Base {
virtual void f() = 0;
virtual ~Base() = default;
};
struct Derived1 : Base {
virtual void f();
};
struct Derived2 : Base {
virtual void f();
};
Another solution is to have a single class, but swap the virtual method for a std::function:
struct Foo {
std::function<void()> f{};
};
auto foo1 = Foo{[]{ return /* impl like Derived1 */; }};
auto foo2 = Foo{[]{ return /* impl like Derived2 */; }};
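To make the "put them in a container and swap one for another at run-time" requirement concrete, here is a minimal sketch of both approaches side by side. The global counter, the bodies of the two implementations, and the function names run_inheritance and run_function are assumptions made purely for illustration; they are not part of the benchmark.

```cpp
#include <functional>
#include <memory>
#include <vector>

// Hypothetical global, used only to make the calls observable.
inline int counter = 0;

struct Base {
    virtual void f() = 0;
    virtual ~Base() = default;
};
struct Derived1 : Base { void f() override { counter += 1; } };
struct Derived2 : Base { void f() override { counter += 2; } };

struct Foo {
    std::function<void()> f{};
};

// Inheritance: the container holds owning pointers to the common base.
inline int run_inheritance() {
    counter = 0;
    std::vector<std::unique_ptr<Base>> objs;
    objs.push_back(std::make_unique<Derived1>());
    objs.push_back(std::make_unique<Derived2>());
    for (auto const& o : objs) o->f();        // adds 1 + 2
    objs[0] = std::make_unique<Derived2>();   // swap the implementation at run time
    for (auto const& o : objs) o->f();        // adds 2 + 2
    return counter;
}

// std::function: the container holds values of one concrete type.
inline int run_function() {
    counter = 0;
    std::vector<Foo> foos;
    foos.push_back(Foo{[] { counter += 1; }});
    foos.push_back(Foo{[] { counter += 2; }});
    for (auto const& foo : foos) foo.f();     // adds 1 + 2
    foos[0].f = [] { counter += 2; };         // swap the implementation at run time
    for (auto const& foo : foos) foo.f();     // adds 2 + 2
    return counter;
}
```

Both functions return 7. The visible difference is that the inheritance version stores pointers and replaces whole objects, while the std::function version stores values and reassigns the callable member.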
(Some questions about the difference between the two approaches are here, here, and here.)
However, regardless of other pros and cons of either solution, I'm curious to measure the difference in performance with a benchmark.
I understand that the performance will obviously vary based on how std::function is implemented as well as on the compiler and the options passed to it, the operating system, and who knows what else.
But with all these factors being held fixed, I think one can measure the difference in performance, if it exists at all.
I should clarify that my intention is to see first-hand that the difference between the two approaches is indeed negligible except in very peculiar use cases, as I've understood from the linked questions and other sources, or else to prove that my understanding is wrong and there is indeed an important difference in performance.
My attempt to write a benchmark is here:
A few explanations about various bits of it:
The fs above alter, in different ways, a global unsigned int,
unsigned int RETURN{};
which I return from main, to make sure that the bodies of those functions cannot be optimized away.
I have changed Derived1::f/Derived2::f and foo1/foo2's lambda bodies (with respect to the snippets above) so that they alter the aforementioned global unsigned int:
struct Base {
virtual void f() = 0;
virtual ~Base() = default;
};
struct Derived1 : Base {
virtual void f() { RETURN += 1; }
};
struct Derived2 : Base {
virtual void f() { RETURN += 2; }
};
struct Foo {
std::function<void()> f{};
};
auto const foo1 = Foo{[]{ RETURN += 1; }};
auto const foo2 = Foo{[]{ RETURN += 2; }};
I generate an array of random bools that I use to randomly pick between Derived1/foo1 and Derived2/foo2:
std::random_device rd;
std::mt19937 gen{rd()};
std::bernoulli_distribution randBool{0.5};
constexpr int N = 1000000;
std::array<bool, N> bools;
for (bool& b : bools) {
b = randBool(gen);
}
I use Boost.Hana to loop over the compile-time values true/false, which allow parametrising over the two cases, and Range-v3 to conveniently accumulate the time measurement performed for each call of the virtual function/std::function:
using Time = duration<double, std::milli>;
std::array<Time, 2> times; // 0: std::function-based, 1: inheritance-based
hana::for_each(hana::make_basic_tuple(hana::false_c, hana::true_c), [&](auto hb) {
constexpr bool B = hb;
auto const elapsed = ranges::accumulate(bools, Time{}, [](auto acc, auto b){
/* time measurement */;
});
times[!B] = elapsed;
});
The function that maps the run-time boolean b to an object is the following, which is also templated on the compile-time boolean B used to pick between the two cases being compared:
template<bool B>
constexpr auto bool2Obj = []{
if constexpr (B) {
return [](bool b){
return b
? foo1
: foo2;
};
} else {
using BasePtr = std::unique_ptr<Base>;
return [](bool b){
return b
? BasePtr{std::make_unique<Derived1>()}
: BasePtr{std::make_unique<Derived2>()};
};
}
}();
The call itself is also templated on bool B for the same reason as above, i.e. to allow picking each of the two cases being compared:
template<bool B>
constexpr auto call = []{
if constexpr (B) {
return [](Foo const& p){ p.f(); };
} else {
return [](std::unique_ptr<Base> const& p){ p->f(); };
}
}();
The /* time measurement */ is the following:
auto obj = bool2Obj<B>(b);
auto const start = high_resolution_clock::now();
call<B>(obj);
auto const end = high_resolution_clock::now() - start;
return acc + Time{end};
where I've kept the random picking of the object out of the measurement, leaving only the call in the measurement.
The result, (i - f) / i (where i and f are the runtimes of the inheritance-based and function-based approaches, respectively), often changes sign; this is the case with both Clang and GCC.
In the runs below, the std::function-based approach is faster:
0.0648057
0.0716398
0.0636759
0.0649676
0.0673908
0.0756509
0.0780861
0.0890416
0.090532
0.094767
The std::function-based approach is faster for GCC, Clang + LLVM, and Clang + GNU.
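For anyone who wants to reproduce the measurement without Boost.Hana or Range-v3, the scheme described above can be approximated with the standard library alone. This is only a sketch under stated assumptions, not the linked benchmark: the helper names measure and relative_difference are hypothetical, a std::vector<bool> stands in for the std::array, and two explicit template instantiations replace the compile-time loop.

```cpp
#include <chrono>
#include <functional>
#include <memory>
#include <vector>

unsigned int RETURN{};  // global side effect, as in the question

struct Base {
    virtual void f() = 0;
    virtual ~Base() = default;
};
struct Derived1 : Base { void f() override { RETURN += 1; } };
struct Derived2 : Base { void f() override { RETURN += 2; } };

struct Foo {
    std::function<void()> f{};
};

using Clock = std::chrono::high_resolution_clock;
using Time = std::chrono::duration<double, std::milli>;

// Hypothetical helper: for each bool, build the object outside the timed
// region, time only the call, and sum the per-call durations.
template <bool UseFunction>
Time measure(std::vector<bool> const& bools) {
    Foo const foo1{[] { RETURN += 1; }};
    Foo const foo2{[] { RETURN += 2; }};
    Time total{};
    for (bool b : bools) {
        if constexpr (UseFunction) {
            Foo const obj = b ? foo1 : foo2;  // copy made before the clock starts
            auto const start = Clock::now();
            obj.f();
            total += Clock::now() - start;
        } else {
            std::unique_ptr<Base> obj;
            if (b) obj = std::make_unique<Derived1>();
            else   obj = std::make_unique<Derived2>();
            auto const start = Clock::now();
            obj->f();
            total += Clock::now() - start;
        }
    }
    return total;
}

// Relative difference (i - f) / i between the inheritance-based total i and
// the std::function-based total f; positive means std::function was faster.
inline double relative_difference(Time i, Time f) {
    return (i - f) / i;
}
```

A typical use would be to fill a vector with random bools, call measure<false>(bools) and measure<true>(bools), and pass the two totals to relative_difference in that order. Because each timed region contains a single call, every sample also includes the cost of the two Clock::now() calls around it.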