Reputation: 6194
My question relates to "What's the point of IsA() in C++?". I have performance-critical code that, at one point, must call functions specific to derived classes while only a base-class pointer is available. What is the best way to check which derived class we have? I have coded up two options; the second would let me eliminate the Animal_type enum and the get_type() function.
#include <iostream>

enum Animal_type { Dog_type, Cat_type };

struct Animal
{
    virtual Animal_type get_type() const = 0;
};

struct Dog : Animal
{
    void go_for_walk() const { std::cout << "Walking. Woof!" << std::endl; }
    Animal_type get_type() const { return Dog_type; }
};

struct Cat : Animal
{
    void be_evil() const { std::cout << "Being evil!" << std::endl; }
    Animal_type get_type() const { return Cat_type; }
};

void action_option1(Animal* animal)
{
    if (animal->get_type() == Dog_type)
        dynamic_cast<Dog*>(animal)->go_for_walk();
    else if (animal->get_type() == Cat_type)
        dynamic_cast<Cat*>(animal)->be_evil();
    else
        return;
}

void action_option2(Animal* animal)
{
    Dog* dog = dynamic_cast<Dog*>(animal);
    if (dog)
    {
        dog->go_for_walk();
        return;
    }
    Cat* cat = dynamic_cast<Cat*>(animal);
    if (cat)
    {
        cat->be_evil();
        return;
    }
    return;
}

int main()
{
    Animal* cat = new Cat();
    Animal* dog = new Dog();
    action_option1(cat);
    action_option2(cat);
    action_option1(dog);
    action_option2(dog);
    return 0;
}
Upvotes: 1
Views: 382
Reputation: 27210
Your first option will be faster, but only if you fix the erroneous dynamic_cast (it should be static_cast):
void action_option1_fixed(Animal* animal)
{
    if (animal->get_type() == Dog_type)
        static_cast<Dog*>(animal)->go_for_walk();
    else if (animal->get_type() == Cat_type)
        static_cast<Cat*>(animal)->be_evil();
}
The point of using manual dispatch on get_type() here is that it allows you to avoid the expensive call to __dynamic_cast in the C++ runtime. As soon as you've made that call into the runtime, you lose.
If you use the final qualifier on both Dog and Cat (that is, on every class in your program that you know will never have child classes), then the compiler has enough information to know that dynamic_cast<Dog*>(animal) can be implemented as a simple pointer comparison; but sadly (as of 2017) neither GCC nor Clang implements such an optimization. You can do the optimization by hand, without introducing a get_type method, by using the C++ typeid operator:
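// Requires <typeinfo> and <type_traits>.
void action_option3(Animal* animal)
{
    // std::is_final<T>::value is the C++14 spelling; std::is_final_v needs C++17.
    static_assert(std::is_final<Dog>::value && std::is_final<Cat>::value, "");
    if (typeid(*animal) == typeid(Dog))
        static_cast<Dog*>(animal)->go_for_walk();
    else if (typeid(*animal) == typeid(Cat))
        static_cast<Cat*>(animal)->be_evil();
}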
Compiling with clang++ -std=c++14 -O3 -S should show you the benefits of the third approach here.
action_option1 starts off with
movq %rdi, %rbx
movq (%rbx), %rax
callq *(%rax)
cmpl $1, %eax
jne LBB0_1
movq __ZTI6Animal@GOTPCREL(%rip), %rsi
movq __ZTI3Dog@GOTPCREL(%rip), %rdx
xorl %ecx, %ecx
movq %rbx, %rdi
callq ___dynamic_cast
movq %rax, %rdi
addq $8, %rsp
popq %rbx
popq %rbp
jmp __ZNK3Dog11go_for_walkEv ## TAILCALL
action_option1_fixed improves it to
movq %rdi, %rbx
movq (%rbx), %rax
callq *(%rax)
cmpl $1, %eax
jne LBB2_1
movq %rbx, %rdi
addq $8, %rsp
popq %rbx
popq %rbp
jmp __ZNK3Dog11go_for_walkEv ## TAILCALL
(notice that in the fixed version, the call to __dynamic_cast is gone, replaced by just a little pointer math).
action_option2 is actually shorter than action_option1 because it doesn't add a virtual call on top of the __dynamic_cast, but it's still awful:
movq %rdi, %rbx
testq %rbx, %rbx
je LBB1_3
movq __ZTI6Animal@GOTPCREL(%rip), %rsi
movq __ZTI3Dog@GOTPCREL(%rip), %rdx
xorl %ecx, %ecx
movq %rbx, %rdi
callq ___dynamic_cast
testq %rax, %rax
je LBB1_2
movq %rax, %rdi
addq $8, %rsp
popq %rbx
popq %rbp
jmp __ZNK3Dog11go_for_walkEv ## TAILCALL
And here's action_option3. It's small enough that I can just paste the entire function definition here, instead of excerpting:
__Z14action_option3P6Animal:
testq %rdi, %rdi
je LBB3_4
movq (%rdi), %rax
movq -8(%rax), %rax
movq 8(%rax), %rax
cmpq __ZTS3Dog@GOTPCREL(%rip), %rax
je LBB3_5
cmpq __ZTS3Cat@GOTPCREL(%rip), %rax
je LBB3_6
retq
LBB3_5:
jmp __ZNK3Dog11go_for_walkEv ## TAILCALL
LBB3_6:
jmp __ZNK3Cat7be_evilEv ## TAILCALL
LBB3_4:
pushq %rbp
movq %rsp, %rbp
callq ___cxa_bad_typeid
The __cxa_bad_typeid cruft at the end is because it might be the case that animal == nullptr. You can eliminate that cruft by making your parameter of type Animal& instead of Animal*, so that the compiler knows it's non-null.
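A minimal sketch of that reference-taking variant follows. The name action_option3_ref is hypothetical, the virtual destructor is added only so that Animal is polymorphic on its own, and the member functions return strings instead of printing so the result is easy to check; none of that is from the original code.

```cpp
#include <string>
#include <type_traits>
#include <typeinfo>

struct Animal {
    virtual ~Animal() = default;  // makes Animal polymorphic, so typeid looks at the dynamic type
};

struct Dog final : Animal {
    std::string go_for_walk() const { return "Walking. Woof!"; }
};

struct Cat final : Animal {
    std::string be_evil() const { return "Being evil!"; }
};

// Hypothetical variant of action_option3 taking a reference.
// A reference cannot be null, so no null check is needed before typeid.
std::string action_option3_ref(Animal& animal)
{
    // Exact-type comparison only matches what dynamic_cast would do
    // when the target classes cannot have further derived classes.
    static_assert(std::is_final<Dog>::value && std::is_final<Cat>::value,
                  "Dog and Cat must be final");
    if (typeid(animal) == typeid(Dog))
        return static_cast<Dog&>(animal).go_for_walk();
    if (typeid(animal) == typeid(Cat))
        return static_cast<Cat&>(animal).be_evil();
    return "";  // some other Animal: nothing to do
}
```

Callers that hold an Animal* would write action_option3_ref(*ptr) after establishing that ptr is non-null. Whether the generated code actually drops the __cxa_bad_typeid path depends on the compiler and version, so it is worth checking the assembly again.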
I tried adding this line at the top of the function:
if (animal == nullptr) __builtin_unreachable();
but sadly, Clang's implementation of typeid didn't pick up on that hint.
Upvotes: 1
Reputation: 171127
It largely depends on how performance-critical your performance-critical code is. I've seen setups where even dynamic dispatch of virtual functions was too costly, so if you're in such territory, forget about dynamic_cast and hand-craft something.
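As one possible illustration of "hand-crafting", the type tag can be stored as a plain data member of the base class, so that discriminating needs neither RTTI nor a virtual call. This is only a sketch under assumptions: AnimalKind, kind, and action_tagged are hypothetical names, and the functions return strings rather than printing so the behaviour can be checked.

```cpp
#include <string>

enum class AnimalKind { Dog, Cat };

struct Animal {
    const AnimalKind kind;  // plain data member: reading it needs no vtable lookup
protected:
    explicit Animal(AnimalKind k) : kind(k) {}
    ~Animal() = default;    // non-virtual and protected: no deletion through Animal*
};

struct Dog : Animal {
    Dog() : Animal(AnimalKind::Dog) {}
    std::string go_for_walk() const { return "Walking. Woof!"; }
};

struct Cat : Animal {
    Cat() : Animal(AnimalKind::Cat) {}
    std::string be_evil() const { return "Being evil!"; }
};

// Dispatch on the stored tag. The static_cast is valid because each
// constructor sets the tag that matches its own class.
std::string action_tagged(const Animal* animal)
{
    switch (animal->kind) {
    case AnimalKind::Dog:
        return static_cast<const Dog*>(animal)->go_for_walk();
    case AnimalKind::Cat:
        return static_cast<const Cat*>(animal)->be_evil();
    }
    return "";  // unreachable for the two kinds defined above
}
```

The trade-off is that every new derived class needs a new enumerator and a new case in each switch, which is the maintenance cost discussed below.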
I will assume you're fine with a virtual call or two, though. You will probably want to steer clear of dynamic_cast, as that is usually much slower than a dynamic dispatch.
Right now, you have N classes derived from the common base and M points in code where you need to take a decision based on the concrete derived class. The question is: which of N, M is more likely to change in the future? Are you more likely to add new derived classes, or introduce new points where type-decision matters? This answer will determine the best design for you.
If you're going to add new classes, but the number of type-discriminating places is fixed (and ideally small as well), the enumeration approach would be the best choice. Just use a static_cast instead of a dynamic_cast; if you know the actual runtime type, you don't need to access RTTI to do the conversion for you (unless virtual bases and a deeper inheritance hierarchy are involved).
On the other hand, if the list of classes is fixed, but new type-discriminating operations are likely to be introduced (or if there are simply too many of them to maintain), consider the Visitor pattern instead. Give your Animal class a virtual visitor-accepting function:
// Inside struct Animal:
virtual void accept(AnimalVisitor &v) = 0;

// The visitor interface, with one overload per derived class:
struct AnimalVisitor
{
    virtual void visit(Dog &dog) = 0;
    virtual void visit(Cat &cat) = 0;
};
Then, each derived class will implement it:
void Dog::accept(AnimalVisitor &v)
{
    v.visit(*this);
}

void Cat::accept(AnimalVisitor &v)
{
    v.visit(*this);
}
And your operations will just use it:
void action(Animal *animal)
{
    struct Action : AnimalVisitor
    {
        void visit(Dog &d) override { d.go_for_walk(); }
        void visit(Cat &c) override { c.be_evil(); }
    };
    Action v;  // instantiate the concrete visitor; AnimalVisitor itself is abstract
    animal->accept(v);
}
If you're going to be adding both new derived classes and new operations, you can add non-abstract functions to the above visitor so that existing code which doesn't need to know about the new classes does not break:
struct AnimalVisitor
{
    virtual void visit(Dog &d) = 0;
    virtual void visit(Cat &c) = 0;
    virtual void visit(Parrot &p) {}
};
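Putting the pieces together, here is a self-contained sketch of the visitor with a defaulted Parrot overload. It assumes a Parrot class with no special behaviour, adds virtual destructors, and has the operation return a string instead of printing so the outcome can be checked; those details are illustrative and not part of the code above.

```cpp
#include <string>

struct Dog;
struct Cat;
struct Parrot;

struct AnimalVisitor {
    virtual ~AnimalVisitor() = default;
    virtual void visit(Dog& d) = 0;
    virtual void visit(Cat& c) = 0;
    virtual void visit(Parrot&) {}  // non-abstract: visitors written before Parrot still compile
};

struct Animal {
    virtual ~Animal() = default;
    virtual void accept(AnimalVisitor& v) = 0;
};

struct Dog : Animal {
    std::string go_for_walk() const { return "Walking. Woof!"; }
    void accept(AnimalVisitor& v) override { v.visit(*this); }
};

struct Cat : Animal {
    std::string be_evil() const { return "Being evil!"; }
    void accept(AnimalVisitor& v) override { v.visit(*this); }
};

struct Parrot : Animal {  // hypothetical class added later
    void accept(AnimalVisitor& v) override { v.visit(*this); }
};

// An operation written without any knowledge of Parrot.
std::string action(Animal& animal)
{
    struct Action : AnimalVisitor {
        std::string result;
        void visit(Dog& d) override { result = d.go_for_walk(); }
        void visit(Cat& c) override { result = c.be_evil(); }
    };
    Action v;
    animal.accept(v);  // two virtual calls: accept, then the matching visit
    return v.result;   // stays empty for a Parrot, which the default visit ignores
}
```

Some compilers may warn that Action hides the inherited visit(Parrot&) overload; that is harmless here because the call is made through AnimalVisitor&.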
Upvotes: 2
Reputation: 9416
I want to quote the accepted answer to the question you are citing:
In modern C++ there is no point.
For your example, the easiest solution is to use dynamic dispatch:
#include <iostream>

struct Animal {
    virtual ~Animal() = default;  // needed so that delete through Animal* is well-defined
    virtual void action() = 0;
};
struct Dog : Animal {
    void action() override { std::cout << "Walking. Woof!" << std::endl; }
};
struct Cat : Animal {
    void action() override { std::cout << "Being evil!" << std::endl; }
};
int main()
{
    Animal* a[2] = {new Cat(), new Dog()};
    a[0]->action();
    a[1]->action();
    delete a[0];
    delete a[1];
    return 0;
}
For more complex scenarios, you may consider design patterns such as Strategy, Template Method or Visitor.
If this really is a performance bottleneck, it may help to declare Dog and Cat as final.
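A short sketch of what that could look like, with the caveat that final only gives the compiler permission to skip the virtual lookup when the static type is already Dog or Cat; whether it does so depends on the compiler and optimization level. The helper names act_on_any and act_on_dog are hypothetical, and action() returns a string here only to make the result checkable.

```cpp
#include <string>

struct Animal {
    virtual ~Animal() = default;
    virtual std::string action() const = 0;
};

struct Dog final : Animal {
    std::string action() const override { return "Walking. Woof!"; }
};

struct Cat final : Animal {
    std::string action() const override { return "Being evil!"; }
};

// Through Animal&, the concrete type is unknown, so this is a regular virtual call.
std::string act_on_any(const Animal& a) { return a.action(); }

// Through Dog&, nothing can derive from Dog, so the compiler is allowed
// to call Dog::action directly (and possibly inline it).
std::string act_on_dog(const Dog& d) { return d.action(); }
```

The benefit therefore only appears at call sites that already know the concrete type; calls through Animal* are dispatched as before.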
Upvotes: 2