Chiel

Reputation: 6194

Implement an identifier for a class or use dynamic_cast

My question relates to "What's the point of IsA() in C++?". I have performance-critical code that, at one spot, has to call functions specific to derived classes while only a base pointer is available. What is the best way to check which derived class we have? I have coded up two options; the second one would let me eliminate the Animal_type enum and the get_type() function.

#include <iostream>

enum Animal_type { Dog_type, Cat_type };

struct Animal
{
    virtual Animal_type get_type() const = 0;
};

struct Dog : Animal
{
    void go_for_walk() const { std::cout << "Walking. Woof!" << std::endl; }
    Animal_type get_type() const { return Dog_type; }
};

struct Cat : Animal
{
    void be_evil() const { std::cout << "Being evil!" << std::endl; }
    Animal_type get_type() const { return Cat_type; }
};

void action_option1(Animal* animal)
{
    if (animal->get_type() == Dog_type)
        dynamic_cast<Dog*>(animal)->go_for_walk();
    else if (animal->get_type() == Cat_type)
        dynamic_cast<Cat*>(animal)->be_evil();
    else
        return;
}

void action_option2(Animal* animal)
{
    Dog* dog = dynamic_cast<Dog*>(animal);
    if (dog)
    {
        dog->go_for_walk();
        return;
    }

    Cat* cat = dynamic_cast<Cat*>(animal);
    if (cat)
    {
        cat->be_evil();
        return;
    }

    return;
}

int main()
{
    Animal* cat = new Cat();
    Animal* dog = new Dog();

    action_option1(cat);
    action_option2(cat);

    action_option1(dog);
    action_option2(dog);

    return 0;
}

Upvotes: 1

Views: 382

Answers (3)

Quuxplusone

Reputation: 27210

Your first option will be faster, but only if you fix the erroneous dynamic_cast (it should be static_cast):

void action_option1_fixed(Animal* animal)
{
    if (animal->get_type() == Dog_type)
        static_cast<Dog*>(animal)->go_for_walk();
    else if (animal->get_type() == Cat_type)
        static_cast<Cat*>(animal)->be_evil();
}

The point of using manual dispatch on get_type() here is that it allows you to avoid the expensive call to __dynamic_cast in the C++ runtime. As soon as you've made that call into the runtime, you lose.

If you use the final specifier on both Dog and Cat (that is, on every class in your program that you know will never have child classes), then the compiler has enough information to know that

dynamic_cast<Dog*>(animal)

can be implemented as a simple pointer comparison; but sadly (as of 2017) neither GCC nor Clang implements such an optimization. You can do the optimization by hand, without introducing a get_type method, by using the C++ typeid operator:

void action_option3(Animal* animal)
{
    // std::is_final is available in C++14; the _v shorthand needs C++17.
    static_assert(std::is_final<Dog>::value && std::is_final<Cat>::value, "");
    if (typeid(*animal) == typeid(Dog))
        static_cast<Dog*>(animal)->go_for_walk();
    else if (typeid(*animal) == typeid(Cat))
        static_cast<Cat*>(animal)->be_evil();
}

Compiling with clang++ -std=c++14 -O3 -S should show you the benefits of the third approach here.

action_option1 starts off with

    movq    %rdi, %rbx
    movq    (%rbx), %rax
    callq   *(%rax)
    cmpl    $1, %eax
    jne     LBB0_1
    movq    __ZTI6Animal@GOTPCREL(%rip), %rsi
    movq    __ZTI3Dog@GOTPCREL(%rip), %rdx
    xorl    %ecx, %ecx
    movq    %rbx, %rdi
    callq   ___dynamic_cast
    movq    %rax, %rdi
    addq    $8, %rsp
    popq    %rbx
    popq    %rbp
    jmp     __ZNK3Dog11go_for_walkEv ## TAILCALL

action_option1_fixed improves it to

    movq    %rdi, %rbx
    movq    (%rbx), %rax
    callq   *(%rax)
    cmpl    $1, %eax
    jne     LBB2_1
    movq    %rbx, %rdi
    addq    $8, %rsp
    popq    %rbx
    popq    %rbp
    jmp     __ZNK3Dog11go_for_walkEv ## TAILCALL

(notice that in the fixed version, the call to __dynamic_cast is gone, replaced by just a little pointer math).

action_option2 is actually shorter than action_option1 because it doesn't add a virtual call on top of the __dynamic_cast, but it's still awful:

    movq    %rdi, %rbx
    testq   %rbx, %rbx
    je      LBB1_3
    movq    __ZTI6Animal@GOTPCREL(%rip), %rsi
    movq    __ZTI3Dog@GOTPCREL(%rip), %rdx
    xorl    %ecx, %ecx
    movq    %rbx, %rdi
    callq   ___dynamic_cast
    testq   %rax, %rax
    je      LBB1_2
    movq    %rax, %rdi
    addq    $8, %rsp
    popq    %rbx
    popq    %rbp
    jmp     __ZNK3Dog11go_for_walkEv ## TAILCALL

And here's action_option3. It's small enough that I can just paste the entire function definition here, instead of excerpting:

__Z14action_option3P6Animal:
    testq   %rdi, %rdi
    je      LBB3_4
    movq    (%rdi), %rax
    movq    -8(%rax), %rax
    movq    8(%rax), %rax
    cmpq    __ZTS3Dog@GOTPCREL(%rip), %rax
    je      LBB3_5
    cmpq    __ZTS3Cat@GOTPCREL(%rip), %rax
    je      LBB3_6
    retq
LBB3_5:
    jmp     __ZNK3Dog11go_for_walkEv ## TAILCALL
LBB3_6:
    jmp     __ZNK3Cat7be_evilEv     ## TAILCALL
LBB3_4:
    pushq   %rbp
    movq    %rsp, %rbp
    callq   ___cxa_bad_typeid

The __cxa_bad_typeid cruft at the end is because it might be the case that animal == nullptr. You can eliminate that cruft by making your parameter of type Animal& instead of Animal*, so that the compiler knows it's non-null.
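As a minimal sketch of that suggestion, here is a reference-taking variant. The name action_option3_ref is hypothetical, and two things differ from the code above purely so the result can be checked: Dog and Cat are declared final here, and their member functions return the message instead of printing it. Whether the null-check path actually disappears depends on your compiler and flags.

```cpp
#include <string>
#include <type_traits>
#include <typeinfo>

struct Animal
{
    virtual ~Animal() = default;
};

// Assumption: both classes are final, so an exact typeid match is sufficient.
struct Dog final : Animal
{
    std::string go_for_walk() const { return "Walking. Woof!"; }
};

struct Cat final : Animal
{
    std::string be_evil() const { return "Being evil!"; }
};

// Hypothetical variant of action_option3 taking a reference.
// A reference cannot be null, so typeid has no bad_typeid case to handle.
std::string action_option3_ref(const Animal& animal)
{
    static_assert(std::is_final<Dog>::value && std::is_final<Cat>::value,
                  "exact-type comparison assumes final classes");
    if (typeid(animal) == typeid(Dog))
        return static_cast<const Dog&>(animal).go_for_walk();
    if (typeid(animal) == typeid(Cat))
        return static_cast<const Cat&>(animal).be_evil();
    return std::string();  // some other Animal: nothing to do
}
```

A caller that holds an Animal* would check for null once and then pass *animal.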

I tried adding this line at the top of the function:

if (animal == nullptr) __builtin_unreachable();

but sadly, Clang's implementation of typeid didn't pick up on that hint.

Upvotes: 1

It largely depends on how performance-critical your performance-critical code is. I've seen setups where even dynamic dispatch of virtual functions was too costly, so if you're in such territory, forget about dynamic_cast and hand-craft something.

I will assume you're fine with a virtual call or two, though. You will probably want to steer clear of dynamic_cast, as that is usually much slower than a dynamic dispatch.

Right now, you have N classes derived from the common base and M points in code where you need to take a decision based on the concrete derived class. The question is: which of N, M is more likely to change in the future? Are you more likely to add new derived classes, or introduce new points where type-decision matters? This answer will determine the best design for you.

If you're going to add new classes, but the number of type-discriminating places is fixed (and ideally small as well), the enumeration approach would be the best choice. Just use a static_cast instead of a dynamic_cast; if you know the actual runtime type, you don't need to access RTTI to do the conversion for you (unless virtual bases and a deeper inheritance hierarchy are involved).

On the other hand, if the list of classes is fixed, but new type-discriminating operations are likely to be introduced (or if there are simply too many of them to maintain), consider the Visitor pattern instead. Give your Animal class a virtual visitor-accepting function:

// Member of Animal (AnimalVisitor, Dog and Cat need forward declarations before Animal):
virtual void accept(AnimalVisitor &v) = 0;

struct AnimalVisitor
{
  virtual void visit(Dog &dog) = 0;
  virtual void visit(Cat &cat) = 0;
};

Then, each derived class will implement it:

void Dog::accept(AnimalVisitor &v)
{
  v.visit(*this);
}

void Cat::accept(AnimalVisitor &v)
{
  v.visit(*this);
}

And your operations will just use it:

void action(Animal *animal)
{
  struct Action : AnimalVisitor
  {
    void visit(Dog &d) override { d.go_for_walk(); }
    void visit(Cat &c) override { c.be_evil(); }
  };

  Action v;  // AnimalVisitor is abstract; instantiate the concrete visitor

  animal->accept(v);
}
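Put together, the fragments above might look like the following self-contained sketch. The forward declarations, the virtual destructors and the result member are additions for illustration, and the member functions return their message instead of printing it so the outcome can be checked.

```cpp
#include <string>

struct Dog;
struct Cat;

struct AnimalVisitor
{
    virtual ~AnimalVisitor() = default;
    virtual void visit(Dog& dog) = 0;
    virtual void visit(Cat& cat) = 0;
};

struct Animal
{
    virtual ~Animal() = default;
    virtual void accept(AnimalVisitor& v) = 0;
};

struct Dog : Animal
{
    std::string go_for_walk() const { return "Walking. Woof!"; }
    // *this has static type Dog here, so visit(Dog&) is selected.
    void accept(AnimalVisitor& v) override { v.visit(*this); }
};

struct Cat : Animal
{
    std::string be_evil() const { return "Being evil!"; }
    void accept(AnimalVisitor& v) override { v.visit(*this); }
};

// One type-discriminating operation, written as a local visitor.
std::string action(Animal& animal)
{
    struct Action : AnimalVisitor
    {
        std::string result;  // hypothetical member carrying the outcome back
        void visit(Dog& d) override { result = d.go_for_walk(); }
        void visit(Cat& c) override { result = c.be_evil(); }
    };

    Action v;
    animal.accept(v);  // two virtual calls: accept, then visit
    return v.result;
}
```

Each call to action costs two virtual dispatches and no RTTI lookup.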

If you're going to be adding both new derived classes and new operations, you can add non-abstract functions to the above visitor so that existing code which doesn't need to know about the new classes does not break:

struct AnimalVisitor
{
  virtual void visit(Dog &d) = 0;
  virtual void visit(Cat &c) = 0;
  virtual void visit(Parrot &p) {}
};
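To illustrate the effect of the non-abstract overload, here is a small sketch in which a visitor written before Parrot existed keeps compiling and simply ignores parrots. NameVisitor and describe are hypothetical names, not part of the code above.

```cpp
#include <string>

struct Dog;
struct Cat;
struct Parrot;

struct AnimalVisitor
{
    virtual ~AnimalVisitor() = default;
    virtual void visit(Dog&) = 0;
    virtual void visit(Cat&) = 0;
    virtual void visit(Parrot&) {}  // default no-op for the newly added class
};

struct Animal
{
    virtual ~Animal() = default;
    virtual void accept(AnimalVisitor& v) = 0;
};

struct Dog : Animal
{
    void accept(AnimalVisitor& v) override { v.visit(*this); }
};

struct Cat : Animal
{
    void accept(AnimalVisitor& v) override { v.visit(*this); }
};

struct Parrot : Animal
{
    void accept(AnimalVisitor& v) override { v.visit(*this); }
};

// Hypothetical visitor written before Parrot was added: it overrides only
// the Dog and Cat overloads and inherits the no-op for Parrot.
struct NameVisitor : AnimalVisitor
{
    std::string name = "unhandled";
    void visit(Dog&) override { name = "dog"; }
    void visit(Cat&) override { name = "cat"; }
};

std::string describe(Animal& animal)
{
    NameVisitor v;
    animal.accept(v);
    return v.name;
}
```

The trade-off is that the compiler no longer forces every visitor to handle Parrot, so a forgotten case is silent.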

Upvotes: 2

Jens

Reputation: 9416

I want to quote the accepted answer to the question you are citing:

In modern C++ there is no point.

For your example, the easiest solution is to use dynamic dispatch:

#include <iostream>

struct Animal {
    virtual ~Animal() = default;  // required to delete derived objects through Animal*
    virtual void action() = 0;
};

struct Dog : Animal {
    void action() override { std::cout << "Walking. Woof!" << std::endl; }
};

struct Cat : Animal {
    void action() override { std::cout << "Being evil!" << std::endl; }
};

int main()
{
    Animal* a[2] = {new Cat(), new Dog()};
    a[0]->action();
    a[1]->action();
    delete a[0];
    delete a[1];
    return 0;
}

For more complex scenarios, you may consider design patterns such as Strategy, Template Method or Visitor.

If this really is a performance bottleneck, it may help to declare Dog and Cat as final.
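A minimal sketch of what that could look like follows. The walk function is a hypothetical caller, and action returns its message here instead of printing it. Note that final can only help where the static type is already the final class; a call through Animal* still goes through the vtable, and whether the compiler devirtualizes at all depends on the compiler and optimization level.

```cpp
#include <string>
#include <type_traits>

struct Animal
{
    virtual ~Animal() = default;
    virtual std::string action() const = 0;
};

struct Dog final : Animal
{
    std::string action() const override { return "Walking. Woof!"; }
};

struct Cat final : Animal
{
    std::string action() const override { return "Being evil!"; }
};

// Hypothetical caller: the static type is Dog and Dog is final, so no
// further override can exist and the compiler may call Dog::action directly.
std::string walk(const Dog& dog)
{
    return dog.action();
}
```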

Upvotes: 2
