Optimization of virtual table lookups

Question

Given code like the following, can a compiler tell that a in fact points to an instance of B and optimize away the virtual table lookup?

#include <iostream>

class A
{
  public:
    virtual void f()
    {
        std::cout << "A::f()" << std::endl;
    }
};

class B : public A
{
  public:
    void f()
    {
        std::cout << "B::f()" << std::endl;
    }
};

int main()
{
    B b;
    A* a = &b;
    a->f();

    return 0;
}

Additional question after the answers of Jonathan Seng and reima: If GCC is used, would any flags be necessary to make it optimize away the vtable lookup?

reima · Accepted Answer

Clang can easily make this optimization, and even inlines the function call. This can be seen from the generated assembly:

Dump of assembler code for function main():
   0x0000000000400500 <+0>: push   %rbp
   0x0000000000400501 <+1>: mov    %rsp,%rbp
   0x0000000000400504 <+4>: mov    $0x40060c,%edi
   0x0000000000400509 <+9>: xor    %al,%al
   0x000000000040050b <+11>:  callq  0x4003f0 
   0x0000000000400510 <+16>:  xor    %eax,%eax
   0x0000000000400512 <+18>:  pop    %rbp
   0x0000000000400513 <+19>:  retq

I took the liberty of replacing std::cout << … by equivalent calls to printf, as this greatly reduces the clutter in the disassembly.
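For readers who want to reproduce the disassembly, the printf-based variant might look like the sketch below. The exact source that was compiled is not shown here, so the format strings are an assumption, and call_through_base is a hypothetical helper that stands in for the body of main() from the question.

```cpp
#include <cstdio>

class A
{
  public:
    virtual void f()
    {
        std::printf("A::f()\n");
    }
};

class B : public A
{
  public:
    void f()
    {
        std::printf("B::f()\n");
    }
};

// Hypothetical helper mirroring main() from the question.
// Returns 0, as the original main() does.
int call_through_base()
{
    B b;
    A* a = &b;
    a->f();  // static type is A*, dynamic type is B: a candidate for devirtualization
    return 0;
}
```

Calling call_through_base() from main() and compiling with optimization enabled should give a compiler the same opportunity as the original program, although the generated code will vary with compiler version and flags.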

GCC 4.6 can also deduce that no vtable lookup is needed, but does not inline:

Dump of assembler code for function main():
   0x0000000000400560 <+0>: sub    $0x18,%rsp
   0x0000000000400564 <+4>: mov    %rsp,%rdi
   0x0000000000400567 <+7>: movq   $0x4007c0,(%rsp)
   0x000000000040056f <+15>:  callq  0x400680 
   0x0000000000400574 <+20>:  xor    %eax,%eax
   0x0000000000400576 <+22>:  add    $0x18,%rsp
   0x000000000040057a <+26>:  retq
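At the source level, skipping the vtable lookup amounts to treating the call as if it had been written with a qualified name, which C++ always binds statically. The sketch below is only an illustration and is not derived from either compiler's output. The function names are hypothetical, and f() returns a std::string here (unlike the question's version) so that the results can be compared.

```cpp
#include <string>

class A
{
  public:
    virtual std::string f() { return "A::f()"; }
};

class B : public A
{
  public:
    std::string f() { return "B::f()"; }
};

// Ordinary virtual call: conceptually dispatched through the vtable.
std::string call_virtual()
{
    B b;
    A* a = &b;
    return a->f();
}

// What the devirtualized call is equivalent to: a qualified call
// binds statically to B::f and never consults the vtable.
std::string call_direct()
{
    B b;
    return b.B::f();
}

// A qualified call ignores the dynamic type entirely, so this one
// runs A::f even though a points to a B.
std::string call_base_direct()
{
    B b;
    A* a = &b;
    return a->A::f();
}
```

The first two functions return the same string, which is why a compiler that can prove the dynamic type of *a is B may replace one with the other.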
