jinge

Reputation: 865

Does vftable[0] store the first virtual function or the RTTI Complete Object Locator?

C++ uses a vftable to decide dynamically which virtual function to call. I want to understand the mechanism behind a virtual function call, so I compiled the following code to assembly.

using namespace std;

class Animal {
    int age;
public:
    virtual void speak() {}
    virtual void wash() {}
};

class Cat : public Animal {
public:
    virtual void speak() {}
    virtual void wash() {}
};

void main()
{
    Animal* animal = new Cat;
    animal->speak();
    animal->wash();
}

The assembly code is massive. I don't quite understand the following part.

CONST   SEGMENT
??_7Cat@@6B@ DD FLAT:??_R4Cat@@6B@          ; Cat::`vftable'
    DD  FLAT:?speak@Cat@@UAEXXZ
    DD  FLAT:?wash@Cat@@UAEXXZ
CONST   ENDS

This part defines the vftable of Cat, but it has three entries: the RTTI Complete Object Locator, then Cat::speak, then Cat::wash. So I would expect vftable[0] to be the RTTI Complete Object Locator. However, when I check the assembly for main and Cat::Cat, the call to animal->speak() goes through the entry at byte offset 0 ([edx]), and the call to animal->wash() goes through the entry at byte offset 4 ([edx+4]). Why not offsets 4 and 8?

The assembly code for main and Cat::Cat is shown below.

_TEXT   SEGMENT
tv75 = -12                      ; size = 4
$T1 = -8                        ; size = 4
_animal$ = -4                       ; size = 4
_main   PROC

; 23   : {

    push    ebp
    mov ebp, esp
    sub esp, 12                 ; 0000000cH

; 24   :    Animal* animal = new Cat;

    push    8
    call    ??2@YAPAXI@Z                ; operator new
    add esp, 4
    mov DWORD PTR $T1[ebp], eax
    cmp DWORD PTR $T1[ebp], 0
    je  SHORT $LN3@main
    mov ecx, DWORD PTR $T1[ebp]
    call    ??0Cat@@QAE@XZ
    mov DWORD PTR tv75[ebp], eax
    jmp SHORT $LN4@main
$LN3@main:
    mov DWORD PTR tv75[ebp], 0
$LN4@main:
    mov eax, DWORD PTR tv75[ebp]
    mov DWORD PTR _animal$[ebp], eax

; 25   :    animal->speak();

    mov ecx, DWORD PTR _animal$[ebp]
    mov edx, DWORD PTR [ecx]
    mov ecx, DWORD PTR _animal$[ebp]
    mov eax, DWORD PTR [edx]
    call    eax

; 26   :    animal->wash();

    mov ecx, DWORD PTR _animal$[ebp]
    mov edx, DWORD PTR [ecx]
    mov ecx, DWORD PTR _animal$[ebp]
    mov eax, DWORD PTR [edx+4]
    call    eax

; 27   : }

    xor eax, eax
    mov esp, ebp
    pop ebp
    ret 0
_main   ENDP
_TEXT   ENDS

; Function compile flags: /Odtp
;   COMDAT ??0Cat@@QAE@XZ
_TEXT   SEGMENT
_this$ = -4                     ; size = 4
??0Cat@@QAE@XZ PROC                 ; Cat::Cat, COMDAT
; _this$ = ecx
    push    ebp
    mov ebp, esp
    push    ecx
    mov DWORD PTR _this$[ebp], ecx
    mov ecx, DWORD PTR _this$[ebp]
    call    ??0Animal@@QAE@XZ
    mov eax, DWORD PTR _this$[ebp]
    mov DWORD PTR [eax], OFFSET ??_7Cat@@6B@
    mov eax, DWORD PTR _this$[ebp]
    mov esp, ebp
    pop ebp
    ret 0
??0Cat@@QAE@XZ ENDP                 ; Cat::Cat
_TEXT   ENDS

Supplement: MSVC Compiler x86 19.00.23026

Upvotes: 5

Views: 2532

Answers (1)

Ross Ridge

Reputation: 39641

The layout of vtables is implementation-dependent. In your particular case, when compiling your example code, the Microsoft C++ compiler generates a vtable for Cat where the speak virtual function is at offset 0 and the wash function is at offset 4. The RTTI information is located before these functions, at offset -4.

The problem here is that Microsoft's assembly output is lying. The generated assembly listing puts the RTTI information at offset 0 and the speak and wash functions at offsets 4 and 8. However, this is not how the table is actually laid out in the object file the compiler generates. Disassembling the object file reveals this layout:

                .new_section .rdata, "dr2"
0000    00 00 00 00                                     .long   ??_R4Cat@@6B@
0004                          ??_7Cat@@6B@:
0004    00 00 00 00                                     .long   ?speak@Cat@@UAEXXZ
0008    00 00 00 00                                     .long   ?wash@Cat@@UAEXXZ

Unfortunately, the assembly output of Microsoft's C/C++ compiler is meant only to be informational. It's not an accurate and complete representation of the actual code the compiler generates. In particular, it can't be reliably assembled into a working object file.

Upvotes: 5
