Reputation: 541

C++ Will virtual functions still work if I upcast a derived object to its base type

Consider the following:

B inherits from A and overrides the print function.
A has a static function which takes a void*, casts it to A and calls the virtual print function.

If the void* was originally a B will it call A::print or B::print?

#include <iostream>

class A {
public:
        static void w(void *p) {

                A *a = reinterpret_cast<A*>(p);
                a->print();
        }

        virtual void print() {
           std::cout << "A"  << std::endl;
        }
};

class B : public A {

public:
        void print() {
           std::cout << "B"  << std::endl;
        }
};

int main () {
        B b;
        A::w(&b);
}

This prints B for me.

It seems that the void*, once cast to an A*, still knows about B's overridden print function. The reason why is not immediately clear to me.

Can someone explain whether this is behavior I can rely on, or whether it is just a fluke that works because the example is small (like how returning a reference to a local variable won't always segfault in small programs)?

Upvotes: 12

Answers (4)

yacc143

Reputation: 385

reinterpret_cast cannot work in general, if you consider how multiple inheritance is implemented in C++.

Basically, when casting between related types under multiple inheritance, a cast might involve adding an offset to the pointer. Hence, without knowing both the source and the destination type, the compiler cannot emit the correct instructions.

So reinterpret_cast cannot do the right thing for at least this use case, which is why the result is left undefined.

The dangerous part here, even if you do not use multiple inheritance, is that modern compilers have started to interpret this "undefined" behaviour as permission to optimize away the offending code, its containing block and so on. That is almost certainly valid by the C++ standard (undefined means exactly that: anything is fine), but it can be a surprise to developers, who tend to understand "undefined" as "the generated code might not work in general".

Upvotes: 0

Pixelchemist

Reputation: 24956

First of all, your reinterpret_cast is undefined here, because the void* was created from a B* and is read back as an A*. If you pass an A* to w, it will be defined.

A * p = new B;
A::w(p);
delete p;

I suggest using static_cast<A*>(p) instead, if w is always called with an A*.

If you have a defined cast to void* and back, the memory address stays the same. So the a inside your w will be a valid A* if you pass a valid A* to w in the first place.

The question of why the program knows how to handle the call is related to a mechanism called the "virtual table" (vtable).

Note: This may be different for different compilers. I'll talk about how Visual Studio seems to handle it.

To give you a somewhat rough idea for simple inheritance:

The compiler will compile two print functions in your code: A::print (e.g. at address X) and B::print (e.g. at address Y).

The real memory footprint of a class containing a virtual function, e.g.

struct A
{
  virtual void print (void);
  size_t x;
};
struct B : A
{
  void print (void);
  size_t y;
};

will be somewhat like

struct Real_A
{
  void * vtable;
  size_t x;
};
struct Real_B : Real_A
{
  size_t y;
};

Furthermore, there will be two so-called virtual tables, one for each class that contains virtual functions or has a base class with virtual functions.

You can think of the vtable as a structure holding the "real" address for each function.

Upon compilation the compiler will create vtables for each class (A and B). Each instance of A will have vtable = <Address of A Vtable>, while each instance of B will have vtable = <Address of B Vtable>.

At runtime, when a virtual function is called, the program looks up the "real" address of the function in the vtable. The address of that vtable is stored as the first element of each object of type A or B.

The following code is non-standard and not sane, but it may give you an idea.

#include <iostream>
struct A 
{
  virtual void print (void) { std::cout << "A called." << std::endl; }
  size_t x;
};

struct B : A 
{
  void print (void) { std::cout << "B called." << std::endl; }
};
// "Real" memory layout of A
struct Real_A
{
  void * vtable;
  size_t x_value;
};
// "Real" memory layout of B
struct Real_B : Real_A
{
  size_t y_value;
};
// "Pseudo virtual table structure for classes with 1 virtual function"
struct VT
{
  void * func_addr;
};

int main (void) 
{
  A * pa = new A;
  pa->x = 15;
  B * pb = new B;
  pb->x = 20;
  A * pa_b = new B;
  pa_b->x = 25;
  // reinterpret address of A and B objects as Real_A and Real_B
  Real_A& ra(*(Real_A*)pa);
  Real_B& rb(*(Real_B*)pb);
  // reinterpret address of B object through pointer to A as Real_B
  Real_B& rb_a(*(Real_B*)pa_b);
  // Print x_values to know whether we meet the class layout
  std::cout << "Value of ra.x_value = " << ra.x_value << std::endl;
  std::cout << "Value of rb.x_value = " << rb.x_value << std::endl;
  std::cout << "Value of rb.x_value = " << rb_a.x_value << std::endl;
  // Print vtable addresses
  std::cout << "VT of A through A*: " << ra.vtable << std::endl;
  std::cout << "VT of B through B*: " << rb.vtable << std::endl;
  std::cout << "VT of B through A*: " << rb_a.vtable << std::endl;
  // Reinterpret memory pointed to by the vtable address as VT objects
  VT& va(*(VT*)ra.vtable);
  VT& vb(*(VT*)rb.vtable);
  VT& vb_a(*(VT*)rb_a.vtable);
  // Print addresses of functions in the vtable
  std::cout << "FA of A through A*: " << va.func_addr << std::endl;
  std::cout << "FA of B through B*: " << vb.func_addr << std::endl;
  std::cout << "FA of B through A*: " << vb_a.func_addr << std::endl;

  delete pa;
  delete pb;
  delete pa_b;

  return 0;
}

Visual Studio 2013 output:

Value of ra.x_value = 15
Value of rb.x_value = 20
Value of rb.x_value = 25
VT of A through A*: 00D9DC80
VT of B through B*: 00D9DCA0
VT of B through A*: 00D9DCA0
FA of A through A*: 00D914B0
FA of B through B*: 00D914AB
FA of B through A*: 00D914AB

gcc-4.8.1 output:

Value of ra.x_value = 15
Value of rb.x_value = 20
Value of rb.x_value = 25
VT of A through A*: 0x8048f38
VT of B through B*: 0x8048f48
VT of B through A*: 0x8048f48
FA of A through A*: 0x8048d40
FA of B through B*: 0x8048cc0
FA of B through A*: 0x8048cc0

https://ideone.com/iKyBB3

Note: No matter whether you access a B object through A* or B*, you'll find the same vtable address first and you'll find the same address contained in the vtable as well.

Upvotes: 2

billz

Reputation: 45420

Your code has undefined behavior.

§ 5.2.10 Reinterpret cast

7 Converting a prvalue of type “pointer to T1” to the type “pointer to T2” (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value. The result of any other such pointer conversion is unspecified.

Upvotes: 8

king_nak

Reputation: 11513

Virtual functions are usually resolved through an implicit vtable. This is basically an array of function pointers, one for every virtual function in the class hierarchy. The compiler adds a pointer to it as a "hidden member" of your class. When a virtual function is called, the corresponding entry in the vtable is called.

Now, when you create an object of type B, it implicitly has a pointer to the B vtable stored in the object. Casts do not affect this pointer.

Hence, when you cast your void * to A*, the original vtable pointer (of class B) is still present, and it leads to B::print.

Note that the vtable is an implementation detail, and the standard does not guarantee anything about it. But most compilers act like this.

Upvotes: 5
