Virtual tables and virtual pointers for multiple virtual inheritance and type casting

Artium picture Artium · Jul 24, 2010 · Viewed 11.9k times · Source

I am little confused about vptr and representation of objects in the memory, and hope you can help me understand the matter better.

  1. Consider B inherits from A and both define virtual functions f(). From what I learned the representation of an object of class B in the memory looks like this:[ vptr | A | B ] and the vtbl that vptr points to contains B::f(). I also understood that casting the object from B to A does nothing except ignoring the B part at the end of the object. Is it true? Doesn't this behavior is wrong? We want that object of type A to execute A::f() method and not B::f().

  2. Are there a number of vtables in the system as the number of classes?

  3. How will a vtable of class that inherits from two or more classes look like? How will the object of C be represented in the memory?

  4. Same as question 3 but with virtual inheritance.

Answer

P Shved picture P Shved · Jul 24, 2010

The following is true for GCC (and it seems true for LLVM link), but may also be true for the compiler you're using. All these is implementation-dependent, and is not governed by C++ standard. However, GCC write its own binary standard document, Itanium ABI.

I tried to explain basic concepts of how virtual tables are laid out in more simple words as a part of my article about virtual function performance in C++, which you may find useful. Here are answers to your questions:

  1. A more correct way to depict internal representation of the object is:

    | vptr | ======= | ======= |  <-- your object
           |----A----|         |
           |---------B---------|
    

    B contains its base class A, it just adds a couple of his own members after its end.

    Casting from B* to A* indeed does nothing, it returns the same pointer, and vptr remains the same. But, in a nutshell, virtual functions are not always called via vtable. Sometimes they're called just like the other functions.

    Here's more detailed explanation. You should distinguish two ways of calling member function:

    A a, *aptr;
    a.func();         // the call to A::func() is precompiled!
    aptr->A::func();  // ditto
    aptr->func();     // calls virtual function through vtable.
                      // It may be a call to A::func() or B::func().
    

    The thing is that it's known at compile time how the function will be called: via vtable or just will be a usual call. And the thing is that the type of a casting expression is known at compile time, and therefore the compiler chooses the right function at compile time.

    B b, *bptr;          
    static_cast<A>(b)::func(); //calls A::func, because the type
       // of static_cast<A>(b) is A!
    

    It doesn't even look inside vtable in this case!

  2. Generally, no. A class can have several vtables if it inherits from several bases, each having its own vtable. Such set of virtual tables forms a "virtual table group" (see pt. 3).

    Class also needs a set of construction vtables, to correctly distpatch virtual functions when constructing bases of a complex object. You can read further in the standard I linked.

  3. Here's an example. Assume C inherits from A and B, each class defining virtual void func(), as well as a,b or c virtual function relevant to its name.

    The C will have a vtable group of two vtables. It will share one vtable with A (the vtable where the own functions of the current class go is called "primary"), and a vtable for B will be appended:

    | C::func()   |   a()  |  c()  || C::func()  |   b()   |
    |---- vtable for A ----|        |---- vtable for B ----| 
    |--- "primary virtual table" --||- "secondary vtable" -|
    |-------------- virtual table group for C -------------|
    

    The representation of object in memory will look nearly the same way its vtable looks like. Just add a vptr before every vtable in a group, and you'll have a rough estimate how the data are laid out inside the object. You may read about it in the relevant section of the GCC binary standard.

  4. Virtual bases (some of them) are laid out at the end of vtable group. This is done because each class should have only one virtual base, and if they were mingled with "usual" vtables, then compiler couldn't re-use parts of constructed vtables to making those of derived classes. This would lead to computing unnecessary offsets and would decrease performance.

    Due to such a placement, virtual bases also introduce into their vtables additional elements: vcall offset (to get address of a final overrider when jumping from the pointer to a virtual base inside a complete object to the beginning of the class that overrides the virtual function) for each virtual function defined there. Also each virtual base adds vbase offsets, which are inserted into vtable of the derived class; they allow to find where the data of the virtual base begin (it can't be precompiled since the actual address depends on the hierarchy: virtual bases are at the end of object, and the shift from beginning varies depending on how many non-virtual classes the current class inherits.).

Woof, I hope I didn't introduce much unnecessary complexity. In any case, you may refer to the original standard, or to any document of your own compiler.