JiaHao Xu

Reputation: 2748

How does llvm know whether a member function pointer points to a virtual function?

After writing some code that uses member function pointers and reading the "Member function pointers" section of the Itanium C++ ABI, I understand the layout of a member function pointer in llvm.

What still puzzles me is how the address of the function is obtained from the member function pointer. I found no way to tell whether a member function pointer mfptr points to a virtual member function or to a normal (non-virtual) member function.

Upvotes: 2

Views: 798

Answers (2)

Gabriella Giordano

Reputation: 1238

I'm not sure that this is exactly what llvm does, but I've found that, for the Itanium ABI, the difference between a pointer to a normal member function and a pointer to a virtual function generally lies in the structure of the record itself, as explained here:

The form of a virtual function pointer is specified by the processor-specific C++ ABI for the implementation. In the specific case of 64-bit Itanium shared library builds, a virtual function pointer entry contains a pair of components (each 64 bits): the value of the target GP value and the actual function address. That is, rather than being a normal function pointer, which points to such a two-component descriptor, a virtual function pointer entry is the descriptor.

That is, a normal function pointer is an address, while a virtual function pointer is a descriptor made up of the global pointer (GP) value and the actual address of the virtual function override. I guess that the size of the record and some sort of decoration (I think it is the '1' mentioned in the link you pointed to) make it possible to distinguish one type of pointer from the other.

EDIT

I've found another hint about how vtable records are defined for virtual member functions in the implementation of the TargetCXXABI class of the clang front-end for llvm. This class exposes an API (line 181) that tells whether the body of a member function is aligned or not.

This, as PaulR already stated in his answer, confirms that the LSB is used to differentiate a normal member function from a virtual one. But this works not because of the pointer size and alignment (the minimal addressable unit is always a byte, so pointers can be odd numbers) but because in the Itanium C++ ABI the bodies of normal member functions are aligned so that their addresses are always even numbers by design.

This is not always the case though, and indeed the implementation of this method mentions that some architectures (e.g. ARM) store the discriminator in the adjustment of the this pointer, rather than in the function address.

So this feature is truly tied to the processor architecture, and apart from the general rule of the LSB being set (1 plus the vtable offset) on x86_64, you should check the Itanium-like C++ ABI of each target.
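To make the two encodings concrete, here is a minimal sketch of how the two words of a member function pointer could be decoded under each scheme. The names MemFnRepr, Encoding, Decoded and decode are hypothetical and only serve the illustration; the encodings follow my reading of the generic Itanium C++ ABI (flag in the LSB of ptr) and of the ARM variant (flag in the LSB of adj, with the adjustment shifted left by one), so treat it as an assumption rather than as what clang literally does.

```cpp
#include <cassert>
#include <cstddef>

// The two words of a member function pointer in Itanium-style ABIs.
struct MemFnRepr {
    std::ptrdiff_t ptr;  // function address, or (encoded) vtable offset
    std::ptrdiff_t adj;  // adjustment to `this`, possibly carrying a flag bit
};

// Hypothetical labels for the two discriminator placements discussed above.
enum class Encoding {
    FlagInPtr,  // generic Itanium C++ ABI, e.g. x86_64
    FlagInAdj   // ARM-style variant
};

struct Decoded {
    bool is_virtual;
    std::ptrdiff_t vtable_offset;  // meaningful only when is_virtual is true
    std::ptrdiff_t this_adjust;
};

Decoded decode(MemFnRepr r, Encoding e) {
    if (e == Encoding::FlagInPtr) {
        // Virtual: ptr = 1 + vtable offset in bytes. Non-virtual: ptr = address.
        const bool v = (r.ptr & 1) != 0;
        return {v, v ? r.ptr - 1 : 0, r.adj};
    }
    // ARM-style: adj = 2 * adjustment + (virtual ? 1 : 0); ptr is the plain
    // vtable offset or the function address (which may itself be odd there).
    const bool v = (r.adj & 1) != 0;
    return {v, v ? r.ptr : 0, r.adj >> 1};
}

// Hand-made values, not taken from a real compiler run.
void decode_examples() {
    Decoded a = decode({17, 8}, Encoding::FlagInPtr);       // virtual, slot at byte 16
    assert(a.is_virtual && a.vtable_offset == 16 && a.this_adjust == 8);

    Decoded b = decode({0x401000, 0}, Encoding::FlagInPtr); // even address: non-virtual
    assert(!b.is_virtual);

    Decoded c = decode({16, 17}, Encoding::FlagInAdj);      // virtual, adjustment 8
    assert(c.is_virtual && c.vtable_offset == 16 && c.this_adjust == 8);

    Decoded d = decode({0x401001, 16}, Encoding::FlagInAdj); // odd address is fine here
    assert(!d.is_virtual && d.this_adjust == 8);
}
```

The last case is the reason the flag moves to adj on ARM: an odd function address is legitimate there, so the LSB of ptr cannot be used as a discriminator.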

Upvotes: 2

PaulR

Reputation: 3717

In the documentation you linked it says:

For a virtual function, it is 1 plus the virtual table offset (in bytes) of the function

Under Virtual Table Layout, it says:

offsets within the virtual table are determined by that allocation sequence and the natural ABI size and alignment

The offset must respect the alignment requirements of the function pointer.

The alignment requirements of a function pointer, which is a "POD" type, are specified in the corresponding C ABI. I assume that pointers are aligned to their size, so the address (and thus the vtable offset) of a pointer must be an even number, and its least significant bit must be zero.

So the implementation can just check the LSB of the offset/pointer field: the member pointer refers to a virtual method if and only if that bit is one.

Once it has the offset in the virtual table, it reads the virtual table pointer from the object and loads the function's actual address from the virtual table using the offset from the member pointer.

class C {
    virtual int someMethod();
};

int invokeAMethod(C *c, int (C::*method)()) {
    return (c->*method)();
}

On x86_64 clang indeed creates a check for the LSB of the "ptr" member of the method pointer:

invokeAMethod(C*, int (C::*)()): # @invokeAMethod(C*, int (C::*)())
  // c is in rdi, method.ptr is in rsi, and method.adj is in rdx
  // adjust this pointer
  add rdi, rdx
  // check whether method is virtual
  test sil, 1
  // if it is not, skip the following
  je .LBB0_2
  // load the vtable pointer from the object
  mov rax, qword ptr [rdi]
  // index into the vtable with the corrected offset to load actual method address
  mov rsi, qword ptr [rax + rsi - 1]
.LBB0_2:
  // here the actual address of the method is in rsi, we call it
  // in this particular case we return the same type
  // and do not need to call any destructors
  // so we can tail call
  jmp rsi # TAILCALL

I could not share the godbolt link for this particular example because one of my browser plugins interfered, but you can play with similar examples yourself on https://gcc.godbolt.org.
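For readers who prefer C++ to assembly, the same steps can be sketched by hand. This is only an illustration under stated assumptions: it relies on the two-word Itanium layout, on `this` being passed like an ordinary first argument, and on casts that the C++ standard does not bless, so it is expected to work with GCC or clang on targets such as x86_64 Linux and nowhere else by guarantee. Base, Derived, MemFnRepr and invokeByHand are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct Base {
    int id = 1;
    virtual int virt() { return id; }   // reached through the vtable
    int plain() { return id + 100; }    // ordinary member function
};

struct Derived : Base {
    int virt() override { return id + 10; }
};

// Assumed two-word representation of a member function pointer.
struct MemFnRepr {
    std::ptrdiff_t ptr;
    std::ptrdiff_t adj;
};

int invokeByHand(Base* obj, int (Base::*method)()) {
    static_assert(sizeof(method) == sizeof(MemFnRepr),
                  "this sketch expects the two-word Itanium layout");
    assert(obj != nullptr);

    MemFnRepr r;
    std::memcpy(&r, &method, sizeof r);

#if defined(__arm__) || defined(__aarch64__)
    // ARM-style variant: flag in adj, adjustment stored shifted left by one.
    const bool isVirtual = (r.adj & 1) != 0;
    const std::ptrdiff_t thisAdj = r.adj >> 1;
    const std::ptrdiff_t vtableOffset = r.ptr;
#else
    const bool isVirtual = (r.ptr & 1) != 0;        // test sil, 1
    const std::ptrdiff_t thisAdj = r.adj;
    const std::ptrdiff_t vtableOffset = r.ptr - 1;
#endif

    char* self = reinterpret_cast<char*>(obj) + thisAdj;   // add rdi, rdx
    std::uintptr_t target = static_cast<std::uintptr_t>(r.ptr);
    if (isVirtual) {
        char* vtable;
        std::memcpy(&vtable, self, sizeof vtable);          // mov rax, [rdi]
        std::memcpy(&target, vtable + vtableOffset,
                    sizeof target);                         // mov rsi, [rax + rsi - 1]
    }

    // Assumption: the member function can be called like a free function
    // that takes `this` as its first parameter.
    using Thunk = int (*)(Base*);
    return reinterpret_cast<Thunk>(target)(reinterpret_cast<Base*>(self)); // jmp rsi
}
```

Calling invokeByHand(&d, &Base::virt) on a Derived d should then give the same result as (d.*(&Base::virt))(), because the vtable pointer read from the object selects the override, while &Base::plain goes straight to the stored address.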

Upvotes: 1
