A way to wrap a C function call with no knowledge of the signature?

Question

Given a function with C bindings and an arbitrary signature, it is a simple matter to create a pointer to the function, pass it around, wrap it, and invoke it.

int fun(int x, int y)
{
   return x + y;
}
void* funptr()
{
   return (void*)&fun;
}
int wrapfun(int x, int y)
{
   // inject additional wrapper logic
   return ((int (*)(int, int))funptr())(x, y);
}

So long as the caller and callee follow the same calling convention and agree on the signature, everything works.

Now let's say I want to wrap a library with thousands of functions. I can use nm or readelf to grab the names of all the functions to be wrapped, but I'd rather not have to care about the signatures, or even need to include the library's associated header files.

In some cases, cleanly including the headers may not be an option, given cosmetic changes that occur between versions and platforms. For example:

// from openssl/ssl.h v0.9.8
SSL_CTX* SSL_CTX_new(SSL_METHOD* meth);
// from openssl/ssl.h v1.0.0
SSL_CTX* SSL_CTX_new(const SSL_METHOD* meth);

That is my background rationale, which you may take or leave. Regardless, my question is this:

Is there a way to write

// pseudocode
void wrapfun()
{
    return ((void (*)())funptr())();
}

such that the caller of wrapfun knows the signature of fun, but wrapfun itself doesn't have to?

Ryan Calhoun · Accepted Answer

If you look at the assembly a compiler produces for C functions (with frame pointers enabled, as in an unoptimized build), you will typically see each function body wrapped by

pushq %rbp
movq  %rsp, %rbp
# body
leave
ret

http://en.wikipedia.org/wiki/X86_instruction_listings lists leave among the instructions added with the 80186; it is equivalent to (in AT&T syntax)

movq  %rbp, %rsp
popq  %rbp

So leave is just the inverse of the first two lines: save off the caller's stack frame and create our own stack frame, then unwind at the end.

The closing ret is the inverse of the call that got us here, and http://www.unixwiz.net/techtips/win32-callconv-asm.html shows the hidden push and pop of the instruction pointer register that occurs during these paired instructions.
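The layout described above can be checked from C. The following is a minimal sketch, assuming GCC or Clang on x86-64; it relies on the compiler builtins __builtin_frame_address and __builtin_return_address, and the helper name return_address_above_saved_rbp is made up for illustration. On other targets the check is skipped and the helper reports -1.

```c
/* Sketch only: assumes GCC or Clang. The layout check itself is x86-64 only. */
#if defined(__x86_64__)
/* noinline so the function is entered through a real call instruction */
__attribute__((noinline))
int return_address_above_saved_rbp(void)
{
    /* After "pushq %rbp; movq %rsp, %rbp", RBP points at the saved RBP. */
    void **frame = (void **)__builtin_frame_address(0);

    /* The slot just above it holds the return address pushed by call. */
    return frame[1] == __builtin_return_address(0);
}
#else
/* Other targets: layout not checked, so report "unknown". */
int return_address_above_saved_rbp(void)
{
    return -1;
}
#endif
```

A result of 1 means the return address was found directly above the saved base pointer, which is the slot that ret consumes once leave has run.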

The void function pointer call doesn't work by itself because of this prologue and epilogue, which the compiler also generates for wrapfun. What we need to do is create the wrapper in such a way that it can hand the stack frame set up for it by the caller directly to the call of fun, without its own stack frame getting in the way. In other words, observe the C calling convention and violate it at the same time.

Consider a C prototype

int wrapfun(int x, int y);

paired with an assembly implementation (AT&T x86_64)

  .file "wrapfun.s"
  .globl wrapfun
  .type   wrapfun, @function
wrapfun:
  call    funptr
  jmp     *%rax
  .size   wrapfun, .-wrapfun

Basically, we skip the typical stack pointer and base pointer manipulation, because we want fun's stack to look exactly like ours. The call to funptr creates its own stack space and leaves its result in register RAX. Because we have no stack space of our own, and because our caller's return address is sitting nicely at the top of the stack, we can simply do an unconditional jump to the wrapped function and let its ret jump all the way back. In this manner, once the function pointer is invoked, it sees the stack exactly as it was set up by our caller. One caveat: x86-64 passes the first few arguments in registers rather than on the stack, so this also relies on funptr leaving those registers untouched.
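To try the idea end to end, the trampoline can be embedded in a C file as file-scope assembly. This is a minimal sketch, assuming GCC or Clang on an x86-64 ELF target such as Linux. It differs from the listing above in one deliberate way: because x86-64 passes the first integer arguments in registers (RDI, RSI, and so on), the sketch saves and restores the two registers this example uses around the call to funptr. A general wrapper would have to preserve every argument register. On other targets the sketch falls back to the typed wrapper from the question.

```c
/* Sketch only: assumes GCC or Clang, x86-64, System V ABI, ELF (e.g. Linux). */

int fun(int x, int y)
{
    return x + y;
}

void *funptr(void)
{
    return (void *)&fun;
}

/* Callers see the real signature; the trampoline below never names it. */
int wrapfun(int x, int y);

#if defined(__x86_64__) && defined(__ELF__)
__asm__(
    ".pushsection .text\n"
    ".globl wrapfun\n"
    ".type  wrapfun, @function\n"
    "wrapfun:\n"
    "    pushq %rdi\n"          /* keep the caller's register arguments */
    "    pushq %rsi\n"
    "    subq  $8, %rsp\n"      /* keep RSP 16-byte aligned for the call */
    "    call  funptr\n"        /* pointer to the real function lands in RAX */
    "    addq  $8, %rsp\n"
    "    popq  %rsi\n"
    "    popq  %rdi\n"
    "    jmp   *%rax\n"         /* fun's ret returns straight to our caller */
    ".size  wrapfun, .-wrapfun\n"
    ".popsection\n"
);
#else
/* Fallback for other targets: the signature-aware wrapper from the question. */
int wrapfun(int x, int y)
{
    return ((int (*)(int, int))funptr())(x, y);
}
#endif
```

A caller that knows the prototype simply writes wrapfun(2, 3). By the time the jmp executes, the stack pointer is back where it was on entry, so fun finds the caller's return address on top of the stack.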

If we need to use local variables, pass parameters to funptr, etc, we can always set up our stack, then tear it down prior to the call:

wrapfun:
  pushq   %rbp
  movq    %rsp, %rbp   # set up my stack
  call    funptr
  leave                # tear down my stack
  jmp     *%rax
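A fuller version of the same idea, again as a hedged sketch for GCC or Clang on x86-64 ELF: here the trampoline sets up a frame, parks the caller's register arguments in it, passes a parameter to a lookup function, and tears the frame down with leave before jumping. The names lookup and lookups, and the id value 1, are hypothetical stand-ins for whatever wrapper logic is needed. Only the two argument registers this example uses are preserved.

```c
/* Sketch only: assumes GCC or Clang, x86-64, System V ABI, ELF (e.g. Linux). */

int lookups;                        /* hypothetical wrapper-side bookkeeping */

int fun(int x, int y)
{
    return x + y;
}

/* Hypothetical lookup that takes a parameter, unlike funptr(). */
void *lookup(int id)
{
    lookups++;
    return id == 1 ? (void *)&fun : (void *)0;
}

/* Callers see the real signature; the trampoline never names it. */
int wrapfun(int x, int y);

#if defined(__x86_64__) && defined(__ELF__)
__asm__(
    ".pushsection .text\n"
    ".globl wrapfun\n"
    ".type  wrapfun, @function\n"
    "wrapfun:\n"
    "    pushq %rbp\n"
    "    movq  %rsp, %rbp\n"        /* set up my stack */
    "    subq  $16, %rsp\n"         /* two local slots; RSP stays 16-byte aligned */
    "    movq  %rdi, -8(%rbp)\n"    /* park the caller's register arguments */
    "    movq  %rsi, -16(%rbp)\n"
    "    movl  $1, %edi\n"          /* argument for lookup: id 1 */
    "    call  lookup\n"            /* target address comes back in RAX */
    "    movq  -16(%rbp), %rsi\n"   /* restore the caller's arguments */
    "    movq  -8(%rbp), %rdi\n"
    "    leave\n"                   /* tear down my stack */
    "    jmp   *%rax\n"             /* fun returns straight to our caller */
    ".size  wrapfun, .-wrapfun\n"
    ".popsection\n"
);
#else
/* Fallback for other targets: a signature-aware wrapper. */
int wrapfun(int x, int y)
{
    return ((int (*)(int, int))lookup(1))(x, y);
}
#endif
```

Each call to wrapfun runs lookup once and then forwards to fun, so after wrapfun(2, 3) the counter lookups has grown by one and the result is 5.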

Alternatively, we could embed this logic into inline assembly, taking advantage of our knowledge of what the compiler will do before and after:

void wrapfun()
{
    void* p = funptr();
    __asm__(
        "movq -8(%rbp), %rax\n\t"
        "leave\n\t"
        "popq %rbx\n\t"
        "call *%rax\n\t"
        "pushq %rbx\n\t"
        "pushq %rbp\n\t"    // repeat initial function setup
        "movq %rsp, %rbp"   // so it can be torn down correctly
    );
}

This approach has the advantage of easier declaration of C local variables prior to the magic. In an unoptimized build, the last local variable declared will typically be at RBP-sizeof(var), and we save it in RAX prior to tearing down the stack. Another possible benefit is the opportunity to use the C preprocessor to, for example, inline 32-bit or 64-bit assembly without requiring separate source files.

EDIT: The disadvantage is that the return address now has to be saved in a register across the call, which limits portability: it requires that the caller not be using RBX.

In short, the answer is yes. It is definitely possible to wrap a function without knowing its signature, if you are willing to get your hands a little dirty. No promises as to portability ;).
