Reputation: 2363
Given a function with C bindings and an arbitrary signature, it is a simple matter to create a pointer to the function, pass it around, wrap it, and invoke it.
int fun(int x, int y)
{
    return x + y;
}
void* funptr()
{
    return (void*)&fun;
}
int wrapfun(int x, int y)
{
    // inject additional wrapper logic
    return ((int (*)(int, int))funptr())(x, y);
}
So long as the caller and callee follow the same calling convention and agree on the signature, everything works.
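As a side note on passing the pointer around: ISO C only guarantees a lossless round trip between function pointer types, while the conversion to and from void* used above is an extension (POSIX requires it for dlsym). A minimal sketch of the same idea with a generic function pointer type follows; the names generic_fn, add, lookup_add and call_add are hypothetical and not part of the question.

```c
#include <assert.h>

/* Generic function-pointer type, used only for storing and passing around. */
typedef void (*generic_fn)(void);

/* Hypothetical wrapped function. */
static int add(int x, int y)
{
    return x + y;
}

/* Hypothetical lookup, mirroring funptr() without going through void*. */
static generic_fn lookup_add(void)
{
    return (generic_fn)add;
}

/* The caller casts back to the true signature before invoking. */
static int call_add(int x, int y)
{
    int (*typed)(int, int) = (int (*)(int, int))lookup_add();
    assert(typed == add); /* the round trip preserves the pointer */
    return typed(x, y);
}
```

Calling through generic_fn without casting back to the real signature first would be undefined behaviour, so this only tidies up the storage side; the code that makes the call still has to know the signature.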
Now let's say I want to wrap a library with thousands of functions. I can use nm or readelf to grab the names of all the functions to be wrapped, but I'd rather not have to care about the signatures, or even need to include the library's associated header files.
In some cases, cleanly including the headers may not be an option, given cosmetic changes that occur between versions and platforms. For example:
// from openssl/ssl.h v0.9.8
SSL_CTX* SSL_CTX_new(SSL_METHOD* meth);
// from openssl/ssl.h v1.0.0
SSL_CTX* SSL_CTX_new(const SSL_METHOD* meth);
That is my background rationale, which you may take or leave. Regardless, my question is this:
Is there a way to write
// pseudocode
void wrapfun()
{
    return ((void (*)())funptr())();
}
such that the caller of wrapfun knows the signature of fun, but wrapfun itself doesn't have to?
Upvotes: 2
Views: 584
Reputation: 2363
If you look at the assembly produced from compiled C functions (at least in unoptimized builds that keep the frame pointer), you will typically see each function body wrapped by
    pushq %rbp
    movq %rsp, %rbp
    # body
    leave
    ret
http://en.wikipedia.org/wiki/X86_instruction_listings lists the leave instruction, introduced with the 80186, as equivalent to (in AT&T syntax)
    movq %rbp, %rsp
    popq %rbp
So leave is just the inverse of the first two lines: they save the caller's frame pointer and establish our own stack frame, and leave unwinds that frame at the end.
The closing ret is the inverse of the call that got us here, and http://www.unixwiz.net/techtips/win32-callconv-asm.html shows the hidden push and pop of the instruction pointer register that occur during these paired instructions.
The void function pointer call from the question doesn't work by itself because of this prologue and epilogue, which the compiler generates for wrapfun. What we need to do is write the wrapper so that it hands the stack frame set up by its caller directly to the call of fun, without a stack frame of its own getting in the way. In other words, observe the C calling convention and violate it at the same time.
Consider a C prototype
int wrapfun(int x, int y);
paired with an assembly implementation (AT&T x86_64)
    .file "wrapfun.s"
    .globl wrapfun
    .type wrapfun, @function
wrapfun:
    call funptr
    jmp *%rax
    .size wrapfun, .-wrapfun
Basically, we skip the typical stack pointer and base pointer manipulation, because we want fun's stack to look exactly like ours. The call to funptr creates its own stack space and leaves its result in register RAX. Because we have no stack space of our own, and because our caller's return address (the saved IP) is sitting nicely at the top of the stack, we can simply do an unconditional jump to the wrapped function and let its ret return all the way back to our caller. In this manner, once the function pointer is invoked, the wrapped function sees the stack exactly as it was set up by our caller.
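The same trampoline can also live in a C file as top-level assembly. What follows is a minimal sketch, assuming GCC or Clang targeting x86-64 ELF (for example Linux), not a drop-in implementation. Several things here are assumptions rather than part of the answer: the wrap_calls counter is a hypothetical stand-in for wrapper logic, caller_demo is a hypothetical caller, and the trampoline saves the six integer argument registers around the call to funptr, because on x86-64 the first arguments travel in registers that funptr is free to overwrite. Floating-point and variadic arguments are not handled. On other targets the sketch falls back to an ordinary typed wrapper so that it still compiles.

```c
#include <assert.h>

/* The wrapped function; only its caller needs to know this signature. */
int fun(int x, int y)
{
    return x + y;
}

/* Hypothetical counter standing in for "additional wrapper logic". */
int wrap_calls = 0;

/* Called by the trampoline: runs the wrapper logic, returns the target. */
void *funptr(void)
{
    ++wrap_calls;
    return (void *)&fun;
}

/* What callers see; the trampoline itself never uses this signature. */
int wrapfun(int x, int y);

#if defined(__x86_64__) && defined(__ELF__)
__asm__(
    ".pushsection .text\n"
    ".globl wrapfun\n"
    ".type wrapfun, @function\n"
    "wrapfun:\n"
    "    pushq %rdi\n"        /* save the integer argument registers, */
    "    pushq %rsi\n"        /* which funptr is allowed to clobber   */
    "    pushq %rdx\n"
    "    pushq %rcx\n"
    "    pushq %r8\n"
    "    pushq %r9\n"
    "    subq $8, %rsp\n"     /* keep the stack 16-byte aligned at the call */
    "    call funptr\n"       /* target address comes back in RAX */
    "    addq $8, %rsp\n"
    "    popq %r9\n"          /* restore the arguments in reverse order */
    "    popq %r8\n"
    "    popq %rcx\n"
    "    popq %rdx\n"
    "    popq %rsi\n"
    "    popq %rdi\n"
    "    jmp *%rax\n"         /* stack is as our caller left it; fun's ret returns to that caller */
    ".size wrapfun, .-wrapfun\n"
    ".popsection\n"
);
#else
/* Fallback for other targets: a plain typed wrapper, which does need the signature. */
int wrapfun(int x, int y)
{
    return ((int (*)(int, int))funptr())(x, y);
}
#endif

/* Hypothetical caller that knows the real signature of fun. */
int caller_demo(void)
{
    int r = wrapfun(2, 3);
    assert(r == 5);
    return r;
}
```

Each call through wrapfun runs funptr exactly once, so after a single caller_demo() the counter wrap_calls is 1. Putting the wrapper logic in funptr keeps it in C while the assembly stays a fixed, signature-independent stub.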
If we need to use local variables, pass parameters to funptr, etc., we can always set up our stack, then tear it down prior to the jump:
wrapfun:
    pushq %rbp
    movq %rsp, %rbp    # set up my stack
    call funptr
    leave              # tear down my stack
    jmp *%rax
Alternatively, we could embed this logic into inline assembly, taking advantage of our knowledge of what the compiler will do before and after:
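void wrapfun()
{
    void* p = funptr();
    __asm__(
        "movq -8(%rbp), %rax\n\t"  // load p, the last local variable
        "leave\n\t"                // tear down our stack frame
        "popq %rbx\n\t"            // stash our caller's return address
        "call *%rax\n\t"
        "pushq %rbx\n\t"           // put the return address back
        "pushq %rbp\n\t"           // repeat initial function setup
        "movq %rsp, %rbp"          // so it can be torn down correctly
    );
}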
This approach has the advantage of making it easier to declare C local variables prior to the magic. The last local variable declared will typically sit at RBP-sizeof(var), and we load it into RAX prior to tearing down the stack. Another possible benefit is the opportunity to use the C preprocessor to, for example, inline 32-bit or 64-bit assembly without requiring separate source files.
EDIT: The disadvantage is that stashing the return address in a register limits portability: it requires that RBX not be in use by the caller.
In short, the answer is yes. It is definitely possible to wrap a function without knowing its signature, if you are willing to get your hands a little dirty. No promises as to portability ;).
Upvotes: 5
Reputation: 1
In addition to Ryan's answer, you should also consider using libffi (a foreign function interface library, which may be bundled with your GCC compiler). It fits your goals and abstracts the details "portably" (across the many architectures, systems, and ABIs it supports).
Upvotes: 2