Reputation: 10060
Is there any C library for manipulating x86 / x64 machine code? Specifically, I'd like to modify a function in my program's address space at runtime.
For example, I have the functions foo and bar, for which I have the source or knowledge of their inner workings, but can't recompile the library they're in, and I have the function baz I wrote myself. Now I'd like to be able to say things like: "In the function foo, find the call to bar, and inject the instructions of baz right in front of it". The tool would have to adjust all the relevant addresses in the program accordingly.
I know all the bits and pieces exist; for example, there are tools to do hotpatching of functions. I guess there are some restrictions on what would be possible, due to optimization and so on, but the basic functionality should be possible. I wasn't able to find anything like this. Does anybody have some links?
Upvotes: 3
Views: 456
Reputation: 5662
This is known as 'self-modifying code' (see Wikipedia), and it used to be quite trendy in the 80s and early 90s, particularly in machine code and ASM. However, it pretty much died out as an approach with modern languages because it's pretty fragile. Managed languages attempted to provide a more secure model, as it was also the basis for buffer-overrun attacks.
Bear in mind that your code pages may be marked read-only or copy-on-write, so you might get an access violation on many modern OSes. If memory serves, the basic principle is that you need to get hold of the memory address of a variable or function, and you need quite specific knowledge about the generated code and/or stack layout at that location.
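To make the point about page protection concrete, here is a minimal sketch for POSIX systems. It assumes `mprotect` is available; the helper name `set_protection` is made up for illustration. It only widens the requested range to page boundaries and changes the protection, nothing more.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: change the protection of every page that overlaps
 * [addr, addr + len). Returns 0 on success, -1 on failure (errno is set
 * by mprotect). */
int set_protection(void *addr, size_t len, int prot)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    /* mprotect requires a page-aligned start address, so round down. */
    uintptr_t start = (uintptr_t)addr & ~((uintptr_t)page - 1);
    uintptr_t end = (uintptr_t)addr + len;

    return mprotect((void *)start, end - start, prot);
}
```

For real code pages you would typically request `PROT_READ | PROT_WRITE | PROT_EXEC` before patching and restore `PROT_READ | PROT_EXEC` afterwards. Some systems refuse pages that are writable and executable at the same time, so treat this as a starting point. On Windows the rough equivalent is `VirtualProtect`.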
Here are some links to get you started:
Specifically, in your case, I wouldn't modify foo by inserting operations and then trying to adjust all the code. All you need to do is change the jump address to bar so that it goes through an intermediary. This is known as a Thunk. The advantage of doing it this way is that it's much less fragile to change a jump address from one target to another, because it doesn't change the structure of the original function, just a number. In fact, it's trivial by comparison.
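As a rough sketch of what "just a number" means: assuming the call to bar inside foo is encoded as a near relative call (opcode `E8` followed by a 32-bit little-endian displacement, measured from the end of the 5-byte instruction), retargeting it is plain arithmetic. The function names below are hypothetical, and the code works on a byte buffer, so locating the call and making the page writable are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode the absolute target of an E8 rel32 call located at insn_addr. */
uint64_t call_target(const uint8_t *insn, uint64_t insn_addr)
{
    assert(insn != NULL && insn[0] == 0xE8);
    uint32_t u = (uint32_t)insn[1] | ((uint32_t)insn[2] << 8) |
                 ((uint32_t)insn[3] << 16) | ((uint32_t)insn[4] << 24);
    int32_t rel = (int32_t)u;
    /* The displacement is relative to the address of the next instruction. */
    return insn_addr + 5 + (uint64_t)(int64_t)rel;
}

/* Rewrite the displacement so the call goes to new_target instead.
 * Returns 0 on success, -1 if the bytes are not an E8 call or the new
 * target is out of rel32 range. */
int retarget_call(uint8_t *insn, uint64_t insn_addr, uint64_t new_target)
{
    assert(insn != NULL);
    if (insn[0] != 0xE8)
        return -1;

    int64_t rel = (int64_t)(new_target - (insn_addr + 5));
    if (rel < INT32_MIN || rel > INT32_MAX)
        return -1;

    uint32_t u = (uint32_t)(int32_t)rel;
    insn[1] = (uint8_t)(u & 0xFF);
    insn[2] = (uint8_t)((u >> 8) & 0xFF);
    insn[3] = (uint8_t)((u >> 16) & 0xFF);
    insn[4] = (uint8_t)((u >> 24) & 0xFF);
    return 0;
}
```

Calls through a register or a memory operand (for example via an import table) are encoded differently and would need their own handling, which is one reason a disassembler library is usually involved in finding the call site.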
In your thunk you can do whatever operations you like before and after you call the real function. If you're already in the same address space and your thunk code is loaded, you're home.
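A thunk of this kind might look roughly like the following. The signature `int bar(int)` is an assumption, since the question does not give one, and `sample_bar`, `real_bar`, and `baz_calls` are stand-ins invented for the sketch. The thunk must use the same signature and calling convention as the function it fronts.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed signature of bar; the real one is not given in the question. */
typedef int (*bar_fn)(int);

/* Address of the original bar, saved before the call site is patched. */
static bar_fn real_bar = NULL;

/* Stand-in for the user's baz: here it only counts its invocations. */
static int baz_calls = 0;
static void baz(void) { baz_calls++; }

/* Stand-in for the library's bar, so the sketch is self-contained. */
static int sample_bar(int x) { return x * 2; }

/* The thunk: run baz first, then forward to the real bar unchanged. */
int bar_thunk(int x)
{
    assert(real_bar != NULL);
    baz();
    return real_bar(x);
}
```

After storing the original address in `real_bar`, the patched call in foo would point at `bar_thunk`, so every call runs baz first and still returns whatever bar returns.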
If you're on Windows, you might also want to take a look at Detours.
Upvotes: 3
Reputation: 7343
If you're using gcc and you want to substitute a whole function, you can redirect (wrap) a function using the -Wl,--wrap=functionName switch: https://stackoverflow.com/a/617606/111160 .
Then any time the code calls functionName, it runs __wrap_functionName, which you provide. You can still access the original with __real_functionName.
If you want to perform some actions before each call to baz, make your __wrap_baz perform those actions and then call __real_baz afterwards.
Upvotes: 1