jdm

Reputation: 10060

Library for manipulating machine code at runtime?

Is there any C library for manipulating x86 / x64 machine code? Specifically, I'd like to modify a function in my program's address space at runtime.

For example, I have the functions foo and bar, for which I have the source or knowledge of their inner workings, but can't recompile the library they're in, and I have the function baz I wrote myself. Now I'd like to be able to say things like: "In the function foo, find the call to bar, and inject the instructions of baz right in front of it". The tool would have to adjust all the relevant addresses in the program accordingly.

I know all the bits and pieces exist; for example, there are tools for hotpatching functions. I guess there are some restrictions on what would be possible, due to optimization and so on, but the basic functionality should be achievable. I wasn't able to find anything like this. Does anybody have some links?

Upvotes: 3

Views: 456

Answers (2)

cirrus

Reputation: 5662

This is known as 'self-modifying code' (see Wikipedia). It was quite trendy in the 80s and early 90s, particularly in machine code and assembly, but it pretty much died out as an approach with modern languages because it's fragile. Managed languages also attempted to provide a more secure model, since writing into executable memory was the basis for buffer-overrun attacks.

Bear in mind that your code pages may be marked read-only or copy-on-write, so on many modern OSes writing to them causes an access violation unless you change the page protection first. If memory serves, the basic principle is that you need to get hold of the memory address of a variable or function, and you need quite specific knowledge of the code generated and/or the stack layout at that location.
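As a rough sketch of the page-protection step, assuming a POSIX system (on Windows the counterpart would be VirtualProtect): the helper below, whose name set_page_protection is made up for this example, rounds an address down to its page boundary and calls mprotect on the covering range. Hardened systems may refuse to make a page both writable and executable, so treat a failure as a normal outcome.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not from any library): change the protection of the
 * page(s) covering [addr, addr + len).
 * Returns 0 on success, -1 on failure (errno is set by mprotect). */
static int set_page_protection(void *addr, size_t len, int prot)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0)
        return -1;

    /* mprotect requires a page-aligned start address, so round down. */
    uintptr_t start = (uintptr_t)addr & ~((uintptr_t)page - 1);
    uintptr_t end = (uintptr_t)addr + len;

    return mprotect((void *)start, (size_t)(end - start), prot);
}
```

Before patching a few bytes of foo you would call something like set_page_protection(patch_addr, patch_len, PROT_READ | PROT_WRITE | PROT_EXEC), write the bytes, and then restore PROT_READ | PROT_EXEC. Note that the change affects whole pages, not just the bytes you asked for.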

Here are some links to get you started:

  1. How to write self-modifying code in x86 assembly
  2. Self-modifying code for debug tracing in quasi-C

Specifically, in your case, I wouldn't modify foo by inserting instructions and then trying to adjust all the code. All you need to do is change the target address of the call to bar so that it goes through an intermediary, known as a thunk. This approach is much less fragile because it doesn't change the structure of the original function, just a number. In fact, it's trivial by comparison.
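To illustrate the "just a number" point, here is a minimal sketch that assumes the call to bar is encoded as a 5-byte x86 near call (opcode 0xE8 followed by a 32-bit little-endian displacement relative to the next instruction). Compilers may emit other forms, such as indirect calls through a PLT or a register, so this is one case rather than a general solution. The name retarget_call is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: retarget a 5-byte x86 near call (0xE8 + rel32).
 * call_site must point at the 0xE8 opcode byte and be writable.
 * Returns 0 on success, -1 if the instruction is not an E8 call or the
 * new target is out of rel32 range. */
static int retarget_call(uint8_t *call_site, const void *new_target)
{
    if (call_site[0] != 0xE8)
        return -1;

    /* rel32 is measured from the address of the NEXT instruction,
     * i.e. call_site + 5. */
    intptr_t disp = (intptr_t)new_target - ((intptr_t)call_site + 5);
    if (disp > INT32_MAX || disp < INT32_MIN)
        return -1;

    /* Store the displacement byte by byte in little-endian order. */
    uint32_t rel = (uint32_t)(int32_t)disp;
    call_site[1] = (uint8_t)(rel & 0xFF);
    call_site[2] = (uint8_t)((rel >> 8) & 0xFF);
    call_site[3] = (uint8_t)((rel >> 16) & 0xFF);
    call_site[4] = (uint8_t)((rel >> 24) & 0xFF);
    return 0;
}
```

Finding the call site in the first place still needs a disassembler or prior knowledge of foo's layout, and on x64 the thunk has to live within ±2 GiB of the call for the displacement to fit.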

In your thunk you can do whatever operations you like before and after you call the real function. If you're already in the same address space and your thunk code is loaded, you're all set.
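The shape of such a thunk can be sketched in portable C. This is only a model: a function pointer (bar_slot, a made-up name) stands in for the call target that would really be patched inside foo, and foo, bar and baz are stand-ins with invented bodies that record their calls in a small trace buffer.

```c
#include <stddef.h>

/* Trace buffer so the call order can be observed ('z' = baz, 'r' = bar). */
static char trace[16];
static size_t trace_len;

static void record(char c)
{
    if (trace_len < sizeof trace - 1)
        trace[trace_len++] = c;
}

/* Stand-ins for the functions from the question; bodies are invented. */
static int bar(int x) { record('r'); return x * 2; }
static void baz(void) { record('z'); }

/* bar_slot models the call target inside foo that would be patched. */
typedef int (*bar_fn)(int);
static bar_fn bar_slot = bar;

/* The thunk: same signature as bar, runs baz first, then forwards. */
static int bar_thunk(int x)
{
    baz();
    return bar(x);
}

/* foo is left untouched; it just calls whatever the slot points at. */
static int foo(int x)
{
    return bar_slot(x) + 1;
}
```

Setting bar_slot = bar_thunk plays the role of rewriting the call target: foo keeps returning the same result, but baz now runs before every call to bar.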

If you're on Windows, you might also want to take a look at Detours.

Upvotes: 3

Adrian Panasiuk

Reputation: 7343

If you're using gcc and you want to substitute a whole function, you can redirect (wrap) it with the -Wl,--wrap=functionName switch: https://stackoverflow.com/a/617606/111160 .

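Then any time the code calls functionName, it runs __wrap_functionName, which you provide. You can still reach the original through __real_functionName. Note that this happens at link time and only affects references that are still unresolved, so calls already bound inside a prebuilt shared library are not redirected.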

If you want to perform some actions before each call to baz, make your __wrap_baz perform those actions and then call __real_baz.

Upvotes: 1
