C: Casting function pointers to edit executable code
Consider the following code:
    #include <stdio.h>

    int f1(int x, int y);

    int main(void) {
        printf("%d\n", f1(2,3));
        char* x = (char*)&f1;  // view the function's machine code as bytes
        //edit the stuff here?
        printf("%d\n", f1(2,3));
        return 0;
    }

    int f1(int x, int y){
        return x*y;
    }
Now, although this is admittedly a silly thing that you'd never do in practice, how would you go about it? If I wanted to make this function add the two numbers rather than multiply them, or do something more complex, what steps should I take? How do I figure out the correct machine code to get the desired result?
Answers (1)
While this is not impossible, it is usually impractical, as there are significant hurdles, including:
- The memory containing machine instructions is usually marked non-writable, so you must change the protection settings before it can be changed. (On Unix systems, look into mprotect.)
- The changes that need to be made are architecture-dependent: each processor architecture has its own instructions and its own encodings of those instructions. You have to figure out what new instructions to write and how to encode them.
- There is widely available software to encode instructions, namely an assembler. To turn instructions into encodings on the fly, one might embed an assembler in a program or invoke one in another process. Then there is the problem of how to extract the instructions from the output of the assembler (an object file). An assembler can also be used in the ordinary way to prepare a given instruction sequence, rather than on the fly in a running program.
- Instructions that have just been encoded by an assembler are generally in an intermediate (relocatable) form, where references to specific places in memory are either symbolic or relative to certain reference points. Before putting them into a running program, these references must be resolved to the actual addresses in the program.
- Once you know what instructions you want to replace in the function, there is the problem that they might be bigger than the instructions to be replaced, so they will not fit. Then you need a workaround, such as finding an alternate place for your instructions and, instead of putting your instructions in the function, putting in a small instruction that jumps to your instructions. Note that the memory you use for your instructions must be marked as executable.
- After you change main memory, the instruction cache might hold an old copy of the instructions, so you need to flush the instruction cache.
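The workaround mentioned in the list, placing new instructions in separate memory that is marked executable, might look roughly like the sketch below. It assumes an x86-64 processor and the System V calling convention (arguments in edi and esi, result in eax), and the four bytes are an assumed hand-encoding of "lea eax, [rdi + rsi]; ret" that should be checked against an assembler's output. The names binary_fn and make_add_function are hypothetical. On other architectures the helper simply reports failure.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

typedef int (*binary_fn)(int, int);

/* Hypothetical helper: builds a tiny "add" function in freshly mapped memory.
   Returns 0 and stores the function in *out on success, -1 otherwise. */
int make_add_function(binary_fn *out) {
#if defined(__x86_64__)
    /* Assumed encoding (x86-64, System V calling convention):
         8D 04 37   lea eax, [rdi + rsi]
         C3         ret                                       */
    static const unsigned char code[] = { 0x8D, 0x04, 0x37, 0xC3 };
    const size_t len = 4096; /* mapping size; the kernel rounds up to whole pages */
    assert(sizeof code <= len);

    /* Map the memory writable first, so the bytes can be copied in. */
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;
    memcpy(mem, code, sizeof code);

    /* Then switch it to read + execute before anything calls into it. */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return -1;
    }

    /* Converting void * to a function pointer is allowed by POSIX,
       though not by strict ISO C. */
    *out = (binary_fn)mem;
    return 0;
#else
    (void)out; /* no encoding provided for this architecture */
    return -1;
#endif
}
```

If the helper succeeds, calling the returned pointer with (2, 3) should add the two arguments. Redirecting the original f1 to this code would still require overwriting the start of f1 with a jump, which is a separate, architecture-specific step not shown here.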
A simple change, such as turning the multiplication in f1 into an addition, might be accomplished by finding the pertinent multiply instruction, changing the memory protection, writing an add instruction over it, and flushing the instruction cache. Anything beyond that will be more complicated.
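The sequence just described (change the protection, overwrite the bytes, flush the instruction cache) can be sketched as below. This is a minimal sketch under assumptions: a POSIX system with mprotect, and a GCC or Clang compiler providing __builtin___clear_cache. The helper names page_floor, protect_range and patch_code are hypothetical, and the sketch deliberately leaves out the hard part, which is knowing the offset of the instruction to replace and the bytes to write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round an address down to the start of its page.
   page_size must be a nonzero power of two. */
uintptr_t page_floor(uintptr_t addr, uintptr_t page_size) {
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(page_size - 1);
}

/* Apply protection flags to every page overlapping [addr, addr + len).
   mprotect needs a page-aligned start address, hence the rounding.
   Returns mprotect's result: 0 on success, -1 on failure. */
int protect_range(void *addr, size_t len, int prot) {
    uintptr_t page_size = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = page_floor((uintptr_t)addr, page_size);
    uintptr_t end = (uintptr_t)addr + len;
    return mprotect((void *)start, end - start, prot);
}

/* Hypothetical helper: overwrite len bytes of machine code at target.
   Returns 0 on success, -1 if a protection change was refused. */
int patch_code(void *target, const void *bytes, size_t len) {
    /* 1. Make the code pages writable (some systems refuse write + execute). */
    if (protect_range(target, len, PROT_READ | PROT_WRITE | PROT_EXEC) != 0)
        return -1;

    /* 2. Write the new instruction bytes over the old ones. */
    memcpy(target, bytes, len);

    /* 3. Flush the instruction cache for the modified range
          (GCC/Clang builtin; this matters on architectures such as ARM). */
    __builtin___clear_cache((char *)target, (char *)target + len);

    /* 4. Restore the usual read + execute protection. */
    return protect_range(target, len, PROT_READ | PROT_EXEC);
}
```

With the question's program, a call might look like patch_code((char *)&f1 + offset, new_bytes, sizeof new_bytes), where offset and new_bytes are placeholders that depend entirely on the compiler and target. An optimizing compiler may also inline f1 or fold f1(2,3) into a constant, in which case patching the function has no visible effect.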