Reputation: 379
I wonder if it's possible for a Linux process to call code located in the memory of another process?
Let's say we have a function f() in process A and we want process B to call it. What I thought about is using mmap with the MAP_SHARED and PROT_EXEC flags to map the memory containing the function code and pass the pointer to B, assuming that f() will not call any other function from A's binary. Will it ever work? If yes, then how do I determine the size of f() in memory?
=== EDIT ===
I know that shared libraries do exactly that, but I wonder if it's possible to dynamically share code between processes.
Upvotes: 4
Views: 1855
Reputation: 2446
Oh no! Anyways...
Here's the insane, unreasonable, not-good, purely academic demonstration of this capability. It was fun for me, I hope it's fun for you.
Program A will use shm_open to create a shared memory object, and mmap to map it into its memory space. Then it will copy some code from a function defined in A to the shared memory. Then program B will open up the shared memory, execute the function, and just for kicks, make a very simple modification to the code. Then A will execute the code to demonstrate the change took effect.
Again, this is no recommendation for how to solve a problem, it's an academic demonstration.
// A.c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
int foo(int y) {
    int x = 14;
    return x + y;
}
int main(int argc, char *argv[]) {
    const size_t mem_size = 0x1000;
    // create shared memory objects
    int shared_fd = shm_open("foobar2", O_RDWR | O_CREAT, 0777);
    ftruncate(shared_fd, mem_size);
    void *shared_mem =
        mmap(NULL, mem_size, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_SHARED, shared_fd, 0);
    // copy function to shared memory
    const size_t fn_size = 24;
    memcpy(shared_mem, &foo, fn_size);
    // wait
    getc(stdin);
    // execute the shared function
    int (*shared_foo)(int) = shared_mem;
    printf("shared_foo(3) = %d\n", shared_foo(3));
    // clean up
    shm_unlink("foobar2");
}
Note the use of PROT_READ | PROT_WRITE | PROT_EXEC in the call to mmap. This program is compiled with
gcc A.c -lrt -o A
The constant fn_size was determined by looking at the output of objdump -dj .text A
...
000000000000088a <foo>:
88a: 55 push %rbp
88b: 48 89 e5 mov %rsp,%rbp
88e: 89 7d ec mov %edi,-0x14(%rbp)
891: c7 45 fc 0e 00 00 00 movl $0xe,-0x4(%rbp)
898: 8b 55 fc mov -0x4(%rbp),%edx
89b: 8b 45 ec mov -0x14(%rbp),%eax
89e: 01 d0 add %edx,%eax
8a0: 5d pop %rbp
8a1: c3 retq
...
That's 24 bytes (0x8a2 - 0x88a = 0x18). I guess I could copy anything larger than that and it would do the same thing. Anything shorter and I'll probably get an exception from the processor. Also, note that the value of x from foo (14, which is 0e 00 00 00 in little-endian) is located at foo + 10. This will be the constant x_offset in program B.
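To make the offset arithmetic concrete, here is a small sketch that works on a plain byte array instead of executable memory. The bytes are copied from the objdump listing above; the helper names (read_le32, write_le32, patch_x) and X_OFFSET are made up for illustration and are not part of A.c or B.c. It only shows where the constant lives and how it is encoded, nothing more.

```c
#include <stdint.h>
#include <string.h>

/* Machine code of foo(), byte for byte from the objdump listing. */
static const unsigned char foo_code[24] = {
    0x55,                                     /* push %rbp              */
    0x48, 0x89, 0xe5,                         /* mov  %rsp,%rbp         */
    0x89, 0x7d, 0xec,                         /* mov  %edi,-0x14(%rbp)  */
    0xc7, 0x45, 0xfc, 0x0e, 0x00, 0x00, 0x00, /* movl $0xe,-0x4(%rbp)   */
    0x8b, 0x55, 0xfc,                         /* mov  -0x4(%rbp),%edx   */
    0x8b, 0x45, 0xec,                         /* mov  -0x14(%rbp),%eax  */
    0x01, 0xd0,                               /* add  %edx,%eax         */
    0x5d,                                     /* pop  %rbp              */
    0xc3                                      /* retq                   */
};

/* The 32-bit immediate of the movl starts 10 bytes into the function:
 * 1 + 3 + 3 bytes of earlier instructions, then the 3 bytes c7 45 fc. */
enum { X_OFFSET = 10 };

/* Decode four bytes as a little-endian 32-bit value. */
uint32_t read_le32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Encode a 32-bit value as four little-endian bytes. */
void write_le32(unsigned char *p, uint32_t v) {
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)((v >> 24) & 0xff);
}

/* Copy foo's bytes into buf (at least 24 bytes) and overwrite the constant. */
void patch_x(unsigned char *buf, uint32_t new_x) {
    memcpy(buf, foo_code, sizeof foo_code);
    write_le32(buf + X_OFFSET, new_x);
}
```

Calling patch_x(buf, 100) leaves 64 00 00 00 at bytes 10 to 13 of buf, which is the same change B makes through its int pointer on a little-endian machine.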
// B.c
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
const int x_offset = 10;
int main(int argc, char *argv[]) {
    // open the shared memory object created by A
    int shared_fd = shm_open("foobar2", O_RDWR | O_CREAT, 0777);
    void *shared_mem = mmap(NULL, 0x1000, PROT_EXEC | PROT_WRITE, MAP_SHARED, shared_fd, 0);
    // call the code that A copied into the shared memory
    int (*shared_foo)(int) = shared_mem;
    int z = shared_foo(13);
    printf("result: %d\n", z);
    // overwrite the constant x inside the shared code
    int *x_p = (int*)((char*)shared_mem + x_offset);
    *x_p = 100;
    // remove the name; A's existing mapping stays valid
    shm_unlink("foobar2");
}
Anyways, first I run A, then I run B. The output of B is:
result: 27
Then I go back to A and push enter, then I get:
shared_foo(3) = 103
Good enough for me.
To completely eliminate the mystique of all this, after running A you can do something like
xxd /dev/shm/foobar2 | vim -
Then, edit that constant 0e 00 00 00 just like before, then save the file with the ol'
:w !xxd -r > /dev/shm/foobar2
and push enter in A and see similar results as above.
Upvotes: 0
Reputation: 25599
Yes, you can do that, but the first process must have first created the shared memory via mmap and either a memory-mapped file, or a shared area created with shm_open.
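As a rough sketch of that setup with the error checks spelled out: the helper name map_shared_code and its create flag are made up for illustration. One thing worth knowing is that mmap refuses PROT_EXEC with EPERM when the backing filesystem is mounted noexec, which is common for /dev/shm inside containers, so the mmap check below is not just paranoia.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map the POSIX shared memory object `name` readable, writable and
 * executable. If `create` is nonzero the object is created and sized first
 * (the "first process" role); otherwise it must already exist.
 * Returns the mapping, or NULL after printing the failing call. */
void *map_shared_code(const char *name, size_t size, int create) {
    int flags = O_RDWR | (create ? O_CREAT : 0);
    int fd = shm_open(name, flags, 0600);
    if (fd == -1) {
        perror("shm_open");
        return NULL;
    }
    /* A freshly created object has length 0; give it the requested size. */
    if (create && ftruncate(fd, (off_t)size) == -1) {
        perror("ftruncate");
        close(fd);
        return NULL;
    }
    void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (mem == MAP_FAILED) {
        perror("mmap"); /* EPERM here usually means a noexec mount */
        return NULL;
    }
    return mem;
}
```

The creating process would call map_shared_code("/some-name", 0x1000, 1) and the second process the same with 0 as the last argument; on older glibc versions this needs -lrt at link time.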
If you are sharing compiled code then that's what shared libraries were created for. You can link against them in the ordinary way and the sharing will happen automatically, or you can load them manually using dlopen (e.g. for a plugin).
Update:
As the code has been generated by a compiler, you will have relocations to worry about. The compiler does not produce code that will Just Work anywhere. It expects that the .data section is in a certain place, and that the .bss section has been zeroed. The GOT will need to be populated. Any static constructors will have to be called.
In short, what you want is probably dlopen. This system allows you to open a shared library like it was a file, and then extract function pointers by name. Each program that dlopens the library will share the code sections, thus saving memory, but each will have its own copy of the data section, so they do not interfere with each other.
Beware that you need to compile your library code with -fPIC or else you won't get any code sharing either (actually, the linkers and dynamic loaders for many architectures probably don't support libraries that aren't PIC anyway).
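For reference, a minimal sketch of the dlopen route. The wrapper call_from_library and the unary_fn typedef are made up for illustration; the example in the usage note assumes a glibc system where the math library is installed as libm.so.6.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Signature of the functions this sketch knows how to call. */
typedef double (*unary_fn)(double);

/* Load the shared library `lib`, look up `symbol`, call it with `arg` and
 * store the result in *out. Returns 0 on success, -1 on any failure. */
int call_from_library(const char *lib, const char *symbol,
                      double arg, double *out) {
    void *handle = dlopen(lib, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    dlerror(); /* clear any stale error before dlsym */
    void *sym = dlsym(handle, symbol);
    const char *err = dlerror();
    if (err != NULL || sym == NULL) {
        fprintf(stderr, "dlsym: %s\n", err ? err : "symbol is NULL");
        dlclose(handle);
        return -1;
    }

    /* Converting void* to a function pointer through a cast is not strictly
     * ISO C, so go through an object pointer as the POSIX docs suggest. */
    unary_fn fn;
    *(void **)(&fn) = sym;
    *out = fn(arg);

    dlclose(handle);
    return 0;
}
```

For example, call_from_library("libm.so.6", "cos", 0.0, &r) should leave 1.0 in r. On older glibc versions, link with -ldl.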
Upvotes: 5
Reputation: 129524
It would be possible to do so, but that's exactly what shared libraries are for.
Also, beware that you need to check that the address of the shared memory is the same for both processes, otherwise any references that are "absolute" (that is, a pointer to something in the shared code) will point to the wrong place. And like with shared libraries, the bitness of the code will have to be the same, and as with all shared memory, you need to make sure that you don't "mess up" for the other process if you modify any of the shared memory.
Determining the size of a function ranges from "hard" to "nearly impossible", depending on the actual code generated, and the level of information you have available. Debug symbols will have the size of a function, but beware that I have seen compilers generate code where two functions share the same "return" piece of code (that is, the compiler generates a jump to another function that has the same bit of code to return the result, because it saves a few bytes of code, and there was already going to be a jump anyway [e.g. there is an if/else that the compiler has to jump around]).
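When the binary has not been stripped, the ELF symbol table also records a size (st_size) for each function symbol, the same number that nm -S prints. Below is a rough sketch of reading it, assuming Linux and a well-formed ELF file of the same bitness as the running program. The helper name elf_function_size is made up for illustration, and it does no bounds checking on the file. The caveat above still applies: the recorded size says nothing about code the function jumps to elsewhere.

```c
#include <elf.h>
#include <fcntl.h>
#include <link.h>      /* ElfW() selects Elf32_* or Elf64_* for this platform */
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return the st_size recorded in .symtab for the function `name` in the ELF
 * file at `path`, or 0 if it cannot be found (e.g. the file is stripped). */
size_t elf_function_size(const char *path, const char *name) {
    size_t result = 0;
    struct stat st;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return 0;
    if (fstat(fd, &st) != 0 || (size_t)st.st_size < sizeof(ElfW(Ehdr))) {
        close(fd);
        return 0;
    }
    unsigned char *base = mmap(NULL, (size_t)st.st_size, PROT_READ,
                               MAP_PRIVATE, fd, 0);
    close(fd);
    if (base == MAP_FAILED)
        return 0;

    const ElfW(Ehdr) *eh = (const ElfW(Ehdr) *)base;
    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) == 0 && eh->e_shoff != 0) {
        const ElfW(Shdr) *sh = (const ElfW(Shdr) *)(base + eh->e_shoff);
        for (int i = 0; i < eh->e_shnum && result == 0; i++) {
            if (sh[i].sh_type != SHT_SYMTAB)
                continue;
            /* sh_link of a symbol table names its string table section. */
            const ElfW(Sym) *syms = (const ElfW(Sym) *)(base + sh[i].sh_offset);
            const char *strtab = (const char *)(base + sh[sh[i].sh_link].sh_offset);
            size_t count = sh[i].sh_size / sizeof(ElfW(Sym));
            for (size_t j = 0; j < count; j++) {
                /* The symbol type lives in the low 4 bits of st_info. */
                if ((syms[j].st_info & 0xf) == STT_FUNC &&
                    strcmp(strtab + syms[j].st_name, name) == 0) {
                    result = (size_t)syms[j].st_size;
                    break;
                }
            }
        }
    }
    munmap(base, (size_t)st.st_size);
    return result;
}
```

A program can inspect itself with elf_function_size("/proc/self/exe", "foo"), as long as foo has external linkage or the compiler otherwise kept a symbol for it.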
Upvotes: 2
Reputation: 1
The standard approach is to put the code of f() in a shared library libfoo.so. Then you could either link to that library (e.g. by building program A with gcc -Wall a.c -lfoo -o a.bin), or load it dynamically (e.g. in program B) using dlopen(3) then retrieving the address of f using dlsym.
When you compile a shared library you want to:
- compile each source file foo1.c with gcc -Wall -fPIC -c foo1.c -o foo1.pic.o into position independent code, and likewise for foo2.c into foo2.pic.o
- link all of them into libfoo.so with gcc -Wall -shared foo*.pic.o -o libfoo.so; notice that you can link additional shared libraries into libfoo.so (e.g. by appending -lm to the linking command)
See also the Program Library Howto.
You could play insane tricks by mmap-ing some other /proc/1234/mem but that is not reasonable at all. Use shared libraries.
PS. You can dlopen a great many (hundreds of thousands) of shared object lib*.so files; you may want to dlclose them (but in practice you don't have to).
Upvotes: 4