Zbigh1

Reputation: 379

Linux: is it possible to share code between processes?

I wonder if it's possible for a linux process to call code located in the memory of another process?

Let's say we have a function f() in process A and we want process B to call it. My idea is to use mmap with the MAP_SHARED and PROT_EXEC flags to map the memory containing the function's code and pass the pointer to B, assuming that f() does not call any other function from A's binary. Will this ever work? If so, how do I determine the size of f() in memory?

=== EDIT ===

I know that shared libraries do exactly that, but I wonder whether it's possible to share code between processes dynamically.

Upvotes: 4

Views: 1855

Answers (4)

Nathan Chappell

Reputation: 2446

  • not directly
  • that's what shared libraries are for
  • relocations

Oh no! Anyways...

Here's the insane, unreasonable, not-good, purely academic demonstration of this capability. It was fun for me, I hope it's fun for you.

Overview

Program A will use shm_open to create a shared memory object, and mmap to map it into its address space. Then it will copy the code of a function defined in A into the shared memory. Program B will then open the shared memory, execute the function, and, just for kicks, make a very simple modification to the code. Finally, A will execute the code to demonstrate that the change took effect.

Again, this is no recommendation for how to solve a problem, it's an academic demonstration.

// A.c
#include <stdio.h>
#include <string.h>

#include <unistd.h>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>

int foo(int y) {
  int x = 14;
  return x + y;
}

int main(int argc, char *argv[]) {
  const size_t mem_size = 0x1000;
  // create shared memory objects
  int shared_fd = shm_open("foobar2", O_RDWR | O_CREAT, 0777);
  ftruncate(shared_fd, mem_size);
  void *shared_mem =
      mmap(NULL, mem_size, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_SHARED, shared_fd, 0);
  // copy function to shared memory
  const size_t fn_size = 24;
  memcpy(shared_mem, &foo, fn_size);
  // wait
  getc(stdin);
  // execute the shared function
  int(*shared_foo)(int) = shared_mem;
  printf("shared_foo(3) = %d\n", shared_foo(3));
  // clean up
  shm_unlink("foobar2");
}

Note the use of PROT_READ | PROT_WRITE | PROT_EXEC in the call to mmap. This program is compiled with

gcc A.c -lrt -o A

The constant fn_size was determined by looking at the output of objdump -dj .text A

...
000000000000088a <foo>:
 88a:   55                      push   %rbp
 88b:   48 89 e5                mov    %rsp,%rbp
 88e:   89 7d ec                mov    %edi,-0x14(%rbp)
 891:   c7 45 fc 0e 00 00 00    movl   $0xe,-0x4(%rbp)
 898:   8b 55 fc                mov    -0x4(%rbp),%edx
 89b:   8b 45 ec                mov    -0x14(%rbp),%eax
 89e:   01 d0                   add    %edx,%eax
 8a0:   5d                      pop    %rbp
 8a1:   c3                      retq   
...

That's 24 bytes: the listing starts at 0x88a and ends with the one-byte retq at 0x8a1, and 0x8a2 - 0x88a = 0x18. Copying more than that would do the same thing. Copying less would cut the function off before retq, and I'd probably get an exception from the processor. Also, note that the value of x from foo (14, which is 0e 00 00 00 in little-endian) is located at foo + 10. This will be the constant x_offset in program B.
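The two numbers used here (24 for the size and 10 for the offset of the constant) can be checked without executing anything. The following is only a sketch: it treats the bytes from the objdump listing as plain data, and the names foo_code, read_le32 and write_le32 are made up for this illustration. It also decodes the constant byte by byte instead of casting to int*, since offset 10 is not 4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Machine code of foo(), copied byte for byte from the objdump listing. */
const uint8_t foo_code[] = {
    0x55,                                     /* push %rbp             */
    0x48, 0x89, 0xe5,                         /* mov  %rsp,%rbp        */
    0x89, 0x7d, 0xec,                         /* mov  %edi,-0x14(%rbp) */
    0xc7, 0x45, 0xfc, 0x0e, 0x00, 0x00, 0x00, /* movl $0xe,-0x4(%rbp)  */
    0x8b, 0x55, 0xfc,                         /* mov  -0x4(%rbp),%edx  */
    0x8b, 0x45, 0xec,                         /* mov  -0x14(%rbp),%eax */
    0x01, 0xd0,                               /* add  %edx,%eax        */
    0x5d,                                     /* pop  %rbp             */
    0xc3,                                     /* retq                  */
};

/* Hypothetical helper: read the 32-bit little-endian value at code[offset]. */
uint32_t read_le32(const uint8_t *code, size_t len, size_t offset) {
    assert(offset + 4 <= len);
    return (uint32_t)code[offset]
         | (uint32_t)code[offset + 1] << 8
         | (uint32_t)code[offset + 2] << 16
         | (uint32_t)code[offset + 3] << 24;
}

/* Hypothetical helper: store value as 32-bit little-endian at code[offset]. */
void write_le32(uint8_t *code, size_t len, size_t offset, uint32_t value) {
    assert(offset + 4 <= len);
    for (size_t i = 0; i < 4; i++) {
        code[offset + i] = (uint8_t)(value >> (8 * i));
    }
}
```

Here sizeof foo_code is the fn_size used in A, and read_le32(foo_code, sizeof foo_code, 10) reads back the immediate operand of the movl instruction. Calling write_le32 on a writable copy of the bytes is the same patch that B applies to the shared mapping.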

// B.c
#include <stdio.h>

#include <unistd.h>

#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>

const int x_offset = 10;

int main(int argc, char *argv[]) {
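  // open the shared memory object that A created
  int shared_fd = shm_open("foobar2", O_RDWR | O_CREAT, 0777);
  // PROT_READ is requested explicitly; B both executes and patches the mapping
  void *shared_mem = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_SHARED, shared_fd, 0);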
  int (*shared_foo)(int) = shared_mem;
  int z = shared_foo(13);
  printf("result: %d\n", z);
  int *x_p = (int*)((char*)shared_mem + x_offset);
  *x_p = 100;
  // no shm_unlink here: A unlinks "foobar2" itself after its final call
}

Anyways first I run A, then I run B. The output of B is:

result: 27

Then I go back to A and push enter, then I get:

shared_foo(3) = 103

Good enough for me.

/dev/shm/foobar2

To completely eliminate the mystique of all this, after running A you can do something like

xxd /dev/shm/foobar2 | vim -

Then edit that constant 0e 00 00 00 just like before, and save the file with the ol'

:w !xxd -r > /dev/shm/foobar2

and push enter in A and see similar results as above.

Upvotes: 0

ams

Reputation: 25599

Yes, you can do that, but the first process must have first created the shared memory via mmap and either a memory-mapped file, or a shared area created with shm_open.

If you are sharing compiled code then that's what shared libraries were created for. You can link against them in the ordinary way and the sharing will happen automatically, or you can load them manually using dlopen (e.g. for a plugin).


Update:

Because the code has been generated by a compiler, you will have relocations to worry about. The compiler does not produce code that will Just Work anywhere. It expects that the .data section is in a certain place, and that the .bss section has been zeroed. The GOT will need to be populated. Any static constructors will have to be called.

In short, what you want is probably dlopen. This system allows you to open a shared library like it was a file, and then extract function pointers by name. Each program that dlopens the library will share the code sections, thus saving memory, but each will have its own copy of the data section, so they do not interfere with each other.
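A minimal sketch of that dlopen/dlsym pattern might look like the following. The helper name lookup_symbol is invented for this example. Passing NULL as the path is a standard way to ask dlopen for the running program and the libraries it already has loaded, which keeps the sketch independent of any particular library file.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: open lib_path (NULL means the running program plus
 * the libraries it already has loaded) and look up a symbol by name.
 * Returns NULL if either step fails.
 */
void *lookup_symbol(const char *lib_path, const char *name) {
    assert(name != NULL);

    void *handle = dlopen(lib_path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NULL;
    }

    dlerror(); /* clear any stale error before calling dlsym */
    void *sym = dlsym(handle, name);
    const char *err = dlerror();
    if (err != NULL) {
        fprintf(stderr, "dlsym failed: %s\n", err);
        dlclose(handle);
        return NULL;
    }

    /* The handle stays open on purpose: closing it could unmap the code. */
    return sym;
}
```

With a hypothetical library ./libfoo.so that exports int f(int), the call would be something like lookup_symbol("./libfoo.so", "f"), with the result cast to int (*)(int) before use. On older glibc versions the program has to be linked with -ldl.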

Beware that you need to compile your library code with -fPIC or else you won't get any code sharing either (actually, the linkers and dynamic loaders for many architectures probably don't support libraries that aren't PIC anyway).

Upvotes: 5

Mats Petersson

Reputation: 129524

It would be possible to do so, but that's exactly what shared libraries are for.

Also, beware that you need to check that the address of the shared memory is the same in both processes, otherwise any references that are "absolute" (that is, a pointer to something in the shared code) will point to the wrong place. And as with shared libraries, the bitness of the code will have to be the same, and as with all shared memory, you need to make sure that you don't "mess up" the other process if you modify any of the shared memory.
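The address issue can be seen even inside a single process. The sketch below is a simplification (one process instead of two, and a hypothetical helper named map_twice): it maps the same file descriptor twice with MAP_SHARED. Both views show the same bytes, but the kernel places them at different addresses, so an absolute pointer taken from one view is wrong for the other, while an offset from the start of the mapping stays valid.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the first `size` bytes of fd twice as shared,
 * readable and writable memory. Returns 0 on success and -1 on failure.
 */
int map_twice(int fd, size_t size, void **first, void **second) {
    assert(first != NULL && second != NULL);

    *first = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (*first == MAP_FAILED) {
        return -1;
    }

    *second = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (*second == MAP_FAILED) {
        munmap(*first, size);
        return -1;
    }
    return 0;
}
```

For a quick experiment, the descriptor can come from fileno(tmpfile()) after an ftruncate to the mapping size. A string written through the first pointer is then readable through the second, even though the two pointers differ.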

Determining the size of a function ranges from "hard" to "nearly impossible", depending on the actual code generated and the level of information you have available. Debug symbols will have the size of a function, but beware that I have seen compilers generate code where two functions share the same "return" piece of code (that is, the compiler generates a jump to another function that has the same bit of code to return the result, because it saves a few bytes of code, and there was already going to be a jump anyway [e.g. there is an if/else that the compiler has to jump around]).

Upvotes: 2

The standard approach is to put the code of f() in a shared library libfoo.so. Then you could either link to that library (e.g. by building program A with gcc -Wall a.c -lfoo -o a.bin), or load it dynamically (e.g. in program B) using dlopen(3) then retrieving the address of f using dlsym.

When you compile a shared library you want to:

  • compile each source file foo1.c with gcc -Wall -fPIC -c foo1.c -o foo1.pic.o into position independent code, and likewise for foo2.c into foo2.pic.o
  • link all of them into libfoo.so with gcc -Wall -shared foo*.pic.o -o libfoo.so; notice that you can link additional shared libraries into libfoo.so (e.g. by appending -lm to the linking command)

See also the Program Library Howto.

You could play insane tricks by mmap-ing some other process's /proc/1234/mem, but that is not reasonable at all. Use shared libraries.

PS. You can dlopen a great many (hundreds of thousands of) shared object lib*.so files; you may want to dlclose them (but practically you don't have to).

Upvotes: 4
