dubbaluga

Reputation: 2343

Writing a custom loader in C and assembly for x64 on Linux

I'd like to write my own loader for binary code on x64 Linux. Eventually I want to perform the linking step myself so that I can call code from .o object files. For now, though, I just want to call a function from an executable that has already been linked.

To create a function that can be called from "outside", I started with the following source code:

void foo(void)
{
  int a = 2;
  int b = 3;
  a + b;
}

int main(void)
{
  foo();
  return 0;
}

It's the foo() function I want to call using my loader. Using the following chain of commands

gcc -o /tmp/main main.c
strip -s /tmp/main
objdump -D /tmp/main

I obtained the assembly code of the foo() function, which looks like this:

...
0000000000001125 <foo>:
    1125:   55                      push   %rbp
    1126:   48 89 e5                mov    %rsp,%rbp
    1129:   c7 45 fc 02 00 00 00    movl   $0x2,-0x4(%rbp)
    1130:   c7 45 f8 03 00 00 00    movl   $0x3,-0x8(%rbp)
    1137:   90                      nop
    1138:   5d                      pop    %rbp
    1139:   c3                      retq
...

That means that the foo() function starts at offset 0x1125 in the main binary. I verified this using a hex editor.

The following is my loader. There is no error handling yet and the code is very ugly, but it should demonstrate what I want to achieve:
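The hex editor check can also be done in code. The following is only a sketch: bytes_match_at is a hypothetical helper name (not from any library), and it assumes the address printed by objdump is also the file offset, which happens to hold for this binary but is not guaranteed in general.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: compares n bytes at file offset `offset` with
 * `expected`. Returns 1 on a match, 0 on a mismatch, -1 on an I/O error
 * or if n is larger than the internal 64-byte buffer. */
int bytes_match_at(const char *path, long offset,
                   const unsigned char *expected, size_t n)
{
    unsigned char buf[64];
    FILE *f;
    size_t got;

    if (n > sizeof buf)
        return -1;

    f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    if (fseek(f, offset, SEEK_SET) != 0) {
        fclose(f);
        return -1;
    }

    got = fread(buf, 1, n, f);   /* a short read means the offset is too close to EOF */
    fclose(f);
    if (got != n)
        return -1;

    return memcmp(buf, expected, n) == 0;
}
```

For the disassembly above, the first four bytes of foo() are 55 48 89 e5, so calling bytes_match_at("/tmp/main", 0x1125, prologue, 4) with prologue holding those bytes should report a match if the offset is right.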

#include <stdio.h>
#include <stdlib.h>

typedef void(*voidFunc)(void);

int main(int argc, char* argv[])
{
  FILE *fileptr;
  char *buffer;
  long filelen;
  voidFunc mainFunc;

  fileptr = fopen(argv[1], "rb");  // Open the file in binary mode
  fseek(fileptr, 0, SEEK_END);          // Jump to the end of the file
  filelen = ftell(fileptr);             // Get the current byte offset in the file
  rewind(fileptr);                      // Jump back to the beginning of the file

  buffer = (char *)malloc((filelen+1)*sizeof(char)); // Enough memory for file + \0
  fread(buffer, filelen, 1, fileptr); // Read in the entire file
  fclose(fileptr); // Close the file

  mainFunc = ((voidFunc)(buffer + 0x1125));

  mainFunc();

  free(buffer);

  return 0;
}

Running this program as objloader /tmp/main results in a segmentation fault.

The mainFunc variable points to the correct place. I verified this using gdb.

Is it a problem that the machine code lives on the heap? I deliberately made the function I want to call as simple as possible (no side effects, no stack or register requirements for function parameters, ...). Still, there is something I don't really get.

Can anyone please point me in the right direction here? Any hints on helpful literature on this topic are also highly appreciated!
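A rough way to check the heap suspicion is to look up the permissions of a malloc() block in /proc/self/maps. This is a sketch under assumptions: region_perms and heap_is_executable are hypothetical helper names, and /proc/self/maps is Linux-specific.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies the permission string (for example "rw-p")
 * of the mapping containing addr into perms. Returns 0 on success and
 * -1 if /proc/self/maps cannot be read or no mapping contains addr. */
int region_perms(const void *addr, char perms[5])
{
    FILE *maps = fopen("/proc/self/maps", "r");
    char line[512];
    unsigned long start, end;
    char p[5];
    int found = -1;

    if (maps == NULL)
        return -1;

    /* Each line starts with "start-end perms ...", addresses in hex. */
    while (fgets(line, sizeof line, maps) != NULL) {
        if (sscanf(line, "%lx-%lx %4s", &start, &end, p) != 3)
            continue;
        if ((uintptr_t)addr >= start && (uintptr_t)addr < end) {
            memcpy(perms, p, sizeof p);
            found = 0;
            break;
        }
    }
    fclose(maps);
    return found;
}

/* Returns 1 if a fresh malloc() block is executable, 0 if it is not,
 * and -1 if the check itself failed. */
int heap_is_executable(void)
{
    char perms[5];
    void *block = malloc(64);
    int rc;

    if (block == NULL)
        return -1;
    rc = region_perms(block, perms);
    free(block);
    if (rc != 0)
        return -1;
    return perms[2] == 'x';
}
```

If heap_is_executable() returns 0, jumping into a malloc() buffer would be expected to fault no matter how simple the target function is.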

Upvotes: 3

Views: 2251

Answers (2)

dubbaluga

Reputation: 2343

This is the final version of my 'loader', based on Nicholas Pipitone's answer. Again: no error handling, simplified, and ignoring the fact that real-world scenarios are much more difficult:

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

#include <stdlib.h>

typedef void(*voidFunc)(void);

int main(int argc, char* argv[])
{
  char* buffer;
  voidFunc mainFunc;
  struct stat myfilestats;
  int fd;

  fd = open(argv[1], O_RDONLY);
  fstat(fd, &myfilestats);
  buffer = mmap(NULL, myfilestats.st_size, PROT_EXEC, MAP_PRIVATE, fd, 0);
  close(fd);

  mainFunc = ((voidFunc)(buffer + 0x1125));

  mainFunc();

  munmap(buffer, myfilestats.st_size);

  return EXIT_SUCCESS;
}
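For comparison, here is a sketch of the same idea with the error checks filled in. call_at_offset is a hypothetical helper name, and PROT_READ is added next to PROT_EXEC as a portability assumption. It still trusts that the offset points at a function taking no arguments and needing no relocation.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

typedef void (*voidFunc)(void);

/* Hypothetical helper: maps `path` read+execute and calls the code at
 * file offset `offset`. Returns 0 if the call was made, -1 if opening,
 * stat-ing or mapping the file failed or the offset is out of range. */
int call_at_offset(const char *path, off_t offset)
{
    struct stat st;
    char *image;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    if (fstat(fd, &st) != 0 || offset < 0 || offset >= st.st_size) {
        close(fd);
        return -1;
    }

    image = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_EXEC,
                 MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close() */
    if (image == MAP_FAILED)
        return -1;

    ((voidFunc)(image + offset))(); /* jump into the mapped file */

    munmap(image, (size_t)st.st_size);
    return 0;
}
```

With this helper, the body of main() above would reduce to something like call_at_offset(argv[1], 0x1125).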

Upvotes: 1

Nicholas Pipitone

Reputation: 4192

To make the buffer memory region executable, you will have to use mmap. Try

#include <sys/mman.h>
...
buffer = (char *)mmap(NULL, filelen /* + 1? Not sure why. */, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);

That should give the memory region the permissions you want and work with the surrounding code. In fact, if you want to use mmap the way it was meant to be used, go for

int fd = open(argv[1], O_RDONLY);
struct stat myfilestats;
fstat(fd, &myfilestats);
buffer = (char*)mmap(NULL, myfilestats.st_size, PROT_EXEC, MAP_PRIVATE, fd, 0);
close(fd);
...
munmap(buffer, myfilestats.st_size);

Using MAP_ANONYMOUS leaves the memory region unassociated with any file descriptor, but if the region represents a file, it should be associated with that file's descriptor. When you do this, Linux can do all kinds of cool tricks: it loads only the parts of the file that you actually end up accessing (this lazy loading also keeps the program responsive when the file is large), and if multiple programs map the same file, they can all share the same physical memory.
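For the first (anonymous mapping) variant, a minimal sketch could look like the following. run_code_copy is a hypothetical helper name, and switching the mapping from writable to executable with mprotect, rather than keeping it writable and executable at once, is my own choice here. It assumes x86-64, where no explicit instruction-cache flush is needed after the copy.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <string.h>
#include <sys/mman.h>

typedef void (*voidFunc)(void);

/* Hypothetical helper: copies len bytes of machine code into a fresh
 * anonymous mapping, switches it from read+write to read+execute, and
 * calls it. Returns 0 if the call was made, -1 on failure. */
int run_code_copy(const unsigned char *code, size_t len)
{
    void *mem;

    if (code == NULL || len == 0)
        return -1;

    mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    memcpy(mem, code, len);

    /* Drop write permission before executing. */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return -1;
    }

    ((voidFunc)mem)();

    munmap(mem, len);
    return 0;
}
```

In the question's loader this would replace the malloc/fread pair: read the file into an ordinary buffer first, then pass the bytes starting at offset 0x1125 to run_code_copy.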

Upvotes: 5
