Fayeure

Reputation: 1369

Duplicating a function containing static variables in memory

I'm experimenting with function pointers, and I'm trying to duplicate a function in memory. To do that I use mmap to get executable memory, then memcpy to copy the function to the new address. What I want to achieve is a new instance of the static variable inside my function. I must have done something wrong when copying the function, because calling the copy segfaults. Here is the code:

#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

int     g_pagesize;

void    foo(void)
{
    static int i = 0;

    i++;
    printf("%d\n", i);
}

void    bar(void)
{
    printf("some stuff\n");
}

int main(void)
{
    void    (*fp1)(void), (*fp2)(void);

    g_pagesize = getpagesize();
    fp1 = mmap(NULL, g_pagesize, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    fp2 = mmap(NULL, g_pagesize, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    memcpy(fp1, foo, bar - foo);
    memcpy(fp2, foo, bar - foo);
    for (int i = 0; i < 5; i++)
        (*fp1)();
    for (int i = 0; i < 5; i++)
        (*fp2)();
    return (0);
}

/*
** Wanted output:
** 1
** 2
** 3
** 4
** 5
** 1
** 2
** 3
** 4
** 5
*/

Upvotes: 1

Views: 154

Answers (1)

msbit

Reputation: 4320

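As noted in the comments, this won't work: there is only one instance of the variable i. Like every static variable, it lives in the program's data sections rather than inside the function's code (.bss in this case, because it is zero-initialised; statics with a non-zero initialiser go in .data). Thus, even if the code for foo were copied in such a way that it could run without causing a segmentation fault, both copies would share the same i, and the output would simply count from 1 through 10.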
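If the underlying goal is simply to have independent counters, one common alternative is to pass the state in explicitly instead of hiding it in a static. The following is a minimal sketch of that idea, not something taken from the question; the names struct counter and counter_next are hypothetical.

```c
#include <stdio.h>

/* Hypothetical type: holds the state that `static int i` held inside foo. */
struct counter {
    int i;
};

/* Same behaviour as foo, but operating on caller-provided state.
 * Returns the new value so that callers can inspect it. */
int counter_next(struct counter *c)
{
    c->i++;
    printf("%d\n", c->i);
    return c->i;
}
```

Two zero-initialised struct counter objects then count independently: calling counter_next five times on each prints 1 through 5 twice, with no need to copy any machine code.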

The more interesting part of this question is the segmentation fault itself, and why it occurs.

To consider what's happening, let's look at some (relatively unoptimised) x86-64 assembly output for the foo function, with the original source interleaved:

00000000000011c9 <foo>:
# void    foo(void)
# {
    11c9:       f3 0f 1e fa             endbr64
    11cd:       55                      push   rbp
    11ce:       48 89 e5                mov    rbp,rsp
#     static int i = 0;
#
#     i++;
    11d1:       8b 05 3d 2e 00 00       mov    eax,DWORD PTR [rip+0x2e3d]        # 4014 <i.3666>
    11d7:       83 c0 01                add    eax,0x1
    11da:       89 05 34 2e 00 00       mov    DWORD PTR [rip+0x2e34],eax        # 4014 <i.3666>
#     printf("%d\n", i);
    11e0:       8b 05 2e 2e 00 00       mov    eax,DWORD PTR [rip+0x2e2e]        # 4014 <i.3666>
    11e6:       89 c6                   mov    esi,eax
    11e8:       48 8d 3d 15 0e 00 00    lea    rdi,[rip+0xe15]        # 2004 <_IO_stdin_used+0x4>
    11ef:       b8 00 00 00 00          mov    eax,0x0
    11f4:       e8 b7 fe ff ff          call   10b0 <printf@plt>
# }
    11f9:       90                      nop
    11fa:       5d                      pop    rbp
    11fb:       c3                      ret

At address 0x11d1 the relative form of the MOV instruction is used to load the current value of i into the eax register. As you can see, the mnemonic is of the form:

mov    eax,DWORD PTR [rip+<offset>]

which can be read as:

  • calculate an address based on the current instruction pointer (rip) plus the provided offset, then
  • copy the 32 bit value (DWORD) stored there into the eax register

This offset value (0x2e3d) is fixed at build time, by the compiler and linker, so that when the instruction is executed, rip is precisely 0x2e3d less than the location in memory of the static i variable. Note that during execution rip holds the address of the next instruction (0x11d7 here), and 0x11d7 + 0x2e3d = 0x4014, the address of i. When making a copy of foo in a dynamic location with mmap and memcpy as you have done, the copied offset remains the same, but the value of rip when that instruction is executed will be significantly different.
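The address arithmetic can be checked with a small helper. This is only an illustrative sketch: rip_relative_target is a hypothetical name, and the instruction length (6 bytes for each of the three mov instructions that touch i) is read off the disassembly above.

```c
#include <stdint.h>

/* Address that a RIP-relative operand resolves to.
 * While an instruction executes, rip holds the address of the NEXT
 * instruction, i.e. the instruction's own address plus its length. */
uint64_t rip_relative_target(uint64_t insn_addr, uint64_t insn_len, int32_t disp)
{
    return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}
```

With the values from the listing, rip_relative_target(0x11d1, 6, 0x2e3d) gives 0x4014. If the same bytes are copied to some other base address, the result moves by exactly the distance between the copy and the original, so it no longer points at i.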

More importantly, the calculated address of rip+0x2e3d may end up in a region of memory not marked as available for the process, and accessing this address will cause the segmentation fault. Subsequent accesses at 0x11da and 0x11e0 will cause the same issue.

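Continuing on, even if you were to change the static variable i to a local stack variable, you would still run into issues. At address 0x11f4 the relative form of the CALL instruction is used to invoke printf: the encoded displacement is 0xfffffeb7 (that is, -0x149), which, added to the address of the next instruction (0x11f9), gives the target 0x10b0, the printf@plt entry. Executed from the copy, the same displacement yields a different and almost certainly invalid target, which will also cause a segmentation fault, for the same reason.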
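To see where the CALL actually lands, the four bytes following the e8 opcode can be decoded as a little-endian signed 32-bit displacement and added to the address of the next instruction. The sketch below does that; decode_rel32 and call_target are hypothetical helper names, and the conversion to int32_t assumes the usual two's complement behaviour.

```c
#include <stdint.h>

/* Decode a little-endian rel32 displacement, as found after the E8 opcode. */
int32_t decode_rel32(const uint8_t b[4])
{
    uint32_t u = (uint32_t)b[0]
               | (uint32_t)b[1] << 8
               | (uint32_t)b[2] << 16
               | (uint32_t)b[3] << 24;
    return (int32_t)u; /* assumes two's complement wrap-around */
}

/* Target of a 5-byte "call rel32" (E8 + 4 displacement bytes) located at
 * call_addr: the displacement is relative to the next instruction. */
uint64_t call_target(uint64_t call_addr, const uint8_t rel[4])
{
    return call_addr + 5 + (uint64_t)(int64_t)decode_rel32(rel);
}
```

For the bytes b7 fe ff ff from the listing, decode_rel32 returns -0x149 and call_target(0x11f4, ...) returns 0x10b0. So 0x10b0 is the resulting target rather than the encoded offset, and the target shifts whenever the instruction itself is moved.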

If you were able to rectify that, there is one more relative addressing calculation in the foo function, at address 0x11e8, which uses the LEA instruction to calculate the address of the "%d\n" format string. LEA only computes an address and does not access memory, so the instruction itself will not fault. However, any later access through that calculated address, such as printf reading the format string, will likely also cause a segmentation fault.

Upvotes: 3
