Reputation: 1369
I'm experimenting with function pointers, and what I'm trying to do is duplicate a function in memory. To do that I'm using mmap to get executable memory, then memcpy to copy the function to my new pointer. What I want to achieve is a new instance of the static variable inside my function. I must have done something wrong when copying the function, because calling it segfaults. Here is the code:
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

int g_pagesize;

void foo(void)
{
    static int i = 0;

    i++;
    printf("%d\n", i);
}

void bar(void)
{
    printf("some stuff\n");
}

int main(void)
{
    void (*fp1)(void), (*fp2)(void);

    g_pagesize = getpagesize();
    fp1 = mmap(NULL, g_pagesize, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    fp2 = mmap(NULL, g_pagesize, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    memcpy(fp1, foo, bar - foo);
    memcpy(fp2, foo, bar - foo);
    for (int i = 0; i < 5; i++)
        (*fp1)();
    for (int i = 0; i < 5; i++)
        (*fp2)();
    return (0);
}
/*
** Wanted output:
** 1
** 2
** 3
** 4
** 5
** 1
** 2
** 3
** 4
** 5
*/
Upvotes: 1
Views: 154
Reputation: 4320
As noted in the comments, this won't work: there is only one instance of the i variable, stored in the .bss segment, as is the case for zero-initialised static variables. Thus, even if the code for foo were copied in such a way that it could run without causing a segmentation fault, the two copies would share that one variable and the output would simply count from 1 through 10.
The more interesting part of this question is the segmentation fault, and why it occurs.
To see what's happening, let's look at some (relatively unoptimised) x86-64 assembly output for the foo function, with the original source interleaved:
00000000000011c9 <foo>:
# void foo(void)
# {
    11c9:  f3 0f 1e fa             endbr64
    11cd:  55                      push   rbp
    11ce:  48 89 e5                mov    rbp,rsp
# static int i = 0;
#
# i++;
    11d1:  8b 05 3d 2e 00 00       mov    eax,DWORD PTR [rip+0x2e3d]   # 4014 <i.3666>
    11d7:  83 c0 01                add    eax,0x1
    11da:  89 05 34 2e 00 00       mov    DWORD PTR [rip+0x2e34],eax   # 4014 <i.3666>
# printf("%d\n", i);
    11e0:  8b 05 2e 2e 00 00       mov    eax,DWORD PTR [rip+0x2e2e]   # 4014 <i.3666>
    11e6:  89 c6                   mov    esi,eax
    11e8:  48 8d 3d 15 0e 00 00    lea    rdi,[rip+0xe15]              # 2004 <_IO_stdin_used+0x4>
    11ef:  b8 00 00 00 00          mov    eax,0x0
    11f4:  e8 b7 fe ff ff          call   10b0 <printf@plt>
# }
    11f9:  90                      nop
    11fa:  5d                      pop    rbp
    11fb:  c3                      ret
At address 0x11d1, the RIP-relative form of the MOV instruction is used to load the current value of i into the eax register. As you can see, the instruction has the form:
mov eax,DWORD PTR [rip+<offset>]
which can be read as:
1. Calculate an address by taking the current value of the instruction pointer (rip) plus the provided offset, then
2. Load the 4-byte value (DWORD) stored there into the eax register

This offset value (0x2e3d) is calculated at build time so that, when the instruction is executed, rip is precisely 0x2e3d less than the location in memory of the static i variable. (While the instruction executes, rip holds the address of the next instruction, here 0x11d7, and 0x11d7 + 0x2e3d = 0x4014.) When you make a copy of foo in a dynamic location with mmap and memcpy, as you have done, the copied offset remains the same, but the value of rip when that instruction is executed will be significantly different.
More importantly, the calculated address rip+0x2e3d may end up in a region of memory that is not mapped for the process, and accessing this address will cause the segmentation fault. The subsequent accesses at 0x11da and 0x11e0 will cause the same issue.
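Continuing on, even if you were to change the static variable i to a local stack variable, you would still run into issues: at address 0x11f4 the relative form of the CALL instruction is used to invoke printf. The target, the PLT stub at 0x10b0, is not stored as an absolute address but as a displacement of -0x149 (the bytes b7 fe ff ff) from the next instruction at 0x11f9. Executed from the copied location, the call lands somewhere else entirely, which will most likely also cause a segmentation fault, for the same reason.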
If you were able to rectify that, there is one more relative addressing calculation in the foo function, at address 0x11e8, which uses the LEA instruction to calculate the address of the "%d\n" format string. The calculation itself is fine; however, any access to that calculated address will likely also cause a segmentation fault.
Upvotes: 3