Reputation: 15109
I'm writing a kernel module that intercepts syscalls. Intercepting, or rather just replacing the real syscall address with a fake syscall address, is as easy as 1-2-3 in plain C. But I'd like to know how that works at a low level.
(let's pretend I'm on x86)
First of all, I'm doing just a basic test: I'm kalloc-ating a small chunk of executable memory and filling it with this opcode:
0xB8, 0x00, 0x00, 0x00, 0x00, //mov eax, &real_syscall_function;
0xFF, 0xE0, //jmp eax;
Inserting the module and replacing the syscall works just perfect.
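For illustration, here is a minimal user-space sketch of how the placeholder zeros in the stub above might be patched with the real address. It assumes 32-bit x86 (so the address fits in the imm32 of `mov eax`), and the helper name `build_jmp_stub` is hypothetical, not something from my module. It only fills a byte buffer; it does not allocate executable memory or touch the syscall table.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 5 bytes for "mov eax, imm32" plus 2 bytes for "jmp eax". */
#define STUB_SIZE 7

/*
 * Hypothetical helper: copies the stub into buf and patches the
 * 32-bit target address into the immediate operand of the mov.
 * Returns the number of bytes written.
 */
static size_t build_jmp_stub(uint8_t *buf, uint32_t target)
{
    static const uint8_t stub_template[STUB_SIZE] = {
        0xB8, 0x00, 0x00, 0x00, 0x00, /* mov eax, imm32 (placeholder) */
        0xFF, 0xE0                    /* jmp eax */
    };

    memcpy(buf, stub_template, STUB_SIZE);

    /* x86 stores the immediate little-endian, right after the opcode. */
    buf[1] = (uint8_t)(target & 0xFF);
    buf[2] = (uint8_t)((target >> 8) & 0xFF);
    buf[3] = (uint8_t)((target >> 16) & 0xFF);
    buf[4] = (uint8_t)((target >> 24) & 0xFF);

    return STUB_SIZE;
}
```

Writing the bytes one by one keeps the sketch independent of the host's endianness, which is convenient when trying it out in user space.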
Now, according to this SO answer, arguments are passed in the registers. I want to check this, so I create an executable chunk of memory and fill it with this code:
0x55, //push ebp;
0x89, 0xE5, //mov ebp, esp;
0x83, 0xEC, 0x20, //sub esp, 32;
0xB8, 0x00, 0x00, 0x00, 0x00, //mov eax, &real_syscall_function;
0xFF, 0xE0, //jmp eax;
0x89, 0xEC, //mov esp, ebp;
0x5D, //pop ebp;
0xC3 //ret;
This should work too, as I'm not touching any of the registers, I'm just playing with the stack, but it doesn't work. That makes me think arguments are actually passed on the stack. But why? Am I misunderstanding the SO answer I linked to? Aren't args supposed to be in the registers when a syscall is called?
Extra question: Why does using jmp eax work, but call eax doesn't? (This applies to both the first and the second example code.)
Edit: I'm sorry, I got the comments in the ASM code slightly wrong. What I'm jmp-ing to is the address of the real syscall function.
Edit 2: I think it's obvious, but I'll explain it anyway in case somebody doesn't understand what I'm doing. I'm allocating a small executable chunk of memory, filling it with the opcode I'm showing, and then making a given syscall (let's say __NR_read) point to the address of that executable chunk of memory.
works just perfect == system keeps running without problems. It means the real syscall is being called from the fake syscall
it doesn't work == system crashes because the fake syscall isn't calling the real syscall
Upvotes: 3
Views: 786
Reputation: 3690
Adding example code to user708549's answer above.
Consider this minimal, naive C function that takes a file descriptor (int) and reads some number of bytes from the file (40 in this example).
// funcToCallReadSystemCall.c file's content
#include<unistd.h>
void callReadSystemCall(const int fileDescriptor)
{
int sz = 0;
char buff[64] = {0};
sz = read(fileDescriptor, buff, 40);
}
The assembly code (abbreviated) for this function will look like this:
funcToCallReadSystemCall.c:6: int sz = 0;
001a C745AC00 movl $0, -84(%rbp) #, sz
000000
funcToCallReadSystemCall.c:7: char buff[64] = {0};
0021 48C745B0 movq $0, -80(%rbp) #, buff
00000000
0029 48C745B8 movq $0, -72(%rbp) #, buff
00000000
0031 48C745C0 movq $0, -64(%rbp) #, buff
00000000
0039 48C745C8 movq $0, -56(%rbp) #, buff
00000000
0041 48C745D0 movq $0, -48(%rbp) #, buff
00000000
0049 48C745D8 movq $0, -40(%rbp) #, buff
00000000
0051 48C745E0 movq $0, -32(%rbp) #, buff
00000000
0059 48C745E8 movq $0, -24(%rbp) #, buff
00000000
funcToCallReadSystemCall.c:8: sz = read(fileDescriptor, buff, 40);
0061 488D4DB0 leaq -80(%rbp), %rcx #, tmp88
0065 8B459C movl -100(%rbp), %eax # fileDescriptor, tmp89
0068 BA280000 movl $40, %edx #,
00
006d 4889CE movq %rcx, %rsi # tmp88,
0070 89C7 movl %eax, %edi # tmp89,
0072 E8000000 call read@PLT #
00
0077 8945AC movl %eax, -84(%rbp) # _1, sz
Notes about the above assembly code:
movl $0, -84(%rbp) #, sz – this line sets sz to zero. Note that sz lives 84 bytes below the frame base pointer (rbp).
movq $0, -80(%rbp) #, buff – this line and the seven similar ones after it set the buff buffer of chars to zero. The buff char array starts 80 bytes below the frame base pointer and "lasts" for 64 bytes.
leaq -80(%rbp), %rcx #, tmp88 – this line loads the address (location) of the buff array into the rcx register.
movl -100(%rbp), %eax # fileDescriptor, tmp89 – this line copies the value of the function's argument fileDescriptor into the eax register.
movl $40, %edx #, – this line copies (moves) the value 40 into the edx register.
movq %rcx, %rsi # tmp88, – this line copies (moves) the content of the rcx register into the rsi register.
movl %eax, %edi # tmp89, – this line copies (moves) the value of the eax register into the edi register. Recall that eax was set earlier to the file descriptor number.
At this point, the registers hold the following values:
edi (rdi): the file descriptor value (integer)
rsi: the pointer to the buffer into which the read will be done
edx (rdx): the number of bytes to read.
-> These are the first three argument registers of the x86_64 function calling convention (System V AMD64 ABI: rdi, rsi, rdx, rcx, r8, r9), which is what this call into libc uses. The x86_64 Linux system call convention is almost the same: up to six arguments to a system call are passed in rdi, rsi, rdx, r10, r8, r9, in this order (r10 takes the place of rcx).
call read@PLT # – this instruction calls the read() function through the Procedure Linkage Table (PLT). The PLT is a table of function stubs that are used to call functions in shared libraries. The read() reached this way is the libc wrapper, which in turn issues the actual system call.
movl %eax, -84(%rbp) # _1, sz – this line copies (moves) the value from the eax register into the sz location. This is because on x86_64 the return value comes back in the rax register, both for function calls and for system calls; since sz is a 32-bit int, only the lower half of rax, eax, is stored.
Note that the reason the arguments are passed in (dedicated) registers is that a system call transitions execution from user mode into kernel mode, and the kernel-mode stack is NOT contiguous with the user-mode stack.
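To see the same three arguments without going through the read() wrapper, here is a small sketch that uses glibc's generic syscall() function, which loads the syscall number and arguments into the registers the kernel expects. It assumes Linux with glibc; the helper name `raw_read` is made up for this example.

```c
#define _GNU_SOURCE
#include <sys/syscall.h> /* SYS_read */
#include <unistd.h>      /* syscall() */

/*
 * Hypothetical helper: issues the read system call directly.
 * On x86_64, syscall() places fd, buf and count in rdi, rsi and rdx
 * and returns whatever the kernel left in rax (or -1 with errno set).
 */
static long raw_read(int fd, void *buf, unsigned long count)
{
    return syscall(SYS_read, fd, buf, count);
}
```

Calling raw_read(fileDescriptor, buff, 40) in place of read(fileDescriptor, buff, 40) in the function above should behave the same way, since the wrapper ends up issuing this same system call.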
The command I used to generate the above assembly code was:
gcc -g -O0 -c -fverbose-asm -Wa,-adhln funcToCallReadSystemCall.c > funcToCallReadSystemCall.lst
Here funcToCallReadSystemCall.c is the source file that contains the above C function, and the output is written to the funcToCallReadSystemCall.lst (text) file.
Upvotes: 1
Reputation:
Syscall params are first passed from userspace via registers to the system_call() function, which is in essence a common syscall dispatcher. However, system_call() then calls the real system call functions, such as sys_read(), in the usual manner, passing parameters via the stack. Therefore, messing with the stack leads to a crash.
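To make the stack argument concrete, here is a toy user-space model (not kernel code) of what the callee sees. It assumes the 32-bit layout described above, where on entry the real syscall function expects the return address at [esp] and its first argument at [esp+4]. All names in it (toy_stack, dispatcher_call and so on) are invented for the sketch.

```c
#include <stdint.h>

#define STACK_WORDS 64 /* toy stack of 32-bit words, growing downward */

struct toy_stack {
    uint32_t mem[STACK_WORDS];
    int sp; /* index of the word esp points at */
};

static void push(struct toy_stack *s, uint32_t v) { s->mem[--s->sp] = v; }

/* The callee reads its first argument one word above the return address. */
static uint32_t callee_first_arg(const struct toy_stack *s)
{
    return s->mem[s->sp + 1];
}

/* Models the dispatcher: args pushed right to left, then a return address. */
static void dispatcher_call(struct toy_stack *s, uint32_t a1, uint32_t a2,
                            uint32_t a3)
{
    s->sp = STACK_WORDS;
    push(s, a3);
    push(s, a2);
    push(s, a1);
    push(s, 0xDEADBEEFu); /* stand-in return address */
}

/* First stub: mov eax, addr; jmp eax. The stack is left untouched. */
static uint32_t via_plain_jmp(uint32_t fd)
{
    struct toy_stack s = {0};
    dispatcher_call(&s, fd, 0x1000, 40);
    return callee_first_arg(&s);
}

/* Second stub: push ebp; sub esp, 32; then jmp. esp has moved 36 bytes. */
static uint32_t via_prologue_then_jmp(uint32_t fd)
{
    struct toy_stack s = {0};
    dispatcher_call(&s, fd, 0x1000, 40);
    push(&s, 0xBBBBBBBBu); /* push ebp (stand-in value) */
    s.sp -= 8;             /* sub esp, 32 == 8 words */
    return callee_first_arg(&s);
}

/* call eax instead of jmp eax: one extra return address is pushed. */
static uint32_t via_call(uint32_t fd)
{
    struct toy_stack s = {0};
    dispatcher_call(&s, fd, 0x1000, 40);
    push(&s, 0xCAFEF00Du); /* return address into the stub */
    return callee_first_arg(&s);
}
```

In this model only via_plain_jmp() hands the callee the real file descriptor. With the prologue the callee reads an unrelated stack slot, and with call it reads the dispatcher's return address as its first argument. Under the same assumptions, that would also explain why jmp eax works while call eax doesn't, and the epilogue after jmp eax in the second stub is never reached to undo the stack changes.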
Also, see this SO answer: https://stackoverflow.com/a/10459713 and a very detailed explanation on Quora: http://www.quora.com/Linux-Kernel/What-does-asmlinkage-mean-in-the-definition-of-system-calls#step=6 (registration required).
Upvotes: 1