alexandernst

Reputation: 15109

Intercepting syscalls (where are args passed)

I'm writing a kernel module that intercepts syscalls. Intercepting, or rather just replacing the real syscall address with a fake syscall address, is easy in plain C. But I'd like to know how that works at a low level.

(let's pretend I'm on x86)

First of all, I'm doing just a basic test: I'm allocating a small chunk of executable memory and filling it with this machine code:

0xB8, 0x00, 0x00, 0x00, 0x00,          //mov eax, &real_syscall_function;
0xFF, 0xE0,                            //jmp eax;

Inserting the module and replacing the syscall works just perfect.

Now, according to this SO answer, arguments are passed in the registers. I want to check this, so I create an executable chunk of memory and fill it with this code:

0x55,                                  //push ebp;
0x89, 0xE5,                            //mov ebp, esp;
0x83, 0xEC, 0x20,                      //sub esp, 32; 

0xB8, 0x00, 0x00, 0x00, 0x00,          //mov eax, &real_syscall_function;
0xFF, 0xE0,                            //jmp eax;

0x89, 0xEC,                            //mov esp, ebp;
0x5D,                                  //pop ebp;
0xC3                                   //ret;

This should work too, as I'm not touching any of the registers, I'm just playing with the stack, but it doesn't work. That makes me think arguments are actually passed on the stack. But why? Am I misunderstanding the SO answer I linked to? Aren't args supposed to be in the registers when a syscall is called?

Extra question: why does jmp eax work, but call eax doesn't? (This applies to both the first and the second example.)

Edit: Sorry, the comments in the ASM code were incomplete. What I'm jmping to is the address of the real syscall function.

Edit 2: I think it's obvious, but anyways I'll explain it just in case somebody is not understanding what I'm doing. I'm allocating a small executable chunk of memory, filling it with the opcode I'm showing and then making a given syscall (let's say __NR_read) point to the address of that executable chunk of memory.
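The fill step described here can be sketched in plain C. This is only an illustration under assumptions, not the module's actual code: build_trampoline is a hypothetical helper name, target stands for the 32-bit address of the real syscall function, and allocating executable kernel memory and patching the syscall table are left out. It just shows where the address goes inside the mov eax, imm32 encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Size of the stub: mov eax, imm32 (5 bytes) + jmp eax (2 bytes). */
#define TRAMPOLINE_SIZE 7

/*
 * Write "mov eax, target; jmp eax" into buf and return the number of
 * bytes written. build_trampoline is a hypothetical helper name; target
 * is assumed to be the 32-bit address of the real syscall function.
 */
size_t build_trampoline(uint8_t *buf, size_t cap, uint32_t target)
{
    static const uint8_t stub[TRAMPOLINE_SIZE] = {
        0xB8, 0x00, 0x00, 0x00, 0x00, /* mov eax, imm32 (placeholder) */
        0xFF, 0xE0                    /* jmp eax */
    };

    assert(buf != NULL && cap >= TRAMPOLINE_SIZE);
    memcpy(buf, stub, sizeof stub);

    /* x86 stores immediates little-endian: lowest byte first. */
    buf[1] = (uint8_t)(target & 0xFF);
    buf[2] = (uint8_t)((target >> 8) & 0xFF);
    buf[3] = (uint8_t)((target >> 16) & 0xFF);
    buf[4] = (uint8_t)((target >> 24) & 0xFF);

    return sizeof stub;
}
```

Writing the bytes with shifts rather than a memcpy of the integer keeps the result independent of the byte order of the machine that builds the stub.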


works just perfect == system keeps running without problems. It means the real syscall is being called from the fake syscall

it doesn't work == system crashes because the fake syscall isn't calling the real syscall

Upvotes: 3

Views: 786

Answers (2)

Guy Avraham

Reputation: 3690

Adding example code to user708549's answer.

Consider this minimal, naive C function that takes a file descriptor (int) and reads some number of bytes (40 in this example) from it.

// funcToCallReadSystemCall.c file's content
#include<unistd.h>

void callReadSystemCall(const int fileDescriptor)
{
    int sz = 0;
    char buff[64] = {0};
    sz = read(fileDescriptor, buff, 40);
}

The (abbreviated) assembly code for this function looks like this:

funcToCallReadSystemCall.c:6:   int sz = 0;
001a C745AC00       movl    $0, -84(%rbp)   #, sz
  000000
funcToCallReadSystemCall.c:7:   char buff[64] = {0};
0021 48C745B0       movq    $0, -80(%rbp)   #, buff
  00000000 
0029 48C745B8       movq    $0, -72(%rbp)   #, buff
  00000000 
0031 48C745C0       movq    $0, -64(%rbp)   #, buff
  00000000 
0039 48C745C8       movq    $0, -56(%rbp)   #, buff
  00000000 
0041 48C745D0       movq    $0, -48(%rbp)   #, buff
  00000000 
0049 48C745D8       movq    $0, -40(%rbp)   #, buff
  00000000 
0051 48C745E0       movq    $0, -32(%rbp)   #, buff
  00000000 
0059 48C745E8       movq    $0, -24(%rbp)   #, buff
  00000000 
funcToCallReadSystemCall.c:8:   sz = read(fileDescriptor, buff, 40);
0061 488D4DB0       leaq    -80(%rbp), %rcx #, tmp88
0065 8B459C         movl    -100(%rbp), %eax    # fileDescriptor, tmp89
0068 BA280000       movl    $40, %edx   #,
  00
006d 4889CE         movq    %rcx, %rsi  # tmp88,
0070 89C7           movl    %eax, %edi  # tmp89,
0072 E8000000       call    read@PLT    #
  00
0077 8945AC         movl    %eax, -84(%rbp) # _1, sz

Notes about the above assembly code:

movl $0, -84(%rbp) #, sz – this line sets sz to zero. Note that sz is located 84 bytes below the frame pointer (rbp).

movq $0, -80(%rbp) #, buff – these lines zero the buff char array, which starts 80 bytes below the frame pointer (rbp) and spans 64 bytes.

leaq -80(%rbp), %rcx #, tmp88 – this line loads the address of the buff array into the rcx register.

movl -100(%rbp), %eax # fileDescriptor, tmp89 – this line copies the value of the function's argument fileDescriptor into the eax register.

movl $40, %edx #, – this line of code copies (moves) the value 40 into the edx register.

movq %rcx, %rsi # tmp88, – this line copies (moves) the contents of the rcx register into the rsi register.

movl %eax, %edi # tmp89, – this line copies (moves) the value of the eax register into the edi register. Recall that eax was set earlier to the file descriptor number.

At this point, the registers hold the following values:

edi (rdi): the file descriptor value (integer)

rsi: the pointer to the buffer into which the read will be done

edx (rdx): the number of bytes to read

For these three arguments this matches the x86_64 system call convention, which passes up to six arguments in the following registers: rdi, rsi, rdx, r10, r8, r9, in this order. (An ordinary function call, such as the call to read() here, uses rcx instead of r10 for the fourth argument.)

call read@PLT # – this line calls the read() function through the Procedure Linkage Table (PLT). The PLT is a table of function stubs that are used to call functions in shared libraries; here it leads to the libc read() wrapper, which is what actually executes the syscall instruction.

movl %eax, -84(%rbp) # _1, sz – this line copies (moves) the value from the eax register into sz. On x86_64 the return value comes back in rax, both for ordinary function calls and for system calls; eax is its low 32 bits, which is all an int needs.

Note that the reason the arguments are passed in (dedicated) registers is that this transition takes execution from user mode into kernel mode, and the kernel-mode stack is NOT a continuation of the user-mode stack.

The command I used to generate the above assembly code was:

gcc -g -O0 -c -fverbose-asm -Wa,-adhln funcToCallReadSystemCall.c > funcToCallReadSystemCall.lst

Where funcToCallReadSystemCall.c is the source code that contains the above C function and the output will be written into the funcToCallReadSystemCall.lst (text) file.
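To exercise the same system call without going through the libc read() wrapper, here is a minimal sketch, assuming Linux with glibc. It uses the generic syscall() function, which loads the system call number and the arguments into the registers described above before entering the kernel. raw_read and pipe_roundtrip are hypothetical helper names introduced only for this illustration.

```c
#define _GNU_SOURCE /* makes sure glibc declares syscall() */
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Issue the read system call directly, bypassing the libc read() wrapper.
 * raw_read is a hypothetical helper name. */
long raw_read(int fd, void *buf, size_t count)
{
    assert(buf != NULL);
    return syscall(SYS_read, fd, buf, count);
}

/* Push msg through a pipe and read it back with raw_read.
 * Returns the number of bytes read, or -1 on error. */
long pipe_roundtrip(const char *msg, char *out, size_t cap)
{
    int fds[2];
    size_t len = strlen(msg);
    long got;

    if (len > cap || pipe(fds) != 0)
        return -1;

    if (write(fds[1], msg, len) != (ssize_t)len) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    got = raw_read(fds[0], out, len);
    close(fds[0]);
    close(fds[1]);
    return got;
}
```

Calling pipe_roundtrip("hello", out, sizeof out) with a 16-byte out buffer should hand back the five bytes that were written, the same result read() itself would give.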

Upvotes: 1

user708549

Reputation:

Syscall params are first passed from userspace via registers to the system_call() function, which is in essence a common syscall dispatcher. However, system_call() then calls the real system call functions such as sys_read() in the usual manner, passing parameters via the stack. Therefore, messing with the stack leads to a crash. Also, see this SO answer: https://stackoverflow.com/a/10459713 and a very detailed explanation on Quora: http://www.quora.com/Linux-Kernel/What-does-asmlinkage-mean-in-the-definition-of-system-calls#step=6 (registration required).
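A toy model in C may make this concrete. It is a sketch under assumptions, not kernel code: it assumes 32-bit x86 where the handler finds argument n at [esp + 4 + 4*n] on entry, and every name in it (toy_stack, toy_dispatch, toy_arg, toy_prologue, toy_call) is hypothetical. It shows that both the push ebp / sub esp, 32 prologue and a call eax move esp before the real handler runs, so the handler looks for its arguments in the wrong slots.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_WORDS 64

/* Toy 32-bit stack: esp is an index into words[], and the stack grows
 * toward lower indices, as on x86. */
struct toy_stack {
    uint32_t words[STACK_WORDS];
    int esp;
};

void toy_init(struct toy_stack *s)
{
    memset(s->words, 0, sizeof s->words);
    s->esp = STACK_WORDS;
}

void toy_push(struct toy_stack *s, uint32_t value)
{
    assert(s->esp > 0);
    s->words[--s->esp] = value;
}

/* The dispatcher in this model: arguments end up on the stack right to
 * left, then the call to the handler pushes a return address. */
void toy_dispatch(struct toy_stack *s, uint32_t fd, uint32_t buf, uint32_t count)
{
    toy_push(s, count);
    toy_push(s, buf);
    toy_push(s, fd);
    toy_push(s, 0xDEADBEEFu); /* return address into the dispatcher */
}

/* A handler taking stack arguments reads argument n at [esp + 4 + 4*n]. */
uint32_t toy_arg(const struct toy_stack *s, int n)
{
    return s->words[s->esp + 1 + n];
}

/* push ebp; mov ebp, esp; sub esp, 32: esp moves down by 1 + 8 words. */
void toy_prologue(struct toy_stack *s)
{
    toy_push(s, 0); /* saved ebp (its value does not matter here) */
    assert(s->esp >= 32 / 4);
    s->esp -= 32 / 4;
}

/* call eax pushes one extra return address before jumping. */
void toy_call(struct toy_stack *s)
{
    toy_push(s, 0xCAFEF00Du); /* return address into the stub */
}
```

In this model, after toy_dispatch alone toy_arg(&s, 0) is the file descriptor, which corresponds to the plain jmp eax stub. After toy_prologue it reads a word from the freshly reserved area instead, and after toy_call the first "argument" is the dispatcher's return address while the real fd has slid into the second slot.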

Upvotes: 1
