Playing with syscall table from LKM

Question

I'm overriding SYS_READ in the syscall table on Linux (3.x), but I run into trouble when unloading the module. When the module loads, it finds the syscall table, makes it writable, and replaces SYS_READ with my own function (which does nothing except call the original SYS_READ). I wait a few moments and then unload the module. In the module's unload method I restore the original SYS_READ in the syscall table and set the table back to read-only.

The original SYS_READ function is restored properly, but I get this when I unload the module: http://pastebin.com/JyYpqYgL

What am I missing? Should I be doing something more after restoring the real SYS_READ?

EDIT: GitHub link to the project: https://github.com/alexandernst/procmon

EDIT:

This is how I get the syscall table address:

void **sys_call_table;

struct idt_descriptor{
    unsigned short offset_low;
    unsigned short selector;
    unsigned char zero;
    unsigned char type_flags;
    unsigned short offset_high;
} __attribute__ ((packed));


struct idtr{
    unsigned short limit;
    void *base;
} __attribute__ ((packed));


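void *get_sys_call_table(void){
    struct idtr idtr;
    struct idt_descriptor idtd;
    void *system_call;
    unsigned char *ptr;
    int i;

    /* Read the IDT register, then copy the gate for vector 0x80 (int 0x80). */
    asm volatile("sidt %0" : "=m" (idtr));
    memcpy(&idtd, idtr.base + 0x80 * sizeof(idtd), sizeof(idtd));
    /* Rebuild the handler address from the two halves stored in the gate. */
    system_call = (void*)((idtd.offset_high<<16) | idtd.offset_low);
    /* Scan the first 500 bytes of the handler for the bytes ff 14 85
       (an indirect call through a table indexed by the syscall number);
       the pointer that follows them is taken as the table address. */
    for(ptr=system_call, i=0; i<500; i++){
        if(ptr[0] == 0xff && ptr[1] == 0x14 && ptr[2] == 0x85)
            return *((void**)(ptr+3));
        ptr++;
    }

    return NULL;
}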

sys_call_table = get_sys_call_table();

And this is how I set RW/RO:

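/* Clears CR0.WP (bit 16) so the kernel can write to read-only pages.
   Returns the previous CR0 value so it can be restored later. */
unsigned long set_rw_cr0(void){
    unsigned long cr0 = 0;
    unsigned long ret;
    asm volatile("movq %%cr0, %%rax" : "=a"(cr0));
    ret = cr0;
    cr0 &= 0xfffffffffffeffff; /* same as ~(1UL << 16) */
    asm volatile("movq %%rax, %%cr0" : : "a"(cr0));
    return ret;
}

/* Writes back the CR0 value saved by set_rw_cr0(). */
void set_ro_cr0(unsigned long val){
    asm volatile("movq %%rax, %%cr0" : : "a"(val));
}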

Finally, this is how I define my syscalls and change the syscall table:

asmlinkage ssize_t (*real_sys_read)(unsigned int fd, char __user *buf, size_t count);
asmlinkage ssize_t hooked_sys_read(unsigned int fd, char __user *buf, size_t count);

//set my syscall
real_sys_read = (void *)sys_call_table[__NR_read];
sys_call_table[__NR_read] = (void *)hooked_sys_read;

//restore real syscall
sys_call_table[__NR_read] = (void *)real_sys_read;

Ilya Matveychikov · Accepted Answer

If you want to unload a module that intercepts system calls, be aware that some process may still be inside your system call handler when the module's code (its text segment) is removed from memory. That leads to a page fault: when the process returns from a kernel function that slept back into your code, the code no longer exists.

So a correct module unloading scheme must check for processes that may be sleeping in hooked system calls. Unloading is possible only when no process is sleeping in a syscall hook.

UPD

Please see the patch, which supports my theory. It adds an atomic counter that is incremented when hooked_sys_read is entered and decremented when it returns. As I suspected, a process is still waiting in real_sys_read when your module is unloaded. The patch shows this with printk(read_counter): it prints 1 for me, which means some caller has not yet decremented read_counter.

http://pastebin.com/1yLBuMDY
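
The counter idea can be modelled in user space. The following is only a rough sketch, not the linked patch: fake_real_read, table_slot and unhook_and_wait are hypothetical names, the table is reduced to a single function pointer, and C11 atomics stand in for the kernel's atomic_t.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/types.h>

typedef ssize_t (*read_fn)(unsigned int fd, char *buf, size_t count);

/* Hypothetical stand-in for the original sys_read: just reports count. */
static ssize_t fake_real_read(unsigned int fd, char *buf, size_t count)
{
    (void)fd;
    (void)buf;
    return (ssize_t)count;
}

static read_fn real_read = fake_real_read;

/* Number of callers currently inside the hook. */
static atomic_int read_counter = 0;

static ssize_t hooked_read(unsigned int fd, char *buf, size_t count)
{
    ssize_t ret;

    atomic_fetch_add(&read_counter, 1);
    ret = real_read(fd, buf, count);   /* the real call may sleep here */
    atomic_fetch_sub(&read_counter, 1);
    return ret;
}

/* Assumption: one pointer models the sys_call_table[__NR_read] slot. */
static read_fn table_slot = hooked_read;

/* Model of the module exit path: restore first, then wait for stragglers. */
static void unhook_and_wait(void)
{
    table_slot = real_read;            /* no new callers enter the hook */
    while (atomic_load(&read_counter) != 0)
        sched_yield();                 /* let sleeping callers finish */
    assert(atomic_load(&read_counter) == 0);
}
```

In a real module the same shape would presumably use atomic_t with atomic_inc, atomic_dec and atomic_read, and the exit function would sleep between checks instead of spinning. Note that a caller can still be between the decrement and the return from the hook when the counter reads zero, so this narrows the window rather than closing it completely.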
