user25873577
user25873577

Reputation: 41

Task switch during fork() causes random stack errors

I'm working on multitasking for my OS, more specifically fork() for usermode ELF-binaries (they call it by a syscall).

After calling fork, it manages to return to the parent process (sometimes) but on each PIT tick the code tries to switch to the next process, which crashes. It appears the crash is related to some stack corruption, but I genuinely have spent 3 days trying to figure it out.

Here's the fork function:

// fork() - Fork the current process and creates a child process. Will return the PID of the child to the parent, and 0 to the child.
pid_t fork() {
    // We don't need any interruptions
    disableHardwareInterrupts();

    serialPrintf("Beginning fork\n");

    uintptr_t sp, bp;
    process_t *parent = currentProcess;
    
    // Clone the page directory and set it up
    pagedirectory_t *directory = cloneKernelSpace2(parent->thread.page_directory);

    // Create a new process and setup its thread page directory
    process_t *new_proc = spawn_process(parent, 0);
    new_proc->thread.page_directory = directory;
    new_proc->thread.refcount = 1;
    new_proc->thread.pd_lock = spinlock_init();
 
    // Create system call registers
    registers_t r;
    memcpy(&r, parent->syscall_registers, sizeof(registers_t));

    // Setup SP and BP
    sp = new_proc->image.stack;
    bp = sp;

    serialPrintf("fork() pushing system call registers to address 0x%x\n", sp);

    // Setup return value in syscall registers
    r.eax = 0;

    // Push it onto the stack, manually because PUSH() doesn't seem to work right now.
    sp -= sizeof(registers_t); // Stack grows downwards
    memcpy(sp, &r, sizeof(registers_t));
    

    // Setup registers and context
    new_proc->syscall_registers = &r;
    new_proc->thread.context.sp = sp;
    new_proc->thread.context.bp = bp;
    new_proc->thread.context.tls_base = parent->thread.context.tls_base;

    new_proc->thread.context.ip = (uint32_t)&resume_usermode;

    // Validate syscall registers because this code is stupid

    new_proc->isChild = true;
    parent->isChild = false;

    serialPrintf("fork: Thread forked. IP=0x%x.\n", new_proc->thread.context.ip);

    // Save the context to the parent, but ONLY copy the saved part of the context over.
    save_context(&parent->thread.context);
    memcpy(new_proc->thread.context.saved, parent->thread.context.saved, sizeof(parent->thread.context.saved));

    serialPrintf("parent SP=0x%x BP=0x%x IP=0x%x\n", parent->thread.context.sp, parent->thread.context.bp, parent->thread.context.ip);

    if (parent->flags & PROCESS_FLAG_IS_TASKLET) new_proc->flags |= PROCESS_FLAG_IS_TASKLET;
    makeProcessReady(new_proc);

    enableHardwareInterrupts();
    return new_proc->id;
}

The important parts that are stack-related are where it pushes the currentProcess (parent process') system call registers (which are set by my syscall handler) onto the new process' stack (allocated in spawn_process like kmalloc(KSTACK_SIZE) + KSTACK_SIZE because stack grows downward)

The registers seem to be restored, but then a random error (commonly a page fault) appears after the switch.

Here is process_switchNext, which gets the next process and jumps to it. This is not called directly, it is called by the next function

void process_switchNext() {
    previousProcess = currentProcess;
    updateProcessTimes();

    // Get the next avialable process, discarding anything marked as finished
    do {
        currentProcess = process_getNextReadyProcess();
    } while (currentProcess->flags & PROCESS_FLAG_FINISHED);


    currentProcess->time_in = clock_getTimer();
    currentProcess->time_switch = currentProcess->time_in;



    // Restore paging and task switch context
    spinlock_lock(&switch_lock);
    vmm_switchDirectory(currentProcess->thread.page_directory);
    spinlock_release(&switch_lock);
    
    setKernelStack(currentProcess->image.stack);

    if (currentProcess->flags & PROCESS_FLAG_FINISHED) {
        panic("scheduler", "process_switchNext", "Process is marked finished, we should not have this process");
    }

    // Mark the process as running or started
    __sync_or_and_fetch(&currentProcess->flags, PROCESS_FLAG_STARTED);

    asm volatile ("" ::: "memory");
    
    uint32_t esp = currentProcess->thread.context.sp;
    uint32_t eip = currentProcess->thread.context.ip;
    uint32_t ebp = currentProcess->thread.context.bp;

    serialPrintf("-- ESP=0x%x EIP=0x%x EBP=0x%x isChild=%i\n", esp, eip, ebp, currentProcess->isChild);

    // Jump to it!
    asm volatile (
        "mov %0, %%ebx\n"
        "mov %1, %%esp\n"
        "mov %2, %%ebp\n"
        "jmp %%ebx"
        :: "r"(eip), "r"(esp), "r"(ebp) : "%ebx", "%esp", "%eax"
    );
}

Here is the code that is called on each PIT tick (if the registers passed by ISR handler have usermode CS):

// process_switchTask(uint8_t reschedule) - Yields the current process and allows the next to run
void process_switchTask(uint8_t reschedule) {
    if (!currentProcess) return; // Scheduler disabled

    if (currentProcess == idleTask) {
        panic("scheduler", "process_switchTask", "Context switch from idleTask triggered from somewhere other than pre-emption source.");
    }

    // If a process got to switchTask but was not marked as running, it must be exiting.
    if (!(currentProcess->flags & PROCESS_FLAG_RUNNING) || (currentProcess == idleTask)) {
        serialPrintf("we are going to just switch to the next process\n");
        process_switchNext();
        return;
    }

    // Save the FPU registers (TODO: move this to fpu.c)
    asm volatile ("fxsave (%0)" :: "r"(&currentProcess->thread.fp_regs));
    
    if (save_context(&currentProcess->thread.context) == 1) {
        // THIS CODE WILL NOT BE CALLED.
        // PLEASE SEE PIT.C FOR THE REAL WAY WE DO THIS
        panic("??", "??", "how the actual hell");
        asm volatile ("fxrstor (%0)" :: "r"(&currentProcess->thread.fp_regs));
        return;
    } 


    serialPrintf("process_switchTask(%i) called - SP=0x%x BP=0x%x TLSBASE=0x%x IP=0x%x isChild=%i\n", reschedule, currentProcess->thread.context.sp, currentProcess->thread.context.bp, currentProcess->thread.context.tls_base, currentProcess->thread.context.ip, currentProcess->isChild);
    

    // If this is a normal yield, we nede to reschedule.
    if (reschedule) {
        makeProcessReady(currentProcess);
    }



    // switch_next() will not return
    process_switchNext();
} 

save_context manipulates the structure like so:

save_context:
    mov ecx, [esp + 4]
    lea eax, [esp + 8]
    mov dword [ecx + 0], eax
    mov dword [ecx + 4], ebp
    xor eax, eax
    mov dword [ecx + 8], eax ; No one likes TLS_BASE (TODO: You can actually do this call with something like a rdmsr)
    mov eax, esp
    mov dword [ecx + 12], eax

    ; Basically, besides setting the above fields, we also need to do a setjmp-type thing.
    ; Stuff the saved[5] structure with the following:
    ; EBX, EDI, ESI, <reserved>, <reserved>
    mov dword [ecx + 16], ebx
    mov dword [ecx + 20], edi
    mov dword [ecx + 24], esi
    mov dword [ecx + 28], ebp 
    mov dword [ecx + 32], esp

    xor eax, eax
    ret

I stepped through save_context in GDB and made sure that it was saving what I thought it was saving correctly (not entirely sure about the lea though).

I also ran a test where I typecast a registers_t* object onto the stack before jumping to EIP, and the registers seemed fine (CS = 0x23, EAX and EBX were fine).

The test program is supposed to fork its process, call a random dummy system call 3 times to make sure that it is working properly, then check if the return value was 0, and if so, halt the system with an "OK" message after.

The program seems to try to execute, and while it appears to be random I saw that it fails after a PIT tick. Here is some example output:

createProcess: stack = 0x1ae580
Test program started
syscall: Returning 2
Forking process
fork: Beginning fork
fork: pushing system call registers to address 0x609040
fork: Thread forked. IP=0x12156b.
fork: parent SP=0x1ae430 BP=0x1ae4a4 IP=0x1ae428
syscall: Returning 2
Fork done
syscall: Returning 2
process_switchTask(1) called - SP=0x1ae4b0 BP=0x1ae4e4 TLSBASE=0x0 IP=0x1ae4a8 isChild=0
-- switchNext() jumping to ESP=0x609000 EIP=0x12156b EBP=0x609040 isChild=1
===========================================================
panic() called! FATAL ERROR!
*** ISR threw exception: Invalid opcode
panic type: ISR Exception.

err_code 0
REGISTER DUMP:
eax=0x23, ebx=0x80480e6, ecx=0x8048174, edx=0x2
edi=0x9, esi=0xa, ebp=0x804810b, esp=0x609024
eip=0x8048211, cs=0x23, ss=0x202, eflags=0x80492b0, useresp=0x0

Edit: The bug cause seems to be related to the stack that I create for the process, both parent and child. I can't PUSH anything onto the stack, or I'm not checking that its even correctly on the stack in the first place.

I tried to pull from the stack in many ways, but it doesn't seem to work at all.

Upvotes: 0

Views: 56

Answers (1)

user25873577
user25873577

Reputation: 41

The fix was because I wasn't properly restoring context, and the code would never hop out after a task switch. I fixed the issues and the code works fine.

Specifically I forgot to perform mov eax, 1 before jumping to the IP, and besides cleaning up some other parts of the code, it all works fine now.

Upvotes: 1

Related Questions