aCuria

Reputation: 7205

Compiler flags change code behavior (O2, Ox)

The following code works as expected with the flags /Od and /O1 but fails with /O2 and /Ox. Any ideas why?

edit: by "fails" I mean that the function does nothing, and seems to just return.

void thread_sleep()
{
    listIterator nextThread = getNextThread();
    void * pStack = 0;
    struct ProcessControlBlock * currPcb = pPCBs->getData(currentThread);
    struct ProcessControlBlock * nextPcb = pPCBs->getData(nextThread);

    if(currentThread == nextThread)
    {
        return;
    }
    else
    {
        currentThread = nextThread;
        __asm pushad            // push general purpose registers
        __asm pushfd            // push the EFLAGS register
        __asm mov pStack, esp   // store stack pointer in temporary

        currPcb->pStack = pStack;   // store current stack pointer in pcb
        pStack = nextPcb->pStack;   // grab new stack pointer from pcb

        if(nextPcb->state == RUNNING_STATE)// only pop if function was running before
        {
            __asm mov esp, pStack       // restore new stack pointer
            __asm popfd
            __asm popad;
        }
        else
        {
            __asm mov esp, pStack       // restore new stack pointer
            startThread(currentThread);
        }
    }
}

// After implementing suggestions: (still does not work)

listIterator nextThread = getNextThread();
struct ProcessControlBlock * currPcb = pPCBs->getData(currentThread);
struct ProcessControlBlock * nextPcb = pPCBs->getData(nextThread);
void * pStack = 0;
void * pNewStack = nextPcb->pStack; // grab new stack pointer from pcb
pgVoid2 = nextPcb->pStack;

if(currentThread == nextThread)
{
    return;
}
else
{
    lastThread = currentThread; // global var
    currentThread = nextThread;


    if(nextPcb->state == RUNNING_STATE)// only pop if function was running before
    {
        __asm pushad                // push general purpose registers
        __asm pushfd                // push the EFLAGS register
        __asm mov pgVoid1, esp      // store stack pointer in temporary
        __asm mov esp, pgVoid2      // restore new stack pointer
        __asm popfd
        __asm popad;

        {
            struct ProcessControlBlock * pcb = pPCBs->getData(lastThread);
            pcb->pStack = pgVoid1; // store old stack pointer in pcb
        }
    }
    else
    {
        __asm pushad                // push general purpose registers
        __asm pushfd                // push the EFLAGS register
        __asm mov pgVoid1, esp  // store stack pointer in temporary
        __asm mov esp, pgVoid2      // restore new stack pointer

        {
            struct ProcessControlBlock * pcb = pPCBs->getData(lastThread);
            pcb->pStack = pgVoid1; // store old stack pointer in pcb
        }
        startThread(currentThread);
    }
}

Upvotes: 2

Views: 372

Answers (2)

exa

Reputation: 940

Since you are using inline assembler, it is worth checking how (or whether) the compiler actually changes the generated code at the different optimization levels. Try disassembling your binary:

objdump -d your_program

The output is long, but the relevant part should not be hard to find: search for your inline assembly instructions or for the function name. If you are on the Microsoft toolchain rather than GNU binutils, dumpbin /disasm should give a comparable listing.

By the way, I was taught that heavy optimization does not mix well with inline assembly, so I tend to move assembler routines into separate .S files.

Upvotes: 2

caf

Reputation: 239041

This is likely because, at the higher optimisation levels, your compiler omits the dedicated frame pointer register (EBP on x86), which frees it up as an additional general-purpose register.

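This means that the compiler accesses the local variable pStack using an offset from the stack pointer. It cannot do this correctly after the stack pointer has been adjusted by the pushad and pushfd, because it is not expecting the stack pointer to change. Together those two instructions move ESP down by 36 bytes (eight 4-byte registers plus EFLAGS), so every ESP-relative offset the compiler computed now points at the wrong slot.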
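
To make the offset problem concrete, here is a toy model in plain C. It is only a sketch: the stack is an ordinary array, an index stands in for ESP, and every name in it (toy_stack, toy_push, local_store_hits_its_slot, the offset of 2 words) is made up for illustration and does not come from the question's code.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_WORDS = 64, PUSHAD_WORDS = 8, PUSHFD_WORDS = 1 };

/* Toy model of a 32-bit stack: an array of words plus an index that plays
   the role of ESP. The stack grows toward lower indices, as on x86. */
struct toy_stack {
    uint32_t words[STACK_WORDS];
    int sp; /* index of the current top of stack */
};

static void toy_push(struct toy_stack *s, uint32_t value)
{
    assert(s->sp > 0);
    s->words[--s->sp] = value;
}

/* Returns 1 if a store through the compiler's fixed sp-relative offset still
   lands in the slot reserved for the local after `pushed_words` pushes that
   the compiler does not know about, and 0 if it lands somewhere else. */
int local_store_hits_its_slot(int pushed_words)
{
    struct toy_stack s = { {0}, STACK_WORDS - 4 };
    const int local_offset = 2;                 /* hypothetical: local is at [sp + 2] */
    const int local_slot = s.sp + local_offset; /* where the local really lives */

    for (int i = 0; i < pushed_words; i++)
        toy_push(&s, 0xAAAAAAAAu);              /* stand-in for a saved register */

    /* The compiler still emits "store to [sp + local_offset]". */
    s.words[s.sp + local_offset] = 0x12345678u;

    return s.words[local_slot] == 0x12345678u;
}
```

Calling local_store_hits_its_slot(0) models the situation the compiler expects, and the store reaches the local. Calling it with PUSHAD_WORDS + PUSHFD_WORDS models the state after pushad and pushfd: the same offset now addresses one of the freshly pushed words, and the local is never written.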

To get around this, you shouldn't put any C code after those asm statements, until the stack pointer has been correctly restored: everything from the first pushad to the popad or startThread() should be in assembler. This way, you can load the address of the local variables and ensure that the accesses are done correctly.
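
If the goal is mainly to keep C code away from the moment the stack pointer changes, one alternative is to let a library routine do the whole save-and-switch. The sketch below uses the POSIX ucontext functions (getcontext, makecontext, swapcontext), which is a different technique from the hand-written pushad/popad switch in the question and is not available with the Microsoft compiler; the Windows fiber API (CreateFiber, SwitchToFiber) plays a similar role there. The names worker, run_switch_demo, record and trace_at are invented for this example.

```c
#include <assert.h>
#include <ucontext.h>

enum { WORKER_STACK_BYTES = 64 * 1024 };

static ucontext_t main_ctx;    /* saved state of the caller */
static ucontext_t worker_ctx;  /* saved state of the second "thread" */
static char worker_stack[WORKER_STACK_BYTES];
static int trace[4];           /* records the order in which code ran */
static int trace_len;

static void record(int id) { trace[trace_len++] = id; }

/* Body of the second thread: runs, yields back, then runs to completion. */
static void worker(void)
{
    record(2);
    swapcontext(&worker_ctx, &main_ctx);  /* plays the role of thread_sleep() */
    record(4);
    /* Returning from here resumes uc_link, which is main_ctx. */
}

/* Switches into the worker twice and returns the number of recorded steps. */
int run_switch_demo(void)
{
    trace_len = 0;

    int rc = getcontext(&worker_ctx);
    assert(rc == 0);
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = sizeof worker_stack;
    worker_ctx.uc_link = &main_ctx;       /* where to go when worker() returns */
    makecontext(&worker_ctx, worker, 0);

    record(1);
    swapcontext(&main_ctx, &worker_ctx);  /* start the worker on its own stack */
    record(3);
    swapcontext(&main_ctx, &worker_ctx);  /* resume the worker where it yielded */
    return trace_len;
}

/* Accessor so callers can inspect the recorded order. */
int trace_at(int i) { return trace[i]; }
```

After run_switch_demo() returns, the recorded steps should read 1, 2, 3, 4: the caller and the worker alternate, and no C statement in either function ever runs while the stack pointer is in a half-switched state, because each switch happens entirely inside swapcontext.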

Upvotes: 3
