Tags: windows, assembly, x86-64, cpu-architecture, memory-barriers

Thanuja Dilhan

Reputation: 137

Program counter, fences and processor re-ordering

I understand that instructions can be re-ordered by the processor in addition to compilers.

I have a few questions that I cannot get my head around.

Say we have three instructions:

Program order

S1 S2 S3

After re-ordering by the processor, order becomes (for whatever reason):

S3 S2 S1

1. When the processor executes S1 (in program order), what would be the value of the program counter?
2. If Windows (or another OS) context-switches the thread out and schedules it on another processor, how would the other processor know which instruction to execute next? (Is it guaranteed to make the same re-orderings?)
3. Is a memory fence (for example, a full fence created by an atomic compare-and-swap instruction) on one processor still valid after the thread is scheduled on another processor?

Any ideas on this are highly appreciated.

Upvotes: 2

Answers (2)

Peter Cordes

Reputation: 365727

Unlike static compile-time reordering, out-of-order exec preserves the illusion of running instructions in program order, including as seen by an interrupt handler. Current CPUs don't rename the privilege level, so they generally roll back to a consistent state as part of taking an exception or interrupt, rather than keeping un-executed instructions in flight. (See "When an interrupt occurs, what happens to instructions in the pipeline?")

This also means that interrupts are delivered strictly between instructions, not in the middle of one (see "Interrupting an assembly instruction while it is operating"). The exceptions are "interruptible" instructions like rep movsb, which logically work as multiple instructions, and vpgatherdd, which has documented semantics for a page fault in one of the gather operands.

Memory ordering as observed by other cores is another matter, and can differ from program order even on an in-order CPU. (See "Can a speculatively executed CPU branch contain opcodes that access RAM?")

The kernel code for a context switch needs to include a strong enough barrier for a thread to see its own stores in program order when it resumes on another core. Generally, release/acquire synchronization is sufficient (and you already need something like that for the kernel on the other core to restore register values). On x86 it may also need an sfence so that this applies even to NT (non-temporal) stores.

Upvotes: 2

prl

Reputation: 12455

1. There is an instruction pointer associated with each instruction.
2. Although instructions may be executed out of order, they always complete in order. When an interrupt or fault occurs, all instructions preceding the saved IP address have been completed. The results of any subsequent instructions are discarded. When execution resumes, it starts at the saved address.
3. The steps taken by the OS to schedule a thread on another processor include fencing operations on both processors, so when the thread resumes on the new processor, all preceding operations are fully fenced (whether or not any explicit fences exist in the code of the thread).

Upvotes: 6
