Reputation: 183
When switching from compatibility mode to 64-bit mode at the same privilege level by a far call, fields such as BASE or LIMIT in segment registers are ignored, and 64-bit pointer registers are all available. However, if the BASE field is ignored, the stack base address will become 0x0 implicitly, which means an implicit stack switch will be raised, am I right? And if my understanding is correct, how does the CPU switch back to the original stack when returning via the ret instruction?
Upvotes: 2
Views: 195
Reputation: 12455
The SDM does use the term "implicit stack switch" to describe this behavior, but only when the call is performed using a call gate.
... when using a call gate to perform a far call to a segment at the same privilege level, an implicit stack switch occurs as a result of entering 64-bit mode. The SS selector is unchanged, but stack segment accesses use a segment base of 0x0, the limit is ignored, and the default stack size is 64 bits. The full value of RSP is used for the offset, of which the upper 32 bits are undefined.
Although the SDM doesn't describe the behavior when the call is performed without using a call gate, the behavior I observe is the same. The return address (cs:rip) is pushed to the new stack (ignoring ss.base), and a far ret back to the 32-bit code works. And of course the 32-bit code continues to use ss.base as usual.
(In my test, the upper bits of RSP were 0 before entering compatibility mode, and remain 0 when performing the call back into 64-bit mode.)
Here are memory dumps after running a program that does the following:
ss.base = 0x80
esp = 8385f118
push 32323232
push esp
far call to 64-bit code (return address is 20:776bb1e2)
    push 64646464
    push rsp
    far call to 32-bit code (return address is 38:776bb208)
        push 30303030
        push esp
        add esp, 8
        far ret
    add rsp, 10
    far ret
push 31313131
push esp
8385f0f0: 00000000 00000000 8385f100 00000000
8385f100: 64646464 00000000 776bb1e2 00000020
8385f110: 00000000 00000000 00000000 00000000
8385f120: 00000000 00000000 00000000 00000000
8385f160: 00000000 00000000 8385f0ec 30303030
8385f170: 776bb208 00000038 00000000 00000000
8385f180: 00000000 00000000 8385f10c 31313131
8385f190: 8385f114 32323232 00000000 00000000
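To make the dump easier to follow, here is a rough Python replay of the sequence above. It is only a toy model, not a CPU emulator: the function name run_trace and its helpers are made up for illustration, and which base each far call pushes through (0 when entering 64-bit code, ss.base when calling back into 32-bit code) is hard-coded from what the dump shows rather than derived from the SDM.

```python
SS_BASE = 0x80            # SS.base used by the 32-bit code in the test
MASK32 = 0xFFFFFFFF


def run_trace():
    """Replay the push/call sequence; return (memory, final stack pointer)."""
    mem = {}              # linear address -> 32-bit word
    sp = 0x8385F118       # initial ESP from the test

    def push(value, size, base):
        # Decrement the stack pointer, then store `size` bytes at base + sp.
        nonlocal sp
        sp -= size
        for i in range(0, size, 4):
            mem[base + sp + i] = (value >> (8 * i)) & MASK32

    def far_call(cs, ip, base):
        # 32-bit operand size: CS is pushed first, then the 4-byte offset.
        push(cs, 4, base)
        push(ip, 4, base)

    # 32-bit code: stack accesses go through ss.base + esp.
    push(0x32323232, 4, SS_BASE)
    push(sp, 4, SS_BASE)                      # push esp stores the old ESP
    far_call(0x20, 0x776BB1E2, base=0)        # observed at 0 + rsp

    # 64-bit code: SS base is treated as 0.
    push(0x64646464, 8, 0)
    push(sp, 8, 0)                            # push rsp stores the old RSP
    far_call(0x38, 0x776BB208, base=SS_BASE)  # observed at ss.base + esp

    # 32-bit code again.
    push(0x30303030, 4, SS_BASE)
    push(sp, 4, SS_BASE)
    sp += 8                                   # add esp, 8
    sp += 8                                   # far ret pops EIP and CS

    # Back in 64-bit code.
    sp += 0x10                                # add rsp, 10 (hex)
    sp += 8                                   # far ret pops EIP and CS

    # Back in the original 32-bit code.
    push(0x31313131, 4, SS_BASE)
    push(sp, 4, SS_BASE)
    return mem, sp


if __name__ == "__main__":
    memory, final_sp = run_trace()
    for address in sorted(memory):
        print(f"{address:08x}: {memory[address]:08x}")
    print(f"final esp = {final_sp:08x}")
```

The nonzero words it produces line up with the dump: the words around 8385f0f8..8385f10c come from the 64-bit side (base 0), and the words around 8385f168..8385f194 come from the 32-bit side (base 0x80).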
Upvotes: 3
Reputation: 365517
Note that in x86 terminology, "stack switching" has a specific technical meaning involving loading a new SS:[ER]SP from the TSS, e.g. when user-space runs an int instruction. This is not what you're talking about, just that ss.base + esp might be different from 0 + rsp.
Yes, I think that's correct, in the unusual case where your 32-bit code had SS.base != 0.
All mainstream OSes use a flat memory model with CS/DS/ES/SS bases all zero (and limit=-1), so this is a non-issue.
BTW, Windows x64 does this user-space mode switch in practice in its DLLs, as part of its WoW64 (i.e. Windows (32-bit) on Windows64). Instead of using sysenter or whatever to call into the kernel directly, user-space makes a far call to 64-bit code that uses syscall.
That would be a lot less convenient if 32-bit code was running with a non-zero SS base. (I'm not sure whether the CS:EIP return address would be pushed at ss.base + ESP or at [RSP]; if the latter, it would still be possible for the 64-bit code to actually return if memory was allocated at rsp as well as ss.base + rsp.)
Upvotes: 2
Reputation: 37222
Will an implicit stack switch occur when switching from compatibility mode to 64-bit mode at the same privilege level?
No implicit stack switch will occur when switching from compatibility mode to 64-bit mode at the same privilege level, unless it's done by an interrupt using the IST mechanism.
However, if the BASE field is ignored, the stack base address will become 0x0 implicitly, which means an implicit stack switch will be raised, am I right?
For 80x86, a linear address is in general calculated by adding an offset within a segment to the base of that segment, where the base is stored in a "hidden" part of the segment register. For example, in 32-bit code, if you do mov eax,[esp] then the CPU calculates "linear_address = SS.base + ESP".
Because most operating systems use a "flat memory model" where segments are effectively disabled (by setting all segment bases to zero and all segment limits to "max"), CPUs optimize this specific case so that the addition is skipped when the segment base is known to be zero (e.g. for that mov eax,[esp] the CPU may cheat and do "linear_address = ESP" if it already knows SS.base is zero).
For 64-bit code, excluding FS and GS segment registers, the value in the "hidden" part of the segment register (used for segment base) is assumed to be zero regardless of whether it is or not; and the CPU knows the segment base is always "assumed to be zero" and always optimizes the address calculation (by not adding segment base).
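As a rough illustration of the two address calculations, here is a minimal Python sketch. The function linear_address and its mode strings are hypothetical names for this example only, and it leaves out limit checks, canonical-address checks and everything else a real CPU does.

```python
def linear_address(mode, segment, base, offset):
    """Toy model of how a segment base feeds into the linear address.

    mode    -- "64-bit" or anything else (treated as 32-bit/compatibility)
    segment -- segment register name, e.g. "SS" or "FS"
    base    -- value in the hidden base part of that segment register
    offset  -- effective address (e.g. the value of ESP or RSP)
    """
    if mode == "64-bit":
        # Only the FS and GS bases are applied; every other base is
        # assumed to be zero, whatever the hidden part really holds.
        effective_base = base if segment in ("FS", "GS") else 0
        return (effective_base + offset) & 0xFFFFFFFFFFFFFFFF
    # 32-bit code: base + offset, wrapped to 32 bits.
    return (base + offset) & 0xFFFFFFFF


if __name__ == "__main__":
    # Same stack pointer value, SS.base = 0x80 in the hidden part.
    print(hex(linear_address("compat", "SS", 0x80, 0x8385F110)))
    print(hex(linear_address("64-bit", "SS", 0x80, 0x8385F110)))
```

With a non-zero SS.base the two calls give different linear addresses for the same stack pointer value; with SS.base = 0 they give the same one.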
When switching from compatibility mode to 64-bit mode at the same privilege level, there are 2 possibilities:
a) The segment base for SS was zero in compatibility mode, and becomes "assumed zero" in 64-bit, and therefore the address of the stack doesn't change (see note).
b) The segment base for SS was non-zero in compatibility mode, and becomes "assumed zero" in 64-bit, and therefore the address of the stack does change.
The latter possibility (non-zero SS segment base in compatibility mode) would be something I'd strongly avoid; as it's horribly confusing for programmers, and has a "higher than normal" risk of quirks/errata (e.g. a future CPU doing things in a slightly different order and storing return information at "ss.base + esp" instead of at "0 + esp").
Note: In 64-bit code, anything that only updates the lower half of a register causes the upper half of the 64-bit register to be zeroed (e.g. loading the value 0x9ABCDEF0 into ESP will cause RSP to be set to 0x000000009ABCDEF0). I don't think this happens in compatibility mode (or at least, I don't think it's guaranteed to happen). This may potentially create a situation where "junk" left in the higher half of RSP before switching to compatibility mode is still present when you switch back to 64-bit (e.g. possibly causing the address of the stack to change from 0x9ABCDEF0 to 0x123456789ABCDEF0).
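The difference described in the note can be sketched in a few lines of Python. Both function names are hypothetical, and the second function models only the "junk is preserved" scenario that the note worries about, not documented behavior.

```python
MASK32 = 0xFFFFFFFF


def write32_zero_extend(old_rsp, value32):
    """64-bit mode: a write to ESP clears the upper half of RSP."""
    return value32 & MASK32


def write32_keep_upper(old_rsp, value32):
    """Hypothetical case from the note: the upper half of RSP survives."""
    return (old_rsp & ~MASK32) | (value32 & MASK32)


if __name__ == "__main__":
    junk_rsp = 0x1234567800000000   # junk left in the upper half of RSP
    print(hex(write32_zero_extend(junk_rsp, 0x9ABCDEF0)))
    print(hex(write32_keep_upper(junk_rsp, 0x9ABCDEF0)))
```

If the second case can happen, 64-bit code entered from compatibility mode would be safer clearing the upper half of RSP itself (for example with a 32-bit register move) before touching the stack.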
Upvotes: 3