Reputation: 11706
As I was digging through the original Xbox kernel's code, I noticed that when it sets up the registers for port I/O, it sometimes assigns a 32-bit value to edx, even though the in and out instructions only use dx (the low 16 bits of edx) for the port address. As an example:
mov edx, 0FFFF8004h
in ax, dx
or ax, 1
out dx, ax
add edx, 1Eh
in ax, dx
or ax, 2
out dx, ax
mov edx, 0FFFF8002h
...
Elsewhere (such as SMBus read and write), it's inconsistent: sometimes it assigns 16-bit values to dx, other times 32-bit values to edx.
If the upper 16 bits are never used, what's the point of specifying non-zero bits for them?
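To make the question concrete, here is a minimal sketch in Python that models which port each in/out above actually addresses. The helper name port_from_edx is hypothetical and only simulates the truncation to dx; it performs no real I/O.

```python
def port_from_edx(edx: int) -> int:
    """Return the 16-bit port address IN/OUT would use for a given EDX value."""
    return edx & 0xFFFF  # only DX, the low 16 bits of EDX, selects the port


# mov edx, 0FFFF8004h
edx = 0xFFFF8004
first_port = port_from_edx(edx)

# add edx, 1Eh (wraps at 32 bits, like the real register)
edx = (edx + 0x1E) & 0xFFFFFFFF
second_port = port_from_edx(edx)

# mov edx, 0FFFF8002h
third_port = port_from_edx(0xFFFF8002)

print(hex(first_port), hex(second_port), hex(third_port))
```

Under this model the upper half (0FFFFh) never influences the port that is addressed, which is what makes the 32-bit constants look redundant.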
Upvotes: 2
Views: 335
Reputation: 39581
My guess is that it's done as a micro-optimization to avoid a non-existent hazard and/or an insignificant performance penalty.
For example, the programmer may have originally written something like:
66| BA 8004 mov dx, 8004h
66| ED in ax, dx
66| 83 C8 01 or ax, 1
66| EF out dx, ax
66| 83 C2 1E add dx, 1Eh
He then decided to replace add dx with add edx in order to save a byte and eliminate the performance penalty for decoding the operand-size prefix:
66| BA 8004 mov dx, 8004h
66| ED in ax, dx
66| 83 C8 01 or ax, 1
66| EF out dx, ax
83 C2 1E add edx, 1Eh
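As a rough check that this substitution is harmless for port I/O, the sketch below models both forms of the add in Python. The function names are hypothetical, and the starting EDX values are arbitrary samples standing in for whatever the upper half held after the 16-bit mov.

```python
# Encodings taken from the listings above.
ADD_DX_1E = bytes([0x66, 0x83, 0xC2, 0x1E])   # add dx, 1Eh (with operand-size prefix)
ADD_EDX_1E = bytes([0x83, 0xC2, 0x1E])        # add edx, 1Eh


def add_dx(edx: int, imm: int) -> int:
    """Model a 16-bit add: DX wraps at 16 bits, upper half of EDX is preserved."""
    return (edx & 0xFFFF0000) | ((edx + imm) & 0xFFFF)


def add_edx(edx: int, imm: int) -> int:
    """Model a 32-bit add: a carry out of bit 15 propagates into the upper half."""
    return (edx + imm) & 0xFFFFFFFF


def dx(edx: int) -> int:
    """The part of EDX that IN/OUT actually use as the port address."""
    return edx & 0xFFFF


samples = [0x00008004, 0xFFFF8004, 0x1234FFF0, 0xFFFFFFFF]
same_port = all(dx(add_dx(v, 0x1E)) == dx(add_edx(v, 0x1E)) for v in samples)
bytes_saved = len(ADD_DX_1E) - len(ADD_EDX_1E)
print(same_port, bytes_saved)
```

The two forms can leave different values in the upper half of EDX (for example when DX wraps past 0FFFFh), but the low 16 bits always agree, so the port addressed is the same and the shorter encoding costs nothing functionally.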
Then he reads this in a contemporary Intel optimization manual:
Because Pentium II and Pentium III processors can execute code out of order, the instructions need not be immediately adjacent for the stall to occur. Example 2-7 also contains a partial stall.
Example 2-7 Partial Register Stall with Pentium II and Pentium III Processors
MOV AL, 8
MOV EDX, 0x40
MOV EDI, new_value
ADD EDX, EAX ; Partial stall accessing EAX
His own code now looks similar, so he avoids the partial register stall by replacing the 16-bit MOV instruction with the 32-bit one you see in your example. (In reality I don't think the ADD instruction would ever stall; the IN and OUT instructions should give the MOV instruction more than enough time to retire.)
And yes, these micro-optimizations would be pointless. Even if they do save a CPU cycle or two, the performance gain would be insignificant compared to the time it takes to execute the I/O instructions. But it wouldn't be at all surprising to see a Microsoft employee doing this. I've seen dumber things than this in Microsoft code, and during the 90's at least they seemed pretty obsessed with micro-optimizations.
The inconsistency you see is also not surprising. Microsoft would have had a number of different programmers working on the Xbox kernel, and could have easily included code from Windows or other projects.
Upvotes: 2