zapshe

Reputation: 278

MOVZX and CWD - Are they interchangeable?

I have these two code snippets:

mov   ax, word [wNum2]
cwd
div   word [wNum3]
mov   [wAns16], dx

movzx eax, word [wNum2]
;cwd
div   word [wNum3]
mov   [wAns16], edx

The first produces the correct answer; the second gives an answer that is a hundred or so off unless I uncomment cwd.

I thought movzx would zero everything out for me, which would make cwd redundant. Have I completely misunderstood how they work? Can someone walk me through these code snippets?

Upvotes: 1

Views: 708

Answers (2)

zx485

Reputation: 29052

The bare result may or may not be equivalent; that depends on the value. The description of CWD states:

Doubles the size of the operand in register AX, EAX, or RAX (depending on the operand size) by means of sign extension and stores the result in registers DX:AX, EDX:EAX, or RDX:RAX, respectively. The CWD instruction copies the sign (bit 15) of the value in the AX register into every bit position in the DX register.

So if the value in AX is less than or equal to 32,767 (0x7FFF, the largest value with bit 15 clear), CWD produces the same value as both MOVZX (zero extend) and MOVSX (sign extend). If the value is greater than 32,767, bit 15 is set and CWD matches only MOVSX. Usually MOVZX is used in combination with DIV (unsigned division) and MOVSX in combination with IDIV (signed division).
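To make the value dependence concrete, here is a small NASM-style sketch. The constants 0x7FFF and 0x8000 and the scratch registers CX, EBX and ESI are picked purely for illustration; the register contents in the comments follow from the definitions of sign and zero extension.

```nasm
; Case 1: bit 15 clear, so sign and zero extension agree
mov   ax, 0x7FFF
cwd                     ; DX:AX = 0000:7FFF
mov   cx, 0x7FFF
movzx ebx, cx           ; EBX = 0x00007FFF
movsx esi, cx           ; ESI = 0x00007FFF

; Case 2: bit 15 set, so they differ
mov   ax, 0x8000
cwd                     ; DX:AX = FFFF:8000 (sign-extended)
mov   cx, 0x8000
movzx ebx, cx           ; EBX = 0x00008000 (zero-extended)
movsx esi, cx           ; ESI = 0xFFFF8000 (sign-extended, same value as DX:AX)
```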

But there remains the problem of where the result will be stored:
CWD stores the 32-bit result in two 16-bit registers DX:AX, while the MOV?X instructions store it in the 32-bit register EAX.

This has consequences on the following DIV instruction. The first part of your code uses the 32-bit value in DX:AX as input, while the second approach assumes EAX to be the input of a 16-bit DIV:

F7 /6   DIV r/m16   M   Valid   Valid   Unsigned divide DX:AX by r/m16, with result stored in AX ← Quotient, DX ← Remainder. 

which makes the result unpredictable: the second snippet never initializes DX, so it holds whatever value earlier code left there, and the upper half of EAX is ignored by the 16-bit division.
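As a rough illustration of how a leftover DX skews the result, the trace below assumes hypothetical values [wNum2] = 1000 and [wNum3] = 7, and uses an explicit mov dx, 1 as a stand-in for whatever earlier code might have left in DX.

```nasm
; Hypothetical values: [wNum2] = 1000 (0x03E8), [wNum3] = 7
mov   dx, 1                  ; stand-in for a stale value left in DX
movzx eax, word [wNum2]      ; EAX = 0x000003E8; DX is not touched
div   word [wNum3]           ; dividend = DX:AX = 0x000103E8 = 66536
                             ; AX = 9505 (quotient), DX = 1 (remainder)
                             ; the intended remainder was 1000 mod 7 = 6
```

If the stale DX were greater than or equal to the divisor, the quotient would not fit in AX and div would raise a divide error (#DE) instead of returning a wrong answer.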

Upvotes: 7

Peter Cordes

Reputation: 365971

No, MOVZX is zero extension, not sign. And CWD sign-extends AX into DX:AX (like you want before IDIV, not DIV).

movSx eax, word [wNum2] is a more efficient way to do mov ax,mem + CWDE, not CWD. (If your inputs are known to be non-negative when treated as signed, sign and zero extension do the same thing).

What does cltq do in assembly? has a table of cbw/cwde/cdqe and the equivalent movsx instruction, and what cwd/cdq/cqo do (and the equivalent mov/sar).
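For quick reference, the two equivalences mentioned above look roughly like this in NASM syntax. This is only a sketch of the relevant rows, not a copy of the linked table; wNum2 is the variable from the question.

```nasm
; Sign-extending load: one instruction instead of two
movsx eax, word [wNum2]     ; EAX = sign-extended 16-bit value

mov   ax, [wNum2]           ; same value in EAX, but the 16-bit write merges into the old EAX first
cwde                        ; EAX = sign-extended AX

; Sign-extending AX into DX:AX
cwd                         ; DX = 0xFFFF if bit 15 of AX is set, else 0x0000

mov   dx, ax                ; same DX result via copy and arithmetic shift
sar   dx, 15                ; fills DX with copies of bit 15 (unlike cwd, this writes FLAGS)
```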

None of these things are what you want before unsigned div: use xor edx,edx to zero DX, the high-half input for 32/16 => 16-bit division.

See also When and why do we sign extend and use cdq with mul/div?


To avoid false dependencies from writing partial registers, on most recent CPUs the most efficient thing would be to do a movzx load just to get your 16-bit value into AX without merging into the previous value of RAX/EAX. Similarly, xor-zeroing isn't (usually?) recognized as a zeroing idiom on partial registers, so you want 32-bit operand-size even if you're only going to read the low half of the register.

   movzx eax, word [wNum2]      ; zero-extend only to avoid false dep from merging into EAX
   xor   edx, edx               ; high half dividend = DX = 0
   div   word [wNum3]
   mov   [wAns16], dx           ; store remainder from DX, not EDX

Your code storing 32-bit EDX into [wAns16] is presumably a bug, assuming there's only 2 bytes of space there before you step on whatever comes after it.
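For anyone who wants to try the corrected sequence, here is a minimal standalone sketch. It assumes x86-64 Linux and NASM; the file name demo.asm, the data values 1000 and 7, and the neighbouring variable wNext are all made up for illustration. The remainder is returned as the process exit status so it can be checked from the shell.

```nasm
; Assumed build and run: nasm -felf64 demo.asm && ld -o demo demo.o && ./demo; echo $?
default rel

section .data
wNum2   dw 1000             ; hypothetical dividend
wNum3   dw 7                ; hypothetical divisor
wAns16  dw 0                ; 2 bytes reserved for the remainder
wNext   dw 0xBEEF           ; hypothetical neighbour that a 4-byte store would overwrite

section .text
global _start
_start:
    movzx eax, word [wNum2]     ; AX = 1000, upper half of EAX zeroed
    xor   edx, edx              ; DX = 0, the high half of the DX:AX dividend
    div   word [wNum3]          ; AX = 142 (quotient), DX = 6 (remainder)
    mov   [wAns16], dx          ; 16-bit store: wNext stays 0xBEEF

    movzx edi, word [wAns16]    ; exit status = stored remainder
    mov   eax, 60               ; Linux x86-64 sys_exit
    syscall
```

With these values the exit status should be 6. Changing the store to mov [wAns16], edx would write four bytes and replace wNext with the upper half of EDX.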

Upvotes: 3
