Reputation: 87
I just had a go at this problem which asks you to explain what is wrong with the line of code:
movl %eax, %rdx
The solution says the destination operand is the wrong size.
Is it only "illegal" if going from a larger size to a smaller size, or is it the case that source and destination operands must be the same size for all instructions (or at least mov class types)?
Upvotes: 2
Views: 818
Reputation: 365517
Yes, operands have to be the same size except for a few special instructions like shl %cl, %eax or movzwl %ax, %edx.
CPUs execute machine code, not assembly. In machine code, the opcode and prefixes (along with the default provided by being in 64-bit mode) specify one operand-size for the whole instruction. There are no separate size attributes for each operand; that would be a waste of bits.
Assembly language is a text format for describing / specifying machine instructions.
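To make the "one operand-size per instruction" point concrete, here is a minimal sketch of the same register-to-register mov at each operand size, with the bytes I'd expect GNU as to emit in 64-bit mode shown in comments. The label name and file name are made up for illustration; you can check the encodings yourself by assembling the file and disassembling it with objdump -d.

```asm
# Sketch only: assemble with `as sizes.s -o sizes.o` (GNU as, x86-64 target),
# then inspect with `objdump -d sizes.o`. File and label names are hypothetical.
        .text
        .globl  mov_sizes
mov_sizes:
        movb    %al,  %dl          # 88 c2       byte-size opcode
        movw    %ax,  %dx          # 66 89 c2    operand-size prefix selects 16-bit
        movl    %eax, %edx         # 89 c2       default 32-bit operand size
        movq    %rax, %rdx         # 48 89 c2    REX.W prefix selects 64-bit
        ret
```

The ModRM byte (c2) that names the two registers is identical in all four lines. Only the opcode or prefix changes, and it applies to both operands at once, so there is nowhere to encode "32-bit source, 64-bit destination" for a plain mov.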
The ISA designers (Intel, then AMD for 64-bit mode) chose to define the semantics of partial registers in the instruction set in terms of narrow operand sizes. Writing AL, AH, or AX merges into the full register (defined by Intel in 8086 and 80386, with AMD64 matching that), while writing EAX implicitly zero-extends into RAX (new semantics in AMD64 for the part of the register that was new).
See Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register? and Intel's instruction-set reference for instructions like mov vs. movsx (and movsxd, which was new with AMD64).
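A short sketch of those merging vs. zero-extending rules, with the resulting RAX value after each write in comments. The label name is hypothetical and the immediates are arbitrary; single-step it in a debugger if you want to watch RAX change.

```asm
# Sketch only (GNU as, AT&T syntax, 64-bit mode). Label name is hypothetical.
        .text
        .globl  partial_writes
partial_writes:
        movq    $-1, %rax              # RAX = 0xFFFFFFFFFFFFFFFF
        movb    $0x11, %al             # merge:       RAX = 0xFFFFFFFFFFFFFF11
        movb    $0x22, %ah             # merge:       RAX = 0xFFFFFFFFFFFF2211
        movw    $0x3333, %ax           # merge:       RAX = 0xFFFFFFFFFFFF3333
        movl    $0x44444444, %eax      # zero-extend: RAX = 0x0000000044444444
        ret
```

Every instruction here still has a single operand-size. What differs is only what the ISA says happens to the rest of the 64-bit register.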
There's no machine code that corresponds to the asm text movl %eax, %rdx. An assembler correctly tells you that's meaningless by rejecting it.
That's also why AT&T syntax attaches an operand-size suffix (b/w/l/q) to the mnemonic, not to each operand separately. The only ambiguity, and so the only case where the suffix is really needed, is instructions with no register operands, just immediate and memory, like andl $1, (%rdi) vs. andb $1, (%rdi), or notq (%rdi, %rsi, 8).
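As a sketch of where the suffix carries information and where it's redundant (label name is hypothetical; the last case is left commented out because what happens to it depends on the assembler and version: it may be rejected, warned about, or given a default size):

```asm
# Sketch only (GNU as, AT&T syntax, 64-bit mode). Label name is hypothetical.
        .text
        .globl  mem_imm_sizes
mem_imm_sizes:
        andb    $1, (%rdi)         # read-modify-write of 1 byte at [rdi]
        andl    $1, (%rdi)         # read-modify-write of 4 bytes at [rdi]
        notq    (%rdi,%rsi,8)      # inverts 8 bytes at [rdi + rsi*8]
        andl    $1, %eax           # suffix is redundant: EAX already implies 32-bit
        and     $1, %eax           # same instruction as the line above
        # and   $1, (%rdi)         # no register, no suffix: operand-size is ambiguous
        ret
```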
There is an instruction movslq %eax, %rdx (movsxd in Intel syntax) to sign-extend from 32 to 64-bit. (There is no movzx for 32 to 64; that's implicit in mov %eax, %edx: see MOVZX missing 32 bit register to 64 bit register. movzx only takes 8 or 16-bit source operands, like movzbl %al, %edx.)
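Putting the extension instructions side by side, as a sketch with an arbitrary input value chosen so the top bit of EAX is set (label name is hypothetical):

```asm
# Sketch only (GNU as, AT&T syntax, 64-bit mode). Label name is hypothetical.
        .text
        .globl  extend_32_to_64
extend_32_to_64:
        movl    $0x800000FF, %eax  # RAX = 0x00000000800000FF
        movslq  %eax, %rdx         # sign-extend 32 -> 64: RDX = 0xFFFFFFFF800000FF
        movl    %eax, %edx         # zero-extend 32 -> 64: RDX = 0x00000000800000FF
        movzbl  %al,  %ecx         # zero-extend  8 -> 32: RCX = 0x00000000000000FF
        ret
```

Note that in movslq and movzbl the mnemonic carries two size letters (source, then destination), which is how AT&T syntax spells the few instructions whose operands really do differ in width.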
There are other special instructions with different sized operands, for example shifts like shl %cl, %edx.
This syntax design works well for loads / stores like mov %eax, (%rdi) or add (%rdi), %esi, where the register operand implies the memory operand size. If mov could have two separate sizes, you'd always need to indicate the memory operand-size, like you do as part of the mnemonic for movzx/movsx (e.g. AT&T syntax movzbl (%rdi), %eax to zero-extend a byte from memory into RAX implicitly, by explicitly writing EAX).
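A sketch of how the register operand sets the memory access width, next to the movzx/movsx forms where the mnemonic has to spell out the memory size instead (label name is hypothetical):

```asm
# Sketch only (GNU as, AT&T syntax, 64-bit mode). Label name is hypothetical.
        .text
        .globl  load_store_sizes
load_store_sizes:
        mov     %eax, (%rdi)       # 4-byte store: width comes from EAX
        mov     %ax,  (%rdi)       # 2-byte store: width comes from AX
        add     (%rdi), %esi       # 4-byte load, added into ESI
        add     (%rdi), %rsi       # 8-byte load, added into RSI
        movzbl  (%rdi), %eax       # 1-byte load (b), zero-extended to 32 bits (l), hence all of RAX
        movswq  (%rdi), %rax       # 2-byte load (w), sign-extended to 64 bits (q)
        ret
```

The first four lines need no suffix at all. The last two can't drop the source-size letter, because nothing else in the text says how many bytes to load.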
Other designs for a text syntax to describe machine code would be possible, e.g. you could invent a syntax where movl %eax, %rdx is just making the zero-extension explicit, and perhaps movl %eax, %edx wouldn't be allowed because there's no way to write a 32-bit register without implicitly zero-extending to 64-bit. And then you could define movl (%rdi), %rdx as being a 32-bit load (implied by the l suffix) zero-extending into the 64-bit RDX, i.e. what we currently write in AT&T syntax as movl (%rdi), %edx.
I think that hypothetical design would be less intuitive than just saying most instructions require all their operands to be the same width. And in practice it's not how AT&T syntax is designed. Instead, AT&T went with the same conventions as Intel / AMD use in their manuals, just with operand order reversed. I'm not aware of any syntax for any ISA that works the hypothetical way; when writing a narrow register implicitly zero-extends, that's left implicit in the asm text syntax (e.g. in AArch64, and all the various x86-64 syntaxes: Intel, AT&T, Plan9/Go).
References:
Is there a default operand size in the x86-64 (AMD64) architecture? - yes, in 64-bit mode it's 32-bit for most opcodes, except for the byte operand-size opcodes. 16 and 64-bit sizes are signalled by prefixes.
Why is default operand size 32 bits in 64 mode? - and that choice makes sense because 32-bit int is large enough and commonly used, among other historical reasons.
Intel's manuals and other links in https://stackoverflow.com/tags/x86/info
Why doesn't MOVZX work when operands have the same size? - it actually does, but inefficiently so most assemblers reject it.
Upvotes: 5