Why would one use "movl $1, %eax" as opposed to, say, "movb $1, %eax"?

As the title states, why would one use "movl $1, %eax" as opposed to, say, "movb $1, %eax"? I was told that movl would zero out the high-order bits of %eax, but isn't %eax a register whose size equals the system's word size? That would mean movl is in actuality an integer operation (and not a long one).

I'm clearly a bit confused about it all. Thanks.

Upvotes: 23

Views: 50164

Answers (6)

Jason

Reputation: 2371

opposed to, say, "movb $1, %eax"

This instruction is invalid. You can't use %eax with the movb instruction. Instead, either use an 8-bit register, or write the full register with a value whose low byte(s) hold what you want. For example:

movb  $1, %al        # AL = 1; the upper 3 bytes of EAX (7 bytes of RAX) keep their previous contents
movl  $1, %eax       # AL = 1, AX = 1, EAX = 1, RAX = 1
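
The difference between the two writes can be sketched with a small Python model. This is only an illustration, not real hardware access: the function names are made up for this example, and the 64-bit starting value is arbitrary. It assumes 64-bit mode, where writing a 32-bit register also zeroes the upper half of RAX.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def movb_to_al(rax: int, imm: int) -> int:
    """Model `movb $imm, %al`: replace bits 0-7 only, keep the rest of RAX."""
    return (rax & ~0xFF & MASK64) | (imm & 0xFF)

def movl_to_eax(rax: int, imm: int) -> int:
    """Model `movl $imm, %eax` in 64-bit mode: the old contents are discarded,
    bits 0-31 take the immediate and bits 32-63 become zero."""
    return imm & 0xFFFFFFFF

old = 0x1122334455667788             # arbitrary "previous contents" of RAX
print(hex(movb_to_al(old, 1)))       # 0x1122334455667701
print(hex(movl_to_eax(old, 1)))      # 0x1
```

The byte write leaves seven stale bytes behind, which is why it is the wrong tool when the register will later be read as a 32-bit or 64-bit value.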

but isn't %eax a register that's equivalent to the size of the system's wordsize?

No. EAX will always be a 32-bit value, regardless of what mode you're in.

You are confusing C variable sizes with register sizes. C variable sizes may change depending on your system and compiler.

Assembly is simpler than C. In GAS AT&T assembly, instructions are suffixed with the letters "b", "s", "w", "l", "q" or "t" to determine what size operand is being manipulated. (A register operand can also imply the size: mov $1, %eax implies movl, whereas mov $1, (%rdi) is ambiguous because neither operand has a size.)

  • b = byte (8 bit)
  • s = single (32-bit floating point) used only for x87 instructions like flds (mem)
  • w = word (16 bit)
  • l = long (32 bit doubleword integer), or x87 64-bit floating point
  • q = quad-word (64 bit)
  • t = ten bytes (80-bit floating point)

These sizes are constant. They will never be changed. al is always 8 bits and eax is always 32 bits.
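
As a rough illustration of why movb $1, %eax is rejected, here is a toy Python lookup that compares the width named by the suffix with the width of the register. The table and function names are hypothetical and this is not how GAS is implemented; only the integer suffixes are included, and "l" is taken in its 32-bit integer meaning.

```python
# Integer operand widths, in bits, for the AT&T suffixes listed above.
SUFFIX_BITS = {"b": 8, "w": 16, "l": 32, "q": 64}

# Fixed widths of the A register and its sub-registers.
REGISTER_BITS = {"al": 8, "ah": 8, "ax": 16, "eax": 32, "rax": 64}

def suffix_fits(suffix: str, register: str) -> bool:
    """Return True when the suffix width equals the register width."""
    return SUFFIX_BITS[suffix] == REGISTER_BITS[register.lstrip("%")]

print(suffix_fits("l", "%eax"))  # True:  movl $1, %eax
print(suffix_fits("b", "%al"))   # True:  movb $1, %al
print(suffix_fits("b", "%eax"))  # False: movb $1, %eax
```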

A "word" is always 16 bits in x86 terminology, and in modern x86 is unrelated to the concept of CPU or machine word size; x86 isn't a word-oriented architecture.

Upvotes: 45

Aaron Klotz

Reputation: 11585

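%eax is a 32-bit register. To use a smaller width, you need %ax for 16 bits. %ax can be further divided into %ah for the high byte of %ax, and %al for the lower byte. The same subdivision applies to %ebx, %ecx and %edx; the other x86 GPRs (%esi, %edi, %ebp, %esp) have 16-bit names but no high-byte register.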

Looking at the Intel instruction set reference for the mov instruction, I don't see a variant that can move a single byte into a 32-bit register -- it's probably interpreted as a move into %al.

Since movl is a 32-bit instruction, the values for the upper bytes will correspond to zeros in the case of an immediate value. If you were moving from memory you would be moving an entire 32-bit word.

%eax is not zeroed out unless you either movl $0, %eax or xorl %eax, %eax. Otherwise it holds whatever value was previously in there. When you movl $1, %eax, you will end up with 0x00000001 in the register because the 32-bit instruction moves a 32-bit immediate value into the register.

Upvotes: 7

DigitalRoss

Reputation: 146221

Your second choice will just produce an error; x86 doesn't have that instruction. x86 is somewhat unusual in how it loads bytes into certain registers. On most instruction set architectures the operand is zero- or sign-extended, but x86 lets you write just the low byte or low 16 bits of some registers.

There are certainly other choices, like clearing the register and then incrementing it, but here are three initially reasonable-looking choices you have:

   0:   b8 01 00 00 00          movl   $0x1,%eax

   5:   31 c0                   xorl   %eax,%eax
   7:   b0 01                   movb   $0x1,%al

   9:   b0 01                   movb   $0x1,%al
   b:   0f b6 c0                movzbl %al,%eax

The first is 5 bytes, the second 4, the third 5. So the second is the best choice if optimizing for space; otherwise I suppose the one most likely to run fast is the first one. x86 is deeply pipelined these days, so in the two-instruction sequences the second instruction depends on the first, and the machine may need quite a few wait states depending on details of the pipeline hardware.
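
The byte counts can be checked mechanically from the hex encodings in the listing above. A minimal Python sketch (the dictionary keys and the helper name are just labels chosen for this example):

```python
# Hex encodings copied from the disassembly listing, one string per instruction.
SEQUENCES = {
    "movl": ["b8 01 00 00 00"],              # movl $0x1,%eax
    "xorl+movb": ["31 c0", "b0 01"],         # xorl %eax,%eax ; movb $0x1,%al
    "movb+movzbl": ["b0 01", "0f b6 c0"],    # movb $0x1,%al ; movzbl %al,%eax
}

def encoded_length(instructions):
    """Total number of machine-code bytes in a sequence of hex strings."""
    return sum(len(bytes.fromhex(h)) for h in instructions)

for name, instructions in SEQUENCES.items():
    print(name, encoded_length(instructions))
# movl 5
# xorl+movb 4
# movb+movzbl 5
```

This only measures code size; it says nothing about which sequence runs fastest on a given CPU.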

Of course, these x86 ops are being translated in CPU-specific ways into CPU micro-ops, and so who knows what will happen.

Upvotes: 3

Eli Bendersky

Reputation: 273776

%eax is 32 bits on 32-bit machines. %ax is 16 bits, and %ah and %al are its 8-bit high and low constituents.

Therefore movl is perfectly valid here. Efficiency-wise, movl will be as fast as movb, and zeroing out the high 3 bytes of %eax is often a desirable property. You might want to use it as a 32-bit value later, so movb isn't a good way to move a byte there.

Upvotes: 2

Victor Shnayder

Reputation: 103

On a 32 bit machine, %eax is a 4 byte (32 bit) register. movl will write into all 4 bytes. In your example, it'll zero out the upper 3 bytes, and put 1 in the lowest byte. The movb will just change the low order byte.

Upvotes: 7

Anon.

Reputation: 60033

long was originally 32 bits, while int and short were 16. And the names of the opcodes don't change every time someone comes out with a new operating system.

Upvotes: 4
