Hěbą Ňąjàř

Reputation: 11

Why can't we use the high byte of R8 to R15?

I want to ask why we only work with the low byte of R8 to R15. Why can't we use the high byte? We can access the low byte, but not the high one.

Upvotes: 1

Views: 2856

Answers (1)

Peter Cordes

Reputation: 363980

As Jester already said in comments, there aren't any spare bits to encode r8h vs. r8b in the machine code.

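In many instructions (e.g. mov), r/m8 can't encode AH/BH/CH/DH if a REX prefix is used at all. See the Intel instruction-set reference manual, and look for this note: "In 64-bit mode, r/m8 can not be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH." With a REX prefix, those four encodings select SPL, BPL, SIL, and DIL instead. Any instruction that uses r8b to r15b needs a REX prefix, so it can't also name a high-byte register: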

mov  ah, r8b   ; not encodable

yasm gives the error message:

error: cannot use A/B/C/DH with instruction needing REX
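The rule can be modelled roughly in a few lines of Python. This is only a sketch of how the 3-bit register field of an 8-bit operand is interpreted; the function and table names are made up for illustration, and it is not how yasm or any other assembler is actually implemented.

```python
# Meaning of the 3-bit register field for 8-bit operands.
# Without a REX prefix, numbers 4..7 select the legacy high-byte registers.
LEGACY_BYTE_REGS = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
# With any REX prefix, numbers 4..7 select new low-byte registers instead.
REX_LOW_BYTE_REGS = ["al", "cl", "dl", "bl", "spl", "bpl", "sil", "dil"]


def byte_reg_name(field, rex_present=False, rex_ext=0):
    """Return the 8-bit register selected by a 3-bit register field.

    field       -- 3-bit register number (0..7)
    rex_present -- True if the instruction carries any REX prefix
    rex_ext     -- the REX.R or REX.B bit extending the field (0 or 1)
    (All names here are hypothetical, chosen for this example.)
    """
    if not 0 <= field <= 7:
        raise ValueError("register field is only 3 bits wide")
    if rex_ext and not rex_present:
        raise ValueError("an extension bit requires a REX prefix")
    if not rex_present:
        return LEGACY_BYTE_REGS[field]
    if rex_ext:
        # Extended registers: only the low byte is reachable.
        return "r%db" % (8 + field)
    return REX_LOW_BYTE_REGS[field]
```

For example, byte_reg_name(4) gives "ah", but byte_reg_name(4, rex_present=True) gives "spl". No combination of inputs produces both "ah" and "r8b" for the same instruction, and there is no field value left over that could mean "r8h".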

AMD decided it was more useful (and more orthogonal, and maybe cleaner to implement in hardware) to give 8-bit instructions access to the low byte of all 16 registers, instead of giving access to the low and high bytes of some set of 8 registers. It's not as if you can do xor ebx, [rsi + ah * 4]. Instead, you have to do movzx edx, ah / xor ebx, [rsi + rdx*4]. So being able to address the high byte doesn't typically help much.

It would probably be more useful to be able to address all 4, or even all 8, bytes of a single register (compared to having only AH/BH/CH/DH). An algorithm that wanted to do a 64-bit load and unpack the bytes separately could do that directly, instead of having to shift by 16 multiple times. (e.g. an error-correction algorithm doing LUT-based Galois-field multiplies for an array of GF16.)
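To make the "shift by 16 multiple times" pattern concrete, here is a small Python model of it. It is only an illustration of the data flow, with a made-up function name; real code would be a sequence of movzx from AL, movzx from AH, and shr rax, 16.

```python
def unpack_with_al_ah(rax):
    """Unpack the 8 bytes of a 64-bit value, low byte first.

    Models the sequence: read AL, read AH, shift right by 16, repeat.
    (Hypothetical helper, written for this example only.)
    """
    rax &= (1 << 64) - 1           # keep it to 64 bits, like a register
    out = []
    for _ in range(4):
        out.append(rax & 0xFF)         # what movzx reg, al would read
        out.append((rax >> 8) & 0xFF)  # what movzx reg, ah would read
        rax >>= 16                     # shr rax, 16 exposes the next two bytes
    return out
```

Each pass through the loop only reaches two bytes, which is why three or four shifts are needed to get through a full 64-bit load.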


So the main answer to your question is instruction-encoding limitations. If it weren't for that, we could have byte-addressable registers, so a lot of load/shift/mask code could just do something like movzx rdx, rax{5} to get the 5th byte of rax.
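In terms of values, such a hypothetical instruction would do what a shift and a mask do today. A minimal Python sketch, assuming byte 0 is the least-significant byte (the numbering is my assumption, since the syntax is invented), with a made-up function name:

```python
def extract_byte(reg, index):
    """Return byte `index` of a 64-bit value, zero-extended.

    Assumption: index 0 is the low byte and index 7 is the high byte.
    This is what a hypothetical movzx rdx, rax{index} would compute.
    """
    if not 0 <= index <= 7:
        raise ValueError("a 64-bit register has bytes 0..7")
    reg &= (1 << 64) - 1               # treat the input as a 64-bit register
    return (reg >> (8 * index)) & 0xFF  # shift the byte down, then mask it


# With rax = 0x1122334455667788, byte 5 (counting from 0) is 0x33.
value = extract_byte(0x1122334455667788, 5)
```

On current hardware this costs a separate shift (or a mov plus shift if the original value is still needed) before the movzx or and.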

I just invented that {} syntax for this example. AVX512 uses similar-looking {mask} syntax, but that's not what I'm talking about. (AVX512 will bring byte masks for instructions on vector registers, but the masks will be stored in 64-bit mask registers (k0-k7), not in the instruction encoding.)

Upvotes: 6
