Reputation: 781
Is there a way on aarch64 to load a single byte from memory into a register without zero extending it? I know LDRB for example zero extends.
Upvotes: 0
Views: 226
Reputation: 58558
Not directly. A general principle of the ARM instruction set is that, for the majority of instructions, the destination register is completely overwritten. One benefit of this is that the instruction does not incur a read dependency on the destination register, making it more efficient to execute out-of-order.
So when you read a single byte from memory, the remaining 56 bits of the destination register have to be overwritten with something. Your options are:
ldrb w0, [x1] // bits 8:63 of x0 are cleared
ldrsb w0, [x1] // bits 8:31 set equal to bit 7, bits 32:63 are cleared
ldrsb x0, [x1] // bits 8:63 set equal to bit 7
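To make the three extension behaviours above concrete, here is a rough C model of the value that ends up in x0 after each load. This is only a sketch: the helper names (model_ldrb_w and so on) are made up for illustration and are not part of any library or of the instruction set.

```c
#include <stdint.h>

/* Hypothetical helpers: each returns the full 64-bit value that x0
   would hold after the corresponding load of the byte at p. */

/* ldrb w0, [x1]: the byte is zero-extended, bits 8..63 are cleared. */
static inline uint64_t model_ldrb_w(const uint8_t *p) {
    return (uint64_t)*p;
}

/* ldrsb w0, [x1]: sign-extend to 32 bits; writing w0 clears bits 32..63. */
static inline uint64_t model_ldrsb_w(const uint8_t *p) {
    /* Portable sign extension of an 8-bit value (no implementation-defined casts). */
    int32_t s = (*p & 0x80) ? (int32_t)*p - 256 : (int32_t)*p;
    return (uint64_t)(uint32_t)s;
}

/* ldrsb x0, [x1]: sign-extend to the full 64 bits. */
static inline uint64_t model_ldrsb_x(const uint8_t *p) {
    int64_t s = (*p & 0x80) ? (int64_t)*p - 256 : (int64_t)*p;
    return (uint64_t)s;
}
```

For a byte with bit 7 set, such as 0x80, the three models differ only in the upper bits; for a byte like 0x7F they all agree.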
If you want to do something else with the high bits, you'll need one or more extra instructions. For instance, if you want your loaded byte in the low 8 bits of a particular register while keeping bits 8:63 the same, you could do
ldrb w2, [x1]
bfi x0, x2, #0, #8
You can insert it at some other position besides #0 if you want. But you do need a scratch register (here x2) for the load instruction itself. (Note bfi and its relatives are an exception to the principle mentioned above, in that they do "merge" into the destination register.)
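If it helps to see the merge spelled out, the ldrb + bfi pair can be modelled in C roughly as below. The function names are hypothetical, and the model assumes 1 <= width and lsb + width <= 64, which matches the operand ranges bfi accepts for 64-bit registers.

```c
#include <stdint.h>

/* Hypothetical model of: bfi xd, xn, #lsb, #width
   Copies the low `width` bits of src into dst starting at bit `lsb`;
   every other bit of dst is preserved.
   Assumes 1 <= width and lsb + width <= 64. */
static inline uint64_t model_bfi(uint64_t dst, uint64_t src,
                                 unsigned lsb, unsigned width) {
    uint64_t field = (width == 64) ? ~UINT64_C(0)
                                   : ((UINT64_C(1) << width) - 1);
    uint64_t mask = field << lsb;
    return (dst & ~mask) | ((src << lsb) & mask);
}

/* Models the sequence:
     ldrb w2, [x1]
     bfi  x0, x2, #0, #8
   and returns the new value of x0. */
static inline uint64_t load_byte_merge(uint64_t x0, const uint8_t *p) {
    uint64_t x2 = *p;              /* scratch register holds the zero-extended byte */
    return model_bfi(x0, x2, 0, 8);
}
```

Note that dst is both an input and an output here, which is exactly the read dependency on the destination register that most other instructions avoid.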
On the SIMD side, it is possible to use ld1 to load into a particular lane of a SIMD register, keeping the remaining lanes unchanged. So you could do
ld1 {v0.b}[11], [x1]
to load a byte into element 11 of SIMD register 0.
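As a portable illustration of the lane load, the effect can be modelled in plain C with a 16-byte array standing in for the register. The vreg128 type and model_ld1_lane_b function are invented for this sketch; they are not real intrinsics.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit SIMD register viewed as 16 byte lanes. */
typedef struct { uint8_t b[16]; } vreg128;

/* Models: ld1 {v0.b}[lane], [x1]
   One byte lane is replaced by the byte at p; the other 15 lanes are kept.
   Assumes lane < 16. */
static inline vreg128 model_ld1_lane_b(vreg128 v, const uint8_t *p,
                                       unsigned lane) {
    v.b[lane] = *p;
    return v;
}
```

In real C code targeting AArch64 you would more likely reach for the vld1q_lane_u8 intrinsic from arm_neon.h, which takes the old vector as an input for the same reason: the untouched lanes have to come from somewhere.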
Upvotes: 3