Reputation: 1096
What would be the shortest Intel x86-64 opcode for setting rax to 1?
I tried xor rax,rax and inc al (in NASM syntax), which gives the 5-byte encoding 48 31 c0 fe c0. Would it be possible to achieve the same result in 4 bytes?
You can modify or read any other registers, but cannot assume that a specific value would be on any one of them from previous instructions.
Upvotes: 7
Views: 2862
Reputation: 366066
If you can rely on any known pre-conditions, there are some tricks that are more efficient (in terms of speed) than the 3-byte push imm8 / pop rax solution.
For speed, mov eax, 1 has many advantages, because it doesn't have any input dependencies and it's only one instruction. Out-of-order execution can get started on it (and anything that depends on it) without waiting for other stuff. (See Agner Fog's guides and the x86 tag wiki.)
Obviously many of these take advantage of the fact that writing a 32-bit register zeros the upper half of the full 64-bit register, which avoids the unnecessary REX prefix in the OP's code. (Also note that xor rax,rax is not special-cased as a zeroing idiom on Silvermont. It only recognizes xor-zeroing of 32-bit registers, like eax or r10d, not rax or r10.)
If you have a small known constant in any register to start with, you can use
lea eax, [rcx+1] ; 3 bytes: opcode + ModRM + disp8
disp8 can encode displacements from -128 to +127.
If you have an odd number in eax, and eax, 1 is also 3 bytes.
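As a rough sketch of the two precondition-based tricks above, in NASM syntax with the encodings in the comments. The register contents (rcx holding 0 or 5, eax holding an odd value) are assumed purely for illustration; the question does not guarantee any of them. The three instructions are independent alternatives, not a sequence.

```nasm
bits 64

; Alternative 1. Assumption (hypothetical): rcx is known to hold 0.
lea eax, [rcx+1]    ; 8d 41 01   eax = 0 + 1; the 32-bit write zeros the upper half of rax

; Alternative 2. Assumption (hypothetical): rcx is known to hold 5.
; Any known constant works as long as the needed offset fits in disp8 (-128..+127).
lea eax, [rcx-4]    ; 8d 41 fc   eax = 5 - 4

; Alternative 3. Assumption (hypothetical): eax is known to hold an odd value.
and eax, 1          ; 83 e0 01   keeps only bit 0, which is set for any odd value
```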
In 32-bit code, inc eax only takes one byte, but those inc/dec opcodes were repurposed as REX prefixes for AMD64. So xor eax,eax / inc eax is 4 bytes in x86-64 code, but only 3 in 32-bit code. Still, if saving 1 byte over the 5-byte mov eax,1 is sufficient, and LEA or AND won't work, this is more efficient than push/pop.
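For comparison, here are the two precondition-free options from this answer written out in NASM syntax, with the bytes an assembler would typically emit in 64-bit mode shown in the comments. Treat it as a size illustration rather than a benchmark; the two are alternatives, not a sequence.

```nasm
bits 64

; Option A: one instruction, no input dependencies, 5 bytes.
mov eax, 1      ; b8 01 00 00 00

; Option B: two instructions, 4 bytes total in 64-bit mode.
xor eax, eax    ; 31 c0   zeroing idiom; also clears the upper half of rax
inc eax         ; ff c0   2 bytes here; the 1-byte form (40) exists only in 32-bit code
```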
Upvotes: 2
Reputation: 9377
Since there is a byte immediate encoding for push and a one-byte pop for registers, this can be done in three bytes: 6a 01 58, or push $1 / pop %rax (in AT&T syntax).
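Since the question uses NASM syntax, the same sequence might look like this there. This sketch assumes the assembler picks the sign-extended imm8 form of push, which NASM does by default for small constants; writing push byte 1 requests that form explicitly.

```nasm
bits 64

push 1      ; 6a 01   push imm8, sign-extended to 64 bits on the stack
pop rax     ; 58      rax = 1 (all 64 bits are written)
```

The pair goes through the stack, which is why it is the smallest option but not the fastest one.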
Upvotes: 8