Reputation: 93172
I wrote this little assembly file for amd64. What the code does is not important for this question.
.globl fib
fib: mov %edi,%ecx
xor %eax,%eax
jrcxz 1f
lea 1(%rax),%ebx
0: add %rbx,%rax
xchg %rax,%rbx
loop 0b
1: ret
Then I proceeded to assemble and then disassemble this on both Solaris and Linux.
$ as -o y.o -xarch=amd64 -V y.s
as: Sun Compiler Common 12.1 SunOS_i386 Patch 141858-04 2009/12/08
$ dis y.o
disassembly for y.o
section .text
0x0: 8b cf movl %edi,%ecx
0x2: 33 c0 xorl %eax,%eax
0x4: e3 0a jcxz +0xa <0x10>
0x6: 8d 58 01 leal 0x1(%rax),%ebx
0x9: 48 03 c3 addq %rbx,%rax
0xc: 48 93 xchgq %rbx,%rax
0xe: e2 f9 loop -0x7 <0x9>
0x10: c3 ret
$ as --64 -o y.o -V y.s
GNU assembler version 2.22.90 (x86_64-linux-gnu) using BFD version (GNU Binutils for Ubuntu) 2.22.90.20120924
$ objdump -d y.o
y.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <fib>:
0: 89 f9 mov %edi,%ecx
2: 31 c0 xor %eax,%eax
4: e3 0a jrcxz 10 <fib+0x10>
6: 8d 58 01 lea 0x1(%rax),%ebx
9: 48 01 d8 add %rbx,%rax
c: 48 93 xchg %rax,%rbx
e: e2 f9 loop 9 <fib+0x9>
10: c3 retq
How comes the generated machine code is different? Sun as generates 8b cf
for mov %edi,%ecx
while gas generates 89 f9
for the very same instruction. Is this because of the various ways to encode the same instruction under x86 or do these two encodings really have a particular difference?
Upvotes: 5
Views: 509
Reputation: 18247
You've not specified the operand size for the mov
, xor
and add
operations. This creates some ambiguity. The GNU assembler manual, i386 Mnemonics, mentions this:
If no suffix is specified by an instruction then as tries to fill in the missing suffix based on the destination register operand (the last one by convention). [ ... ] . Note that this is incompatible with the AT&T Unix assembler which assumes that a missing mnemonic suffix implies long operand size.
This implies the GNU assembler chooses differently - it'll pick the opcode with the R/M byte specifying the target operand (because the destination size is known/implied) while the AT&T one chooses the opcode where the R/M byte specifies the source operand (because the operand size is implied).
I've done that experiment though and given explicit operand sizes in your assembly source, and it doesn't change the GNU assembler output. There is, though, the other part of the documentation above,
Different encoding options can be specified via optional mnemonic suffix. `.s' suffix swaps 2 register operands in encoding when moving from one register to another.
which one can use; the following sourcecode, with GNU as
, creates me the opcodes you got from Solaris as
:
.globl fib
fib: movl.s %edi,%ecx
xorl.s %eax,%eax
jrcxz 1f
leal 1(%rax),%ebx
0: addq.s %rbx,%rax
xchgq %rax,%rbx
loop 0b
1: ret
Upvotes: 1
Reputation: 11716
Some x86 instructions have multiple encodings that do the same thing. In particular, any instruction that acts on two registers can have the registers swapped and the direction bit in the instruction reversed.
Which one a given assembler/compiler picks simply depends on what the tool authors chose.
Upvotes: 7