Reputation: 667
I'm trying to make a simple x86 disassembler (32-bit for now) for learning purposes.
So the Intel docs go:
But I find this very confusing.
First of all, the m8-32 operands seem to indicate either ES:(E)DI or DS:(E)SI. But there's no telling in which situations one or the other would be the case.
In some opcodes you have OPCODE m8, m8, in others you have only one operand that's m8, and after checking several of them, I've come to the conclusion that there's no general rule.
Then there are these others, that are simply described as memory operand in memory, which leave me even more confused. Is there supposed to be a displacement, maybe an absolute address or relative offset? If so what's even the point, since we have moffs and rel?
The ones after make some sense, but is the number after the colon a displacement?
The ampersand ones leave me completely clueless though.
Besides that, there are these m[number][descriptor], which as far as I can see are for FPU? (I haven't been dealing with the 0Fh escaped opcodes yet).
Sorry if I'm missing something really obvious, as I often do.
Thanks in advance.
Upvotes: 3
Views: 883
Reputation: 364532
Normal instructions like add that can use a memory operand also work with registers, so ADD has encodings for add r32, r/m32 and add r/m32, r32. add eax, ecx can use either encoding / opcode (doesn't matter).
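To make the "either encoding" point concrete for a disassembler, here is a minimal Python sketch that decodes only the register-register forms of the two ADD opcodes, 01 /r (add r/m32, r32) and 03 /r (add r32, r/m32). The function name decode_add_reg_reg and the REG32 table are made up for illustration; memory forms, prefixes and other operand sizes are deliberately left out.

```python
# Register numbers as they appear in the ModR/M reg and r/m fields (32-bit operand size).
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_add_reg_reg(code: bytes) -> str:
    """Decode ADD 01 /r or 03 /r when ModR/M selects a register (mod == 3)."""
    opcode, modrm = code[0], code[1]
    mod = modrm >> 6          # top two bits
    reg = (modrm >> 3) & 7    # middle three bits
    rm = modrm & 7            # low three bits
    if mod != 3:
        raise ValueError("memory forms are not handled in this sketch")
    if opcode == 0x01:        # ADD r/m32, r32: destination is the r/m field
        dst, src = REG32[rm], REG32[reg]
    elif opcode == 0x03:      # ADD r32, r/m32: destination is the reg field
        dst, src = REG32[reg], REG32[rm]
    else:
        raise ValueError("not ADD 01 /r or 03 /r")
    return f"add {dst}, {src}"

# Two different byte sequences, same instruction.
print(decode_add_reg_reg(bytes([0x01, 0xC8])))
print(decode_add_reg_reg(bytes([0x03, 0xC1])))
```

Both calls print add eax, ecx: the opcode only decides which ModR/M field is the destination.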
That's why m32 (and not r/m32) is usually only an implicit operand for movsd or stosd or other string instructions, and why Intel says they normally use ES:(E)DI or DS:(E)SI.
First of all, the m8-32 operands seem to indicate either ES:(E)DI or DS:(E)SI. But there's no telling in which situations one or the other would be the case.
m32 means a 32-bit memory operand, which can't be a register instead. Look at the entries for specific instructions to see how the operand(s) are specified: some are implicit (e.g. DS:(E/R)SI for lodsb/w/d/q), while others use a ModR/M operand but require it to be memory.
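For the string instructions specifically, a disassembler usually ends up with a per-opcode table, because the implicit operands are fixed by the instruction rather than encoded in any byte. The sketch below assumes 32-bit mode; the table name STRING_OPS, the function describe_string_op and its operand_size parameter are hypothetical, and the address-size prefix (which would select SI/DI instead of ESI/EDI) is not modelled.

```python
# Base (byte-sized) opcode -> (mnemonic stem, implicit memory operands in manual notation).
STRING_OPS = {
    0xA4: ("movs", ["ES:(E)DI", "DS:(E)SI"]),  # copy from DS:(E)SI to ES:(E)DI
    0xA6: ("cmps", ["DS:(E)SI", "ES:(E)DI"]),  # compare the two memory operands
    0xAA: ("stos", ["ES:(E)DI"]),              # store AL/AX/EAX to ES:(E)DI
    0xAC: ("lods", ["DS:(E)SI"]),              # load AL/AX/EAX from DS:(E)SI
    0xAE: ("scas", ["ES:(E)DI"]),              # compare AL/AX/EAX with ES:(E)DI
}

def describe_string_op(opcode: int, operand_size: int = 32):
    """Return (mnemonic, implicit memory operands) for opcodes A4..A7 and AA..AF."""
    base = opcode & 0xFE                  # the low bit only selects byte vs. full size
    if base not in STRING_OPS:
        raise ValueError("not a string instruction covered by this sketch")
    stem, operands = STRING_OPS[base]
    if opcode & 1 == 0:
        suffix = "b"                      # even opcode: byte operand
    else:
        suffix = {16: "w", 32: "d"}[operand_size]  # 16 would come from a 0x66 prefix
    return stem + suffix, operands

print(describe_string_op(0xA5))
print(describe_string_op(0xAC))
print(describe_string_op(0xAB, operand_size=16))
```

So the "which one is it" question from the manual's m8/m16/m32 notation is answered per mnemonic, not by a general rule.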
For x87, the extra annotation tells you how the instruction interprets it. e.g. m32fp is a 32-bit IEEE single-precision float (e.g. for fmul or fld), while m32int is a 32-bit integer (e.g. for fimul or fild).
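As a rough illustration of why the annotation matters, the same four bytes in memory give different values depending on whether they are read as m32fp or m32int. This Python sketch uses the standard struct module to mimic the two interpretations; it only demonstrates the data formats and does not emulate any x87 instruction.

```python
import struct

# Four little-endian bytes as they would sit in memory: the dword 0x3FC00000.
raw = bytes.fromhex("0000c03f")

# m32fp view: IEEE single-precision float (what fld / fmul would read).
as_fp = struct.unpack("<f", raw)[0]

# m32int view: signed 32-bit integer (what fild / fimul would read).
as_int = struct.unpack("<i", raw)[0]

print(as_fp)         # the bit pattern 0x3FC00000 is exactly 1.5 as a float
print(hex(as_int))   # the same bits as an integer: 0x3fc00000
```

A disassembler doesn't need this distinction to decode the bytes; it only affects how you print or annotate the operand.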
Other than x87, the number just tells you the operand-size. That's all.
Normally memory operands are specified with the usual ModR/M + optional SIB. The only exceptions are implicit addressing modes (like pop rax reading qword [rsp], or the string instructions), or the moffs forms of MOV which skip the ModR/M byte and just use a 16/32/64-bit offset (the same width as the address size).
mov al/ax/eax/rax, [moffs8/16/32/64] (or the store form) is the only instruction that can use a 64-bit absolute address directly, without putting it in a register first.
Note that moffs8 is an 8-bit operand, not an 8-bit immediate address. The address-size attribute of the instruction (default 64-bit in 64-bit mode, overrideable with a 0x67 address-size prefix) determines how many bytes of absolute address follow the opcode.
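Since the question is about a 32-bit disassembler, here is a small Python sketch of decoding the moffs forms of MOV (opcodes A0 to A3) in 32-bit mode, where the default address size is 32 bits and a 0x67 prefix drops it to 16. The function name decode_mov_moffs is made up, segment overrides and other prefixes are ignored, and the output omits the default DS segment.

```python
def decode_mov_moffs(code: bytes) -> str:
    """Decode MOV A0..A3 (moffs forms) assuming 32-bit mode defaults."""
    i = 0
    addr_size = 32      # default address size in 32-bit mode
    opnd_size = 32      # default operand size in 32-bit mode
    # Only the two size-override prefixes are handled in this sketch.
    while i < len(code) and code[i] in (0x66, 0x67):
        if code[i] == 0x66:
            opnd_size = 16
        else:
            addr_size = 16
        i += 1
    opcode = code[i]
    i += 1
    if opcode not in (0xA0, 0xA1, 0xA2, 0xA3):
        raise ValueError("not a moffs form of MOV")
    nbytes = addr_size // 8             # address bytes follow the opcode directly
    if len(code) < i + nbytes:
        raise ValueError("truncated instruction")
    addr = int.from_bytes(code[i:i + nbytes], "little")
    # A0/A2 use AL (that is the '8' in moffs8); A1/A3 use AX or EAX.
    if opcode in (0xA0, 0xA2):
        reg = "al"
    else:
        reg = "eax" if opnd_size == 32 else "ax"
    mem = f"[0x{addr:x}]"
    # A0/A1 load from memory, A2/A3 store to memory.
    return f"mov {reg}, {mem}" if opcode < 0xA2 else f"mov {mem}, {reg}"

print(decode_mov_moffs(bytes([0xA1, 0x78, 0x56, 0x34, 0x12])))
print(decode_mov_moffs(bytes([0x67, 0xA0, 0x34, 0x12])))
```

Note how the second example still reads a 2-byte address for an 8-bit operand: the operand width and the address width are independent.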
The assembler will take care of this for you, and use the moffs encoding when it saves code-size for mov eax, [symbol] in 32-bit code. In general, just write addressing modes the normal way (see "Referencing the contents of a memory location. (x86 addressing modes)") and let the assembler generate ModR/M bytes, or warn you if you do something illegal (not encodeable) like try to use movsb with different registers.
For more about x86 asm, see the x86 tag wiki. Also, Agner Fog's guides are very good, although he doesn't attempt to cover basic stuff like this. However, reading Agner's guides and seeing what he says about his short examples (a couple instructions long) will help you make sense of how asm works.
Upvotes: 4
Reputation: 667
I've just found that ref.x86asm.net has a "geek" edition of its tables.
The opcodes are described here.
The geek edition is not ambiguous the way the coder edition is.
Still, if someone could direct me to where one would learn this by himself, it would be greatly appreciated. I don't seem to be able to find it in the intel docs, or anywhere else besides x86asm.
Again, I often miss stuff, so in case I find something I will edit.
Hope I could help, have a nice one.
Upvotes: 0