Reputation: 165242
Consider this x86 assembly code:
section .data
foo:
    mov ebx, [boo]
    mov [goo], ebx
goo:
    mov eax, 2
    mov eax, 3
    ret
boo:
    mov eax, 4
    mov eax, 5
    ret
What exactly is going on here? When I dereference [boo] and mov it to [goo], what exactly am I moving there? Just one command? The ret as well?
Follow-up questions:
Does eax have a value of 3 or 5 at the end?
Upvotes: 7
Views: 10478
Reputation: 61795
The first mov copies four bytes from the offset boo, relative to the DS segment register. The second mov writes them to the offset goo, again relative to DS. If CS and DS refer to the same memory, the distinction can be ignored. Even then, you're next likely to run into various protection mechanisms that render code sections read-only.
RE followups:
eax ends up as 3. The first instruction at goo, mov eax, 2, does get patched: because of the little-endian encoding, its immediate 2 is replaced with 4. That value is then overwritten by the next instruction, which hasn't been modified at all, so 5 is never in the picture as a candidate. (I thought you were thinking the code gets reordered, given the way you first asked the question [1], though you clearly know quite a bit more, as I should have guessed from your rep :P) Note that all of this assumes that CS = DS and DEP isn't stepping in.
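To make the byte-level effect concrete, here is a minimal Python sketch that models the code as a flat byte array, performs the two 4-byte moves, and then interprets what is left at goo. It assumes CS = DS and writable code, as above. The memory layout, the helper names, and the tiny interpreter (it only understands b8 = mov eax, imm32 and c3 = ret) are made up for illustration and are not a real emulator.

```python
# Encodings used here: b8 imm32 = mov eax, imm32 ; c3 = ret.
def mov_eax(imm):
    return b"\xb8" + imm.to_bytes(4, "little")

RET = b"\xc3"

# Hypothetical flat layout: goo's code followed directly by boo's code.
goo_code = mov_eax(2) + mov_eax(3) + RET
boo_code = mov_eax(4) + mov_eax(5) + RET
memory = bytearray(goo_code + boo_code)
GOO = 0
BOO = len(goo_code)

# mov ebx, [boo]: load 4 bytes as a little-endian dword.
ebx = int.from_bytes(memory[BOO:BOO + 4], "little")
# mov [goo], ebx: store those same 4 bytes at goo.
memory[GOO:GOO + 4] = ebx.to_bytes(4, "little")

def run(mem, pc):
    """Interpret only mov eax, imm32 and ret; return the final eax."""
    eax = None
    while True:
        opcode = mem[pc]
        if opcode == 0xB8:
            eax = int.from_bytes(mem[pc + 1:pc + 5], "little")
            pc += 5
        elif opcode == 0xC3:
            return eax
        else:
            raise ValueError(f"unsupported opcode {opcode:#x}")

patched_first = bytes(memory[GOO:GOO + 5])  # first instruction at goo after the store
final_eax = run(memory, GOO)
print(patched_first.hex(" "), final_eax)
```

The first instruction at goo now carries the immediate 4, but the untouched second instruction still loads 3 before the ret.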
Also, if you were using BX instead of EBX, the sort of thing you were expecting would come into play: using xX instead of ExX accesses the low 2 bytes of the register (and xL accesses the lowest byte).
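The register aliasing can be shown with plain bit masks. A small Python sketch, assuming ebx holds the four bytes b8 04 00 00 loaded from boo in the question's listing; the variable names are only for illustration:

```python
# Value ebx holds after loading the bytes b8 04 00 00 (little-endian).
ebx = int.from_bytes(bytes.fromhex("b8040000"), "little")

bx = ebx & 0xFFFF        # BX: low 16 bits (low 2 bytes) of EBX
bl = ebx & 0xFF          # BL: lowest byte
bh = (ebx >> 8) & 0xFF   # BH: second-lowest byte

print(hex(ebx), hex(bx), hex(bl), hex(bh))
```

So a 16-bit mov bx, [boo] followed by mov [goo], bx would copy only the two bytes b8 04 rather than four.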
[1] Remember that an assembler is purely a tool for writing opcodes. Things like labels all get boiled down to numbers, with very little magic or impressive transformation of the code; there are no closures or anything deep of that nature lurking in there. (This is slightly oversimplifying: code can be relocatable, and in many cases fixups get applied to usages of offsets by a combination of the linker and the loader.)
Upvotes: 3
Reputation: 30439
Follow up answers:
It gives you the machine code starting at that address. How much of it you get depends on the size of your load; in this case it is 4 bytes.
It can be more than one command or only a fragment of a command. On this architecture (Intel x86), machine code commands are between 1 and 15 bytes (8 to 120 bits) long.
At the end, eax is 3.
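As an illustration of the fragment point, the following Python sketch lays out the instructions at boo from the question and reports which instructions a 4-byte load touches at a given offset. The instruction table and the helper `covered` are hypothetical, written only for this example.

```python
# (mnemonic, encoding) for the code at boo in the question's listing.
instructions = [
    ("mov eax, 4", bytes.fromhex("b804000000")),
    ("mov eax, 5", bytes.fromhex("b805000000")),
    ("ret", bytes.fromhex("c3")),
]

def covered(offset, size=4):
    """Return (mnemonic, bytes_touched, instruction_length) for each
    instruction that a load of `size` bytes at `offset` overlaps."""
    result = []
    start = 0
    for name, enc in instructions:
        end = start + len(enc)
        overlap = min(end, offset + size) - max(start, offset)
        if overlap > 0:
            result.append((name, overlap, len(enc)))
        start = end
    return result

print(covered(0))  # load at boo itself: part of one instruction
print(covered(3))  # load straddling two instructions
print(covered(7))  # tail of one instruction plus the whole ret
```

A load at boo itself touches only 4 of the 5 bytes of the first mov, while a load at other offsets can span pieces of two instructions.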
Upvotes: 2
Reputation: 61713
boo is the offset of the instruction mov eax, 4 inside section .data.
mov ebx, [boo] means “fetch four bytes from the offset indicated by boo into ebx”.
Likewise, mov [goo], ebx stores the content of ebx at the offset indicated by goo.
However, code is often read-only, so it wouldn't be surprising to see the code just crash.
Here is how the instructions at boo are encoded:
boo:
    b8 04 00 00 00    mov eax,0x4
    b8 05 00 00 00    mov eax,0x5
    c3                ret
So what you get in ebx is actually the first 4 of the 5 bytes of the mov eax, 4 instruction.
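A quick way to check this is to build the five-byte encoding and split it where a 4-byte load stops. A minimal Python sketch; the helper name `split_load` is made up for this example, and the immediate 4 is taken from the question's listing:

```python
def split_load(imm32):
    """For `mov eax, imm32` (b8 imm32, 5 bytes), return the dword a
    4-byte load sees and the bytes it leaves behind."""
    encoding = b"\xb8" + imm32.to_bytes(4, "little")
    loaded = int.from_bytes(encoding[:4], "little")  # first 4 of 5 bytes
    left_behind = encoding[4:]                       # the fifth byte
    return loaded, left_behind

ebx, rest = split_load(4)
print(f"{ebx:#010x}", rest.hex())
```

The opcode byte b8 ends up in the lowest byte of ebx, and the top byte of the immediate is the one that is not loaded.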
Upvotes: 10