Reputation: 29
I have some issues understanding two instructions I encountered.
The first one is as follows:
imul eax,DWORD PTR [esi+ebx*4-0x4]
Does this instruction mean: take the value at the address calculated between the brackets, multiply it by eax, and store the result in that same register (eax)? If so, do we calculate the address between the brackets like that?
The second instruction I have trouble decoding is this one:
jmp DWORD PTR [eax*4+0x80497e8]
- Is eax*4 equivalent to index * scale?
- Is 0x80497e8 the displacement?
So to get the address inside the brackets is this the order we should follow?
In my understanding, [base + index * scale] is used to fetch values inside an array. The base is the pointer to the first element in the array, the index is literally the index where the value we want is stored, and the scale is the size of the data the array contains.
My issue is when you add a displacement to the equation: what is the displacement used for? And what does it mean when the displacement has a negative value?
Upvotes: 2
Views: 2306
Reputation: 365277
Don't be fooled by the terminology. "base" has a specific technical meaning, and the "base" component of an addressing mode doesn't have to be the start of an array. For example, [esp + 16 + esi*4] could be indexing a local array that starts at esp+16, even though esp is the base register.
Similarly, the most obvious interpretation of [esi+ebx*4-0x4] is array[i-1], with i in EBX and esi holding the start address of the array. It's an obvious optimization for the compiler to fold the -1 into the addressing mode instead of computing ebx-1 in another register and using that as the index.
And what does it mean when the displacement has a negative value?
It doesn't "mean" anything. The hardware just does binary addition and uses the result. It's up to the programmer (or compiler) to use an addressing mode that accesses the byte you want.
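To make the "just binary addition" point concrete, here is a small C sketch, assuming 32-bit addresses. The helper name add_displacement is made up for illustration. Adding the two's-complement bit pattern of -4 (0xFFFFFFFC) wraps around modulo 2^32 and lands 4 bytes lower.

```c
#include <stdint.h>

/* 32-bit address arithmetic wraps modulo 2^32, like the hardware adder. */
uint32_t add_displacement(uint32_t base_plus_index, int32_t disp)
{
    /* Converting a negative int32_t to uint32_t yields its two's-complement
       bit pattern, e.g. -4 becomes 0xFFFFFFFC. */
    return base_plus_index + (uint32_t)disp;
}
```

For example, add_displacement(0x1000, -4) gives 0x0FFC, the same result as adding 0xFFFFFFFC and discarding the carry out of bit 31.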
My answer on "Referencing the contents of a memory location. (x86 addressing modes)" has examples of when you might use every possible addressing mode for array indexing, with either a pointer to an array or a static array (so you can hard-code the array start address as an absolute displacement).
In technical x86 addressing mode terminology:
displacement: the constant component of the addressing mode, encoded as a sign-extended disp8, or a disp32. (In 64-bit addressing modes, the disp32 is sign-extended to 64 bits.)
offset: the result of the esi+ebx*4-0x4 calculation: the offset relative to the segment base. (In a normal flat memory model with base=0, the offset = the whole address.)
People often use "offset" to describe the displacement, and usually there's no confusion because it's clear from context they're talking about a constant offset (using the English word offset in a sense other than x86 seg:off), but I like to stick to "displacement" to describe the displacement.
base: the non-index register component of the addressing mode, if there is one. (The encoding for "no base register" instead means there's a disp32, and you can think of that as a base. It implies the DS segment.)
This includes the case of having only an index and no base register: [esi*4] can only be encoded as [dword 0 + esi*4].
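The terminology above can be sketched in C as follows. This is an illustrative model of 32-bit addressing only, not of the machine encoding, and the function names are hypothetical. A disp8 is sign-extended before use, and a mode with no base register is modelled by passing 0 as the base so that the disp32 plays that role.

```c
#include <stdint.h>

/* Sign-extend an encoded disp8 byte to 32 bits (0xFC becomes -4). */
int32_t sign_extend_disp8(uint8_t encoded)
{
    return (encoded & 0x80) ? (int32_t)encoded - 256 : (int32_t)encoded;
}

/* offset = base + index*scale + displacement, computed modulo 2^32.
   Pass base = 0 for modes without a base register, such as [esi*4]. */
uint32_t effective_offset(uint32_t base, uint32_t index,
                          uint32_t scale, int32_t disp)
{
    return base + index * scale + (uint32_t)disp;
}
```

With made-up register values esi = 0x1000 and ebx = 3, effective_offset(0x1000, 3, 4, sign_extend_disp8(0xFC)) yields 0x1008. With eax = 3, effective_offset(0, 3, 4, 0x080497e8) yields 0x080497f4, the address of the fourth 4-byte entry after 0x80497e8.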
imul eax,DWORD PTR [esi+ebx*4-0x4]
Yes, eax *= memory source operand.
And yes, your address calculation is correct. Base + scaled index + signed displacement, resulting in a virtual address (see footnote 1).
"go to the address (result) and get the value inside it" is a weird way to describe it. "go to" would normally mean a control transfer, fetching the bytes as code. But that's not what happens; this is just a data load from that address, fully handled by hardware.
A modern x86 CPU (like Intel Skylake for example) decodes the imul eax, [esi+ebx*4 - 4] into two uops: an imul ALU operation and a load. The ALU operation has to wait for the load result. (Fun fact: the two micro-ops are actually micro-fused into a single uop for most of the pipeline, except for in the out-of-order scheduler. See https://agner.org/optimize/ for more.)
When the load uop runs, the address-generation unit (AGU) gets the 2 register inputs, the index scale factor (left shift by 2), and the immediate displacement (-4). The shift-and-add hardware in the AGU calculates the load address.
The next step inside the load execution unit is to use that address to load from L1d cache (which has the first-level L1dTLB virtual->physical cache basically built-in. L1d is virtually indexed, so the TLB lookup can happen in parallel with fetching the 8 tags+data of the indexed set of L1d cache). Assuming a hit in the L1dTLB and L1d cache, the load execution unit receives a load result ~5 cycles later.
That load result is forwarded to the ALU execution unit as a source operand. The ALU doesn't care whether it was imul eax, ebx or a memory source operand; that multiply uop is just dispatched to the ALU as soon as both input operands are ready.
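The three steps described above (address generation, load, multiply) can be modelled in C roughly like this. It is a software sketch over a simulated flat memory buffer, not a description of how the hardware is built, and the names agu, load32, imul32 and imul_mem are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Step 1 (AGU): base + (index << shift) + displacement.
   A scale factor of 4 corresponds to a left shift by 2. */
uint32_t agu(uint32_t base, uint32_t index, unsigned shift, int32_t disp)
{
    return base + (index << shift) + (uint32_t)disp;
}

/* Step 2 (load): read 4 bytes at addr from a simulated flat memory. */
int32_t load32(const unsigned char *mem, uint32_t addr)
{
    int32_t value;
    memcpy(&value, mem + addr, sizeof value);
    return value;
}

/* Step 3 (ALU): multiply, keeping the low 32 bits of the product.
   The conversion back to int32_t assumes a two's-complement target. */
int32_t imul32(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a * (uint32_t)b);
}

/* Model of: imul eax, DWORD PTR [esi + ebx*4 - 4] */
int32_t imul_mem(int32_t eax, const unsigned char *mem,
                 uint32_t esi, uint32_t ebx)
{
    uint32_t addr = agu(esi, ebx, 2, -4); /* address generation */
    int32_t loaded = load32(mem, addr);   /* data load, not a jump */
    return imul32(eax, loaded);           /* eax *= loaded value */
}
```

The multiply step takes two plain values, which mirrors the point that the ALU does not care whether its second operand came from a register or from a load.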
jmp DWORD PTR [eax*4+0x80497e8]
Yes, eax*4 is the scaled index.
Yes, 0x80497e8 is the disp32 displacement. In this case, the displacement component of the addressing mode is probably being used as the address of a static jump table. You can think of it as the base for this addressing mode.
jump to that address
Nope, load a new EIP value from that address. It's a memory-indirect jump because of the square brackets.
What you described would be
lea eax, [eax*4+0x80497e8] ; address calc
jmp eax ; jump to code at that address
There's no way to do the address computation and the jump to the computed address in one instruction; for an indirect jump, the new EIP value always has to be in a register or fetched as data from memory.
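A C analogue of the memory-indirect jump is dispatch through a static table of function pointers. The sketch below is only an illustration: the handler names and return values are invented, and C expresses this as a call rather than a jmp. The key point it shows is that the table entry is first loaded as data, and only then does control transfer to the loaded address.

```c
#include <stddef.h>

typedef int (*handler_fn)(void);

int handle_zero(void) { return 100; }
int handle_one(void)  { return 101; }
int handle_two(void)  { return 102; }

/* Static table of code addresses, playing the role of the table that the
   displacement 0x80497e8 presumably points at in the question. */
handler_fn const jump_table[] = { handle_zero, handle_one, handle_two };

int dispatch(unsigned selector)
{
    size_t count = sizeof jump_table / sizeof jump_table[0];
    if (selector >= count)
        return -1; /* out of range: no table entry to load */

    /* Data load: fetch the target address from table base + selector*scale. */
    handler_fn target = jump_table[selector];

    /* Control transfer to the address that was loaded. */
    return target();
}
```

Here dispatch(2) returns 102. Compilers commonly generate this kind of table for a dense switch statement, which is a likely origin of the jmp in the question, though that is a guess about the original program.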
Footnote 1: We're assuming a flat memory model (segment base = 0), so we can ignore segmentation, like normal for code running under a normal OS like Linux, Windows, OS X, or pretty much any 32 or 64-bit OS. So the address calculation gives you a linear address.
I'm also assuming that paging is enabled, like normal under a mainstream OS, so it's a virtual address that has to be translated to physical, by the page tables cached by the TLB.
Upvotes: 5