Reputation: 5912
Possible Duplicate:
What's the purpose of the LEA instruction?
When I need the value at an address I can use the effective address, e.g. push dword [str+4]. But when I need the address itself, I can't use push dword str+4 (which to me is the obvious and intuitive way to do it). Instead I need to use lea EAX, [str+4] and then push EAX. This is a bit confusing and also costs an extra processor instruction, albeit a 'zero-clock' one. (See this answer)
Is there some hardware level explanation for this difference, or is it just a quirk of (NASM) assembly syntax?
Edit:
Okay, so this comment asks the same question as mine, and it is answered in this comment just as in Lucero's answer: the x86 does not support such addressing.
(Editor's note: that's not the same question. str+4 is a link-time constant, unlike EBX + 8*EAX + 4 which involves registers.)
Upvotes: 3
Views: 2523
Reputation: 64904
This is more of a long comment (since it doesn't answer the question), but readers ought to know..
lea most certainly is not a zero-clock instruction. There are some of those, such as fxch (on everything with register renaming), nop (90 and 0F 1F) on Sandy Bridge, and certain idioms for setting a register to zero (xor or sub with itself, even for XMM registers), also on Sandy Bridge, and mov-ing registers to each other (but curiously not a no-op mov with the same 64-bit register as source and destination) on more recent processors. "Eliminated" moves and zeroing idioms still have a limited throughput (even if the limit is very high such as 5 per cycle), so they're not entirely free. Even more recently, Intel Golden Cove added additions with small immediate values to the operations handled during renaming, making them appear to have zero latency under some circumstances.
lea always takes at least one cycle so far (potentially some forms of lea, such as lea r1, [r2 + imm8], could be handled by the front-end on future processors, similar to add r, imm8 on Golden Cove). It is commonly executed on an ALU instead of an AGU (some AMD processors and Atom are exceptions), but even in the cases where it's executed on an AGU it still takes a cycle or more. lea can even take more than 1 cycle, such as scaled lea on P4, Sandy Bridge (seems like I'm mentioning SB a lot in this post..) or AMD processors. In fact, on AMD K10 the lea that goes to the AGU is the slow case, where it's scaled and/or has 3 arguments and takes a cycle longer than the fast one, which goes to an ALU.
Upvotes: 2
Reputation: 60190
push str+4 does work in 32-bit code, assuming str is a normal symbol/label like str:, not a macro alias for a register like %define str edi or something.
On a symbol address, str+4 is computed by the linker while building the executable, using a relocation entry in the .o object file created by NASM. The machine code for push str+4 includes the same 4-byte absolute address (as an imm32) that lea eax, [str+4] does (as a disp32 in the addressing mode). (This is not 64-bit code so RIP-relative addressing with default rel isn't a possibility.)
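To make the imm32/disp32 point concrete, here is a minimal Python sketch that hand-assembles both encodings. The address 0x0804A004 is a made-up stand-in for whatever the linker would resolve str+4 to, and the helper names are hypothetical. The opcode bytes (68 for push imm32, 8D 05 for lea eax, [disp32] in 32-bit mode) are the standard x86 encodings.

```python
import struct

# Hypothetical link-time value of str+4; a real address comes from the linker.
STR_PLUS_4 = 0x0804A004

def encode_push_imm32(addr):
    """push imm32: opcode 68, then the 4-byte immediate (little-endian)."""
    return bytes([0x68]) + struct.pack("<I", addr)

def encode_lea_eax_disp32(addr):
    """lea eax, [disp32]: opcode 8D, ModRM 05 (mod=00, reg=eax, rm=disp32)."""
    return bytes([0x8D, 0x05]) + struct.pack("<I", addr)

push_bytes = encode_push_imm32(STR_PLUS_4)
lea_bytes = encode_lea_eax_disp32(STR_PLUS_4)

# Print both encodings as hex so the shared address bytes are visible.
print(push_bytes.hex(" "))
print(lea_bytes.hex(" "))
```

Both byte strings end in the same four address bytes; only the leading opcode (and, for lea, the ModRM byte) differs.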
If str actually is a macro for a register, then the +4 computation would have to happen at run-time and you would indeed need a separate instruction from the push.
Assembly instructions directly represent x86 opcodes (no transforming compilation takes place as in higher-level languages). The opcodes have their limitations in what they can represent; as such, while address computations are possible as part of the x86 addressing, value computations are not. LEA covers this gap by storing the result of the address computation in any register instead of only consuming it internally.
Upvotes: 2
Reputation: 35803
Because that starts to look like C. The only place you can use that sort of inline addition is when addressing memory. LEA lets you "address" memory without actually accessing it, which can be very useful in protected mode where a small pointer misstep will kill your application (and maybe even more so in real mode, where a pointer misstep might kill DOS, Windows, the machine, or any number of other things). Assembly is a limited beast in which each instruction corresponds to a physical circuit. That the instructions are as general as they are is a small miracle of its own.
Upvotes: -1
Reputation: 941455
Just use the correct syntax. In MASM you need the offset keyword (NASM already treats a bare label as its address, so push dword str+4 works there):
push offset str+4
The LEA instruction is handy for using the plumbing of the address generation logic, giving very cheap ways to add and multiply that, on some processors, don't use the ALU. It is high on the list of tricks for programmers who write code generators. Not needed here, afaict.
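As a rough illustration of the add-and-multiply trick, this Python sketch models the arithmetic an x86 addressing mode performs (base + index*scale + displacement, wrapped to 32 bits). The function name and sample values are made up for illustration; it models the result only, not timing or which execution unit is used.

```python
def lea32(base=0, index=0, scale=1, disp=0):
    """Model the 32-bit result of lea reg, [base + index*scale + disp]."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale factor must be 1, 2, 4 or 8")
    return (base + index * scale + disp) & 0xFFFFFFFF

x = 7
# lea eax, [eax + eax*4] multiplies by 5 in a single instruction.
times5 = lea32(base=x, index=x, scale=4)
# lea eax, [eax + eax*8 + 1] computes 9*x + 1.
times9_plus1 = lea32(base=x, index=x, scale=8, disp=1)

print(times5, times9_plus1)
```

Unlike add or imul, lea also leaves the flags untouched and can write the result to a register other than its inputs.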
Upvotes: 3