qff

Reputation: 5912

Push the address of something+constant in NASM? Why is LEA required?

Possible Duplicate:
What's the purpose of the LEA instruction?

When I need the value at an address, I can use the effective address, e.g. push dword [str+4]. But when I need the address itself, I can't use push dword str+4 (which to me is the obvious and intuitive way to do it).

Instead I need to use lea EAX, [str+4] and then push EAX. This is a bit confusing and also costs an extra processor instruction, albeit a 'zero-clock' one. (See this answer)

Is there some hardware level explanation for this difference, or is it just a quirk of (NASM) assembly syntax?

Edit: Okay, so this comment asks the same question as me. And it is answered in this comment, just as in Lucero's answer: x86 does not support such addressing.
(Editor's note: that's not the same question. str+4 is a link-time constant, unlike EBX + 8*EAX + 4 which involves registers.)

Upvotes: 3

Views: 2523

Answers (4)

user555045

Reputation: 64904

This is more of a long comment (since it doesn't answer the question), but readers ought to know:

lea most certainly is not a zero-clock instruction. There are some of those, such as fxch (on everything with register renaming), nop (90 and 0F 1F) on Sandy Bridge, and certain idioms for setting a register to zero (xor or sub with itself, even for XMM registers), also on Sandy Bridge, and mov-ing registers to each other (but curiously not a no-op mov with the same 64-bit register as source and destination) on more recent processors. "Eliminated" moves and zeroing idioms still have a limited throughput (even if the limit is very high such as 5 per cycle), so they're not entirely free. Even more recently, Intel Golden Cove added additions with small immediate values to the operations handled during renaming, making them appear to have zero latency under some circumstances.

So far, lea always takes at least one cycle (potentially some forms of lea, such as lea r1, [r2 + imm8], could be handled by the front-end on future processors, similar to add r, imm8 on Golden Cove). It is commonly executed on an ALU instead of an AGU (some AMD processors and Atom are exceptions), but even in the cases where it is executed on an AGU it still takes a cycle or more. lea can even take more than 1 cycle, such as scaled lea on P4, Sandy Bridge (seems like I'm mentioning SB a lot in this post..) or AMD processors. In fact, on AMD K10 the lea that goes to the AGU is the slow case: it is scaled and/or has 3 arguments, and it takes a cycle longer than the fast one, which goes to an ALU.

Upvotes: 2

Lucero

Reputation: 60190

push str+4 does work in 32-bit code, assuming str is a normal symbol/label like str:, not a macro alias for a register like %define str edi or something.

On a symbol address, str+4 is computed by the linker while building the executable, using a relocation entry in the .o object file created by NASM. The machine code for push str+4 includes the same 4-byte absolute address (as an imm32) that lea eax, [str+4] does (as a disp32 in the addressing mode). (This is not 64-bit code so RIP-relative addressing with default rel isn't a possibility.)


If str actually is a macro for a register, then the +4 computation would have to happen at run-time and you would indeed need a separate instruction from the push.

Assembly instructions directly represent x86 opcodes (no transforming compilation takes place as in higher-level languages). The opcodes have their limitations in what they can represent; as such, while address computations are possible as part of the x86 addressing, value computations are not. LEA covers this gap by storing the result of the address computation in any register instead of only consuming it internally.
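To make the two cases concrete, here is a minimal sketch, assuming 32-bit Linux, NASM with ELF output, and a hypothetical label msg standing in for str (str is also an x86 mnemonic, so a different name sidesteps any ambiguity). The build commands in the comments are an assumed toolchain, not something from the question.

```nasm
; Assumed build: nasm -f elf32 demo.asm && ld -m elf_i386 -o demo demo.o
section .data
msg:    db "abcdEFGH", 10       ; msg+4 is the address of "EFGH" plus newline

section .text
global _start
_start:
    ; Case 1: msg is an ordinary label, so msg+4 is a link-time constant.
    push dword msg+4            ; push imm32, the linker fills in the address
    pop  ecx                    ; ecx = msg+4

    ; Case 2: the base lives in a register, so the add happens at run time.
    mov  edi, msg
    lea  eax, [edi+4]           ; eax = edi + 4, no memory is accessed
    push eax
    pop  esi                    ; esi now holds the same address as ecx

    ; write(1, ecx, 5) via the 32-bit int 0x80 ABI
    mov  eax, 4                 ; sys_write
    mov  ebx, 1                 ; stdout
    mov  edx, 5                 ; byte count
    int  0x80

    mov  eax, 1                 ; sys_exit
    xor  ebx, ebx               ; status 0
    int  0x80
```

Only the second case needs the extra instruction, because only there does the +4 depend on a run-time register value.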

Upvotes: 2

Linuxios

Reputation: 35803

Because that starts to look like C. The only place you can use that sort of inline addition is when addressing memory. LEA lets you "address" memory without actually accessing it, which can be very useful in protected mode, where a small pointer misstep will kill your application (and maybe even more so in real mode, where a pointer misstep might kill DOS, Windows, the machine, and any number of other things). Assembly is a limited beast in which each instruction corresponds to a physical circuit. That the instructions are as general as they are is a small miracle of its own.

Upvotes: -1

Hans Passant

Reputation: 941455

Just use the correct syntax; you need the offset keyword (this is MASM syntax; NASM has no offset keyword and takes push dword str+4 directly):

 push offset str+4

The LEA instruction is handy for using the plumbing of the address generation logic, giving very cheap ways to add and multiply that don't use the ALU. It is high on the list of tricks for programmers who write code generators. Not needed here, afaict.
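As a rough illustration of the add-and-multiply trick mentioned above, here is a small sketch in NASM syntax, assuming 32-bit Linux; the register choices and input values are arbitrary and only there for the example.

```nasm
; Assumed build: nasm -f elf32 lea.asm && ld -m elf_i386 -o lea lea.o
section .text
global _start
_start:
    mov  eax, 7
    lea  eax, [eax+eax*4]       ; eax = 7 * 5 = 35, flags are left untouched

    mov  ebx, 100
    mov  esi, 3
    lea  edx, [ebx+esi*8+4]     ; edx = 100 + 3*8 + 4 = 128, one instruction

    mov  ebx, eax               ; use the first result (35) as the exit status
    mov  eax, 1                 ; sys_exit
    int  0x80
```

The scale factor in an addressing mode can only be 1, 2, 4 or 8, so multipliers like 3, 5 and 9 come from adding the base register to a scaled copy of itself.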

Upvotes: 3
