glenjoker

Reputation: 91

What is the difference between MOV and LEA in terms of retrieving an address

What exactly is the difference between mov and lea when I use them to get an address?

Let's say I have a program that prints a character string starting from its 5th character, with the code shown below:

section .text
    global _start
_start:
    mov edx, 0x06  ;the length of msg from its 5th char to the last is 6.
    lea ecx, [msg + 4]
    mov ebx, 1
    mov eax, 4
    int 0x80

section .data
msg db '1234567890'

Then, if I swap lea ecx, [msg + 4] for mov ecx, msg + 4, would it run differently?

I tried both and the output appeared to be the same. However, in the comments under the first answer to "What's the purpose of the LEA instruction?", someone seemed to claim that something like mov ecx, msg + 4 is invalid, and I don't see why. Can someone help me understand this? Thanks in advance!

Upvotes: 5

Views: 2657

Answers (1)

Peter Cordes

Reputation: 364180

When the absolute address is a link-time constant, mov r32, imm32 and lea r32, [addr] will both get the job done, so mov ecx, msg + 4 is valid. The imm32 can be any valid NASM expression, and msg + 4 is a link-time constant. The linker finds the final address of msg and adds 4 to it (the 4-byte placeholder in the .o already holds the +4). That final value replaces the placeholder when the linker copies the bytes from the .o into its output.

Exactly the same thing happens to the 4-byte displacement in lea's effective address.


mov has a slightly shorter encoding (no ModRM byte) and can run on more execution ports; see also https://uops.info/. The listing below was assembled and linked, then disassembled into GAS Intel syntax, not NASM:

08049000 <foo>:
 8049000:       b8 00 90 04 08          mov    eax,0x8049000
 8049005:       8d 05 00 90 04 08       lea    eax,ds:0x8049000

Use mov reg, imm unless you can take advantage of lea to do some useful math with registers at the same time. (for example: lea ecx, [msg + 4 + eax*4 + edx])
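As a rough illustration of that case, here is a minimal sketch for 32-bit Linux in which the offset into msg is only known in registers. The index and offset values (1 and 2) are made up for the example; the point is that a single lea adds the link-time constant, a scaled index, and a second register, which a mov with an immediate cannot do.

```nasm
; Sketch only: 32-bit Linux, NASM syntax, same int 0x80 ABI as the question.
section .text
    global _start
_start:
    mov    eax, 1                     ; hypothetical element index (scaled by 4 below)
    mov    edx, 2                     ; hypothetical extra byte offset
    lea    ecx, [msg + eax*4 + edx]   ; ecx = msg + 1*4 + 2 = msg + 6, the address of '7'
    mov    edx, msgsize - 6           ; 4 bytes remain from msg+6 to the end
    mov    ebx, 1                     ; stdout
    mov    eax, 4                     ; NR_write
    int    0x80                       ; write(1, msg+6, 4)

    mov    eax, 1                     ; NR_exit
    xor    ebx, ebx
    int    0x80                       ; exit(0)

section .rodata
msg db '1234567890'
msgsize equ $-msg
```

Without lea, the same address would take a mov ecx, msg followed by separate shift and add instructions.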


In 64-bit mode, where RIP-relative addressing is possible, using LEA lets you make efficient position-independent code (that doesn't need to be modified if mapped to a different virtual address). There's no way to achieve that functionality with mov. See "How to load address of function or label into register" and "Referencing the contents of a memory location. (x86 addressing modes)".
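A minimal sketch of the RIP-relative form, assuming x86-64 Linux and the native syscall interface rather than int 0x80 (so the register usage and call numbers differ from the 32-bit code in the question). The rel keyword asks NASM for a RIP-relative effective address; putting default rel at the top of the file should have the same effect for every such operand.

```nasm
; Sketch only: x86-64 Linux, NASM syntax.
section .text
    global _start
_start:
    lea    rsi, [rel msg + 4]   ; address computed relative to RIP, no absolute address in the code
    mov    edx, msgsize - 4     ; byte count (writing edx zero-extends into rdx)
    mov    edi, 1               ; stdout
    mov    eax, 1               ; __NR_write on x86-64
    syscall                     ; write(1, msg+4, msgsize-4)

    mov    eax, 60              ; __NR_exit on x86-64
    xor    edi, edi
    syscall                     ; exit(0)

section .rodata
msg db '1234567890'
msgsize equ $-msg
```

Because the displacement is relative to the instruction's own address, the machine code stays correct wherever the loader maps the image.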

Also see the x86 tag wiki for many good links.


Also note that you can use a symbolic constant for the size. You can also format and comment your code better. (indenting the operands looks less messy in code that has some instructions with longer mnemonics).

section .text
    global _start
_start:
    mov    edx, msgsize - 4
    mov    ecx, msg + 4     ; In MASM syntax, this would be mov ecx, OFFSET msg + 4
    mov    ebx, 1       ; stdout
    mov    eax, 4       ; NR_write
    int    0x80         ; write(1, msg+4, msgsize-4)

    mov    eax, 1       ; NR_exit
    xor    ebx, ebx     ; exit status 0 (the first arg goes in ebx, which still holds 1 from write)
    int    0x80         ; exit(0)
    ;; otherwise execution falls through into non-code and segfaults

section .rodata
msg db '1234567890'     ; note, not null-terminated, and no newline
msgsize equ $-msg       ; current position - start of message

Upvotes: 8
