Assembly 32-bit addressing size instead of 64-bit in 64-bit Mode

Question

I have a question about using 32-bit address size instead of 64-bit address size in 64-bit mode.

func:
    movzx    eax, al  ; instead of movzx rax, al
    mov      eax, DWORD [4 * eax + .data]  ; instead of mov rax, QWORD [8 * rax + .data]
    ret

    .data:
         DD .DATA1     ; instead of DQ
         DD .DATA2     ; instead of DQ
         DD .DATA3     ; instead of DQ
         DD .DATA4     ; instead of DQ

.DATA1     DB 'HEY1', 0x00
.DATA2     DB 'HEY2', 0x00
.DATA3     DB 'HEY3', 0x00
.DATA4     DB 'HEY4', 0x00

Is this safe in 64-bit mode? I think addressing like this causes no problems in 64-bit mode. (I do this because of .data.)

I think .data and the address of each item fit in 32-bit registers if the program (executable) is smaller than about 100 MB, which it always is.

Peter Cordes · Accepted Answer

This would be unsafe on x86-64 MacOS, for example, or in a Linux PIE executable. Program size isn't the only factor, because the program isn't loaded starting at virtual address 0. The first byte of your program may be at something like 0x555555555000, so truncating an address to 32 bits would break your code no matter how small your program is.

(You'd get an invalid-relocation linker error from using [.data + rax*4] in that case, though, just from using .data as an absolute disp32; see "32-bit absolute addresses no longer allowed in x86-64 Linux?". But if you'd used [edi + eax*4] with a valid pointer in RDI, you could write code that would assemble but crash in a PIE executable or a MacOS executable.)

But yes, the default non-PIE Linux code model places all code and static data in the low 2GiB of virtual address space so 32-bit absolute sign- or zero-extended numbers can represent addresses.
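For targets where that assumption doesn't hold (PIE, MacOS), one way to keep 4-byte table entries is to store 32-bit offsets relative to the table instead of absolute addresses, and add the table's RIP-relative address back at run time. This is only a sketch in NASM syntax, not something the question's non-PIE code needs; the names `str_offsets` and `get_string` are made up for illustration.

```nasm
default rel                     ; [symbol] without an index register becomes RIP-relative

section .rodata
str1:   db 'HEY1', 0
str2:   db 'HEY2', 0
str3:   db 'HEY3', 0
str4:   db 'HEY4', 0

str_offsets:                    ; hypothetical table name
        dd str1 - str_offsets   ; signed 32-bit distance from the table to each string;
        dd str2 - str_offsets   ; both symbols are in the same section, so these are
        dd str3 - str_offsets   ; assemble-time constants and need no relocation
        dd str4 - str_offsets

section .text
global get_string               ; hypothetical function name
; in:  AL  = index 0..3 (not range-checked)
; out: RAX = full 64-bit pointer to the selected string
get_string:
        movzx   ecx, al                     ; zero-extend the index into RCX
        lea     rdx, [str_offsets]          ; RIP-relative: works at any load address
        movsxd  rax, dword [rdx + rcx*4]    ; load and sign-extend the 32-bit offset
        add     rax, rdx                    ; table address + offset = string address
        ret
```

The cost compared to the absolute version is one extra LEA and one ADD; the table itself stays at 4 bytes per entry.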

Your data in memory is the same size regardless of how you address it, so your alternatives are

 movzx    eax, al
 mov      eax, DWORD [4 * eax + table_of_32bit_pointers]  ; pointless
 mov      eax, DWORD [4 * rax + table_of_32bit_pointers]  ; good

 ; RAX holds a zero-extended pointer.

mov rax, QWORD [8 * rax + .data] would load 8 bytes from a different location. You're still mixing up address size and operand-size.
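To make the difference concrete, here is a small sketch with stand-in values instead of real pointers. The label `ptr_table`, the function name `demo`, and the constants are assumptions for illustration only, and the absolute addressing assumes a non-PIE Linux build.

```nasm
section .rodata
ptr_table:                      ; hypothetical stand-in for the question's table
        dd 0x11111111           ; entry 0
        dd 0x22222222           ; entry 1
        dd 0x33333333           ; entry 2
        dd 0x44444444           ; entry 3

section .text
demo:
        mov     ecx, 1                          ; index 1, zero-extended into RCX

        ; 64-bit address-size, 32-bit operand-size:
        ; loads 4 bytes from ptr_table+4, EAX = 0x22222222
        mov     eax, dword [ptr_table + rcx*4]

        ; 32-bit address-size (costs an extra 0x67 prefix), 32-bit operand-size:
        ; same 4 bytes, as long as the address fits in 32 bits
        mov     eax, dword [ptr_table + ecx*4]

        ; 64-bit address-size, 64-bit operand-size with scale 8:
        ; loads 8 bytes from ptr_table+8, RAX = 0x4444444433333333
        mov     rax, qword [ptr_table + rcx*8]
        ret
```

The register names inside the brackets select the address-size; the destination register (or the DWORD / QWORD keyword) selects the operand-size. Changing one does not require changing the other.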

Using compact 32-bit pointers in memory doesn't mean you have to use 32-bit address size when you load them.

As I explained in your previous question, there's no reason to use 32-bit address-size after zero-extending an index to 64-bit with movzx eax, al. (BTW, prefer movzx ecx, al; mov-elimination only works between different registers.)

BTW, if your strings are all the same length, or you can pad them to fixed length cheaply, you don't need a table of pointers. You can instead just compute the address from the start of the first string + scaled index. e.g. p = .DATA1 + idx*5 in this case, where your strings are 5 bytes long each.

lea  eax, [.DATA1 + RAX + RAX*4]    ; 4+1 = 5
; eax points at the selected 5-byte string buffer

Also, don't use .data as a symbol name. It's the name of a section so that's going to get confusing.
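Putting those suggestions together for the non-PIE Linux case, the function might look something like the following. Treat it as a sketch: the label names `string_table` and `str1`..`str4` are made up, placing the data in `.rodata` is a choice made here rather than something taken from the question, and the build commands in the comment are an assumption about the toolchain.

```nasm
; Assumed build: nasm -felf64 lookup.asm, then link as non-PIE (e.g. gcc -no-pie)

section .rodata
str1:   db 'HEY1', 0
str2:   db 'HEY2', 0
str3:   db 'HEY3', 0
str4:   db 'HEY4', 0

string_table:                   ; hypothetical name, replaces the ".data" label
        dd str1                 ; 32-bit absolute addresses: only valid when
        dd str2                 ; static data is linked into the low 2GiB
        dd str3                 ; (the non-PIE default on Linux)
        dd str4

section .text
global func
; in:  AL  = index 0..3 (not range-checked)
; out: RAX = zero-extended pointer to the selected string
func:
        movzx   ecx, al                             ; different destination register than the source
        mov     eax, dword [string_table + rcx*4]   ; 64-bit address-size, 32-bit load;
                                                    ; writing EAX zero-extends into RAX
        ret
```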
