I've been looking at some disassembly of some ELF binaries and I noticed this:
0000000000401020 <_start>:
401020: 31 ed xor ebp,ebp
401022: 49 89 d1 mov r9,rdx
401025: 5e pop rsi
401026: 48 89 e2 mov rdx,rsp
401029: 48 83 e4 f0 and rsp,0xfffffffffffffff0
40102d: 50 push rax
40102e: 54 push rsp
40102f: 49 c7 c0 30 13 40 00 mov r8,0x401330
401036: 48 c7 c1 d0 12 40 00 mov rcx,0x4012d0
40103d: 48 c7 c7 72 12 40 00 mov rdi,0x401272
401044: ff 15 a6 2f 00 00 call QWORD PTR [rip+0x2fa6] # 403ff0 <__libc_start_main@GLIBC_2.2.5>
40104a: f4 hlt
40104b: 0f 1f 44 00 00 nop DWORD PTR [rax+rax*1+0x0]
When __libc_start_main gets called, we have those three immediate values passed via registers as parameters. Those are obviously function pointers that get called in __libc_start_main (including main). But these are virtual addresses, and my understanding is that the actual mapped address of the binary when it's loaded into memory and running will not necessarily be the same. So, these function pointers may not reflect their actual location in memory.
I'm more familiar with PE files, where the IMAGE_DIRECTORY_ENTRY_BASERELOC data directory provides IMAGE_BASE_RELOCATION structures that are used to adjust these constant values to reflect the new image base. But I don't see any equivalent of that for ELF files. Am I missing something here? How do these addresses get fixed when an ELF file is loaded?
"and my understanding is that the actual mapped address of the binary when it's loaded into memory and running will not necessarily be the same."
Nope. From those addresses we can see that this is a non-PIE ELF executable linked at ld's default base address. This is a position-dependent executable.
The executable itself will always be loaded at a fixed virtual address, so static addresses can be put into registers using 32-bit immediates instead of RIP-relative LEA. ASLR for the executable itself is not allowed / possible.
libc is an ELF "shared object" that can be ASLRed, hence the call to __libc_start_main via a pointer in the GOT. In the glibc source for this CRT start code, this probably looks like call *__libc_start_main@GOTPCREL(%rip) (AT&T syntax).
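As a quick sanity check of that RIP-relative call, the GOT slot address can be recomputed from the numbers in the question's disassembly. This is only a sketch of the arithmetic; the variable names are my own.

```python
# Values copied from the disassembly of _start in the question.
call_addr = 0x401044   # address of `call QWORD PTR [rip+0x2fa6]`
call_len = 6           # ff 15 plus a 4-byte displacement
disp32 = 0x2FA6        # displacement encoded in the instruction

# RIP-relative addressing is relative to the address of the *next* instruction.
next_insn = call_addr + call_len
got_slot = next_insn + disp32

print(hex(next_insn))  # 0x40104a, the hlt that follows the call
print(hex(got_slot))   # 0x403ff0, matching the comment in the disassembly
```

The call loads its target from that 8-byte slot, which the dynamic linker fills in with the run-time address of __libc_start_main.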
And BTW, we can tell this was hand-written asm, from the missed optimization of using 7-byte mov rdi, sign_extended_imm32 (same size as RIP-relative LEA) instead of 5-byte mov edi, imm32. The default non-PIE code-model in the x86-64 System V ABI puts all static code/data in the low 2GiB of virtual address space, so static addresses can be used with zero- or sign-extension to 64-bit.
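To make the size comparison concrete, here is a rough Python sketch that builds the three encodings by hand for the address of main from the question (0x401272). The LEA is hypothetical: I'm assuming it would sit at 0x40103d, where the mov rdi currently is, just to have a displacement to encode.

```python
import struct

addr = 0x401272  # address of main, from the question's disassembly

# mov rdi, sign_extended_imm32: REX.W, opcode C7 /0, ModRM 0xC7 selects rdi
mov_rdi_simm32 = bytes([0x48, 0xC7, 0xC7]) + struct.pack("<i", addr)

# mov edi, imm32: opcode B8+r with r = 7 (edi); writing edi zero-extends into rdi
mov_edi_imm32 = bytes([0xBF]) + struct.pack("<I", addr)

# lea rdi, [rip+rel32]: REX.W, opcode 8D, ModRM 0x3D (rdi, RIP-relative)
# Assumption: the instruction is placed at 0x40103d and is 7 bytes long.
lea_addr = 0x40103D
rel32 = addr - (lea_addr + 7)
lea_rdi_rip = bytes([0x48, 0x8D, 0x3D]) + struct.pack("<i", rel32)

print(mov_rdi_simm32.hex(" "))  # 48 c7 c7 72 12 40 00
print(mov_edi_imm32.hex(" "))   # bf 72 12 40 00
print(lea_rdi_rip.hex(" "))     # 48 8d 3d 2e 02 00 00

# The two mov forms are only usable because the address fits in 31 bits,
# so zero-extension and sign-extension give the same 64-bit value.
assert 0 <= addr < 2**31
```

The first byte string matches the bytes at 40103d in the question, and the 5-byte form saves two bytes per address.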
ELF "executables" that can be loaded at a randomized base address are called PIE (Position Independent Executable). In terms of ELF details, they use the same ELF "type" as shared libraries, so they are in fact ELF shared objects that have an "entry point" and are marked as executable.
Modern Linux distros have gcc defaulting to building PIEs. See 32-bit absolute addresses no longer allowed in x86-64 Linux? (relocatable ELF shared objects can be relocated anywhere in the address space, not restricted to the low 2GiB, so there's no relocation-type for runtime fixups of 32-bit absolute addresses.)
There is a relocation type for 64-bit absolute addresses, so jump tables (of function/code pointers) are still possible, and so is 10-byte mov rdi, imm64. But that's less efficient than a RIP-relative LEA, even before counting the cost of the ELF program loader or dynamic linker having to modify the program text to apply these relocations.
e.g. readelf -a /bin/ls
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x5ae0
...
Note the Type field: DYN, the same as for an actual library, e.g. readelf -a /lib/libc.so.6. And the entry point is a relative address, relative to the base address the file is mapped at.
A non-PIE executable (e.g. statically linked, or built with -fno-pie -no-pie) looks like this:
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x401000
Note the Type: EXEC and the absolute entry point (chosen at link-time by ld).
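If you'd rather check this programmatically than eyeball readelf output, the Type and entry point sit at fixed offsets in the ELF64 header. Below is a minimal sketch that only handles little-endian ELF64; elf_type_and_entry and fake_header are names I made up, and fake_header just synthesizes headers with the values from the two readelf dumps above so the example runs without any particular binary.

```python
import struct

ET_NAMES = {2: "EXEC", 3: "DYN"}  # e_type values defined by the ELF specification

def elf_type_and_entry(header: bytes):
    """Return (type name, entry point) from the first 32 bytes of an ELF64 file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if header[4] != 2 or header[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")
    # After the 16-byte e_ident: e_type (u16), e_machine (u16),
    # e_version (u32), e_entry (u64).
    e_type, _machine, _version, e_entry = struct.unpack_from("<HHIQ", header, 16)
    return ET_NAMES.get(e_type, hex(e_type)), e_entry

def fake_header(e_type: int, e_entry: int) -> bytes:
    # Hypothetical helper: just enough of a header for the demo (62 = EM_X86_64).
    ident = bytes.fromhex("7f454c46020101000000000000000000")
    return ident + struct.pack("<HHIQ", e_type, 62, 1, e_entry)

for hdr in (fake_header(3, 0x5AE0), fake_header(2, 0x401000)):
    kind, entry = elf_type_and_entry(hdr)
    print(kind, hex(entry))  # DYN 0x5ae0, then EXEC 0x401000
```

On a real file you would pass the first 32 bytes read in binary mode, e.g. open("/bin/ls", "rb").read(32). For DYN, the printed entry point is an offset from the load base; for EXEC, it is the absolute address.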