Reputation: 367
I built an executable for a simple program, by statically linking libc library, in x86 arch. The relocation table for that executable is empty as expected:
$ readelf -r test

There are no relocations in this file.
$
But when I built an executable for the same program, again statically linking libc, on x86_64, the relocation table is not empty:
$ readelf -r test

Relocation section '.rela.plt' at offset 0x1d8 contains 12 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
0000006c2058  000000000025 R_X86_64_IRELATIV                    000000000042de70
0000006c2050  000000000025 R_X86_64_IRELATIV                    00000000004829d0
0000006c2048  000000000025 R_X86_64_IRELATIV                    000000000042dfe0
0000006c2040  000000000025 R_X86_64_IRELATIV                    000000000040a330
0000006c2038  000000000025 R_X86_64_IRELATIV                    0000000000432520
0000006c2030  000000000025 R_X86_64_IRELATIV                    0000000000409ef0
0000006c2028  000000000025 R_X86_64_IRELATIV                    0000000000445ca0
0000006c2020  000000000025 R_X86_64_IRELATIV                    0000000000437f40
0000006c2018  000000000025 R_X86_64_IRELATIV                    00000000004323b0
0000006c2010  000000000025 R_X86_64_IRELATIV                    0000000000430540
0000006c2008  000000000025 R_X86_64_IRELATIV                    0000000000430210
0000006c2000  000000000025 R_X86_64_IRELATIV                    0000000000432400
$
I googled the relocation type "R_X86_64_IRELATIV" but I couldn't find any info about it. So can someone please tell me what it means?
I thought that debugging the executable with gdb might give me an answer, but it actually raised a lot more questions :) Here is my bit of analysis:
The "Sym. Name + Addend" field in the above table lists the virtual addresses of some libc functions. When I objdump'd the executable 'test', I found that virtual address 0x430210 contains the strcpy function. On loading, the corresponding PLT entry at location 0x6c2008 gets changed from 0x400326 (the virtual address of the next instruction, i.e. setting up the resolver) to 0x443cc0 (the virtual address of a libc function named __strcpy_sse2_unaligned). I don't know why it gets resolved to a different function instead of strcpy. I assume it's a different variant of strcpy.
Having done this analysis, I realized I had missed the basic point up front: how can the dynamic linker come into the picture when loading a static executable? I don't find a .interp section, so the dynamic linker is not involved for sure. Then I observed that a libc function, "__libc_csu_irel()", modifies the PLT entries, and NOT the dynamic linker.
If my analysis makes sense to anyone, please let me know what it's all about. I would be happy to know the reasons behind it.
Thanks a lot!!!
Upvotes: 4
Views: 1794
Reputation: 364180
I don't know why it gets resolved to a different function instead of strcpy. I assume it's a different variant of strcpy.
glibc uses the dynamic linker to select an optimal version of strcpy, strlen, memcpy, etc. for the host CPU at runtime.
The actual strcpy function is the dispatcher / selector that checks CPU features and sets things up so future calls go straight to the best version for your CPU.
I wasn't sure how much of this mechanism still worked with static linking; your output suggests it does.
For strcpy, I think __strcpy_sse2_unaligned is probably still optimal on modern CPUs (if there isn't an AVX2 version). __strcpy_ssse3 uses SSSE3 palignr to do aligned loads and aligned stores, even if the src and dst are misaligned relative to each other. (It has 16 different loops, one for each of the 16 possible relative alignments, because palignr takes the shift count as an immediate, so it's kinda bloated.) It might be good on Core2, but later CPUs with more efficient unaligned loads/stores in hardware are probably best with the __strcpy_sse2_unaligned implementation.
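To make the dispatcher idea concrete, here is a minimal sketch in plain C. It is not glibc's code: the feature bits, the variant functions, and the names select_copy, init_copy, my_copy, and copy_slot are all hypothetical, and every "variant" is the same byte loop, because only the selection step matters here.

```c
/* Hypothetical feature bits for this sketch; not glibc's real flag layout. */
enum {
    FEAT_FAST_UNALIGNED = 1u << 0,
    FEAT_SSSE3          = 1u << 1
};

/* Records which variant ran, so the selection can be observed. */
enum variant { VARIANT_NONE, VARIANT_GENERIC, VARIANT_SSSE3, VARIANT_SSE2_UNALIGNED };
static enum variant last_variant = VARIANT_NONE;

typedef char *(*copy_fn)(char *dst, const char *src);

/* Plain byte loop shared by all variants in this sketch. */
static char *copy_bytes(char *dst, const char *src)
{
    char *p = dst;
    while ((*p++ = *src++) != '\0')
        ;
    return dst;
}

static char *copy_generic(char *dst, const char *src)
{
    last_variant = VARIANT_GENERIC;
    return copy_bytes(dst, src);
}

static char *copy_ssse3(char *dst, const char *src)
{
    last_variant = VARIANT_SSSE3;
    return copy_bytes(dst, src);
}

static char *copy_sse2_unaligned(char *dst, const char *src)
{
    last_variant = VARIANT_SSE2_UNALIGNED;
    return copy_bytes(dst, src);
}

/* Resolver: inspects the feature bits once and returns the chosen variant. */
static copy_fn select_copy(unsigned features)
{
    if (features & FEAT_FAST_UNALIGNED)
        return copy_sse2_unaligned;
    if (features & FEAT_SSSE3)
        return copy_ssse3;
    return copy_generic;
}

/* Plays the role of the patched table entry: filled once, used by every call. */
static copy_fn copy_slot;

static void init_copy(unsigned features)
{
    copy_slot = select_copy(features);
}

static char *my_copy(char *dst, const char *src)
{
    return copy_slot(dst, src);
}
```

After init_copy() has run once, every my_copy() call goes through copy_slot without re-checking the CPU, which is the "future calls go straight to the best version" part.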
Upvotes: 0
Reputation: 1970
You are right. Those relocations just determine which implementation of the libc functions (and not only those) should be used. They are resolved before main is executed, by the function __libc_start_main, which is inserted into the binary at link time.
I will try to explain how this relocation type works.
I am using this code as a reference:
//test.c
#include <stdio.h>
#include <string.h>

int main(void)
{
    char tmp[10];
    char target[10];
    fgets(tmp, 10, stdin);
    strcpy(target, tmp);
}
Compiled with GCC 7.3.1:
gcc -O0 -g -no-pie -fno-pie -o test -static test.c
The shortened output of the relocation table (readelf -r test):
Relocation section '.rela.plt' at offset 0x1d8 contains 21 entries:
Offset Info Type Sym. Value Sym. Name + Addend
...
00000069bfd8 000000000025 R_X86_64_IRELATIV 415fe0
00000069c018 000000000025 R_X86_64_IRELATIV 416060
The shortened output of the section headers (readelf -S test):
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
...
[19] .got.plt PROGBITS 000000000069c000 0009c000
0000000000000020 0000000000000008 WA 0 0 8
...
It says that the .got.plt section is at address 0x69c000.
Every record in the relocation table contains two important pieces of information: the offset and the addend. In other words, the addend is a pointer to a function (also called an indirect function) which takes no arguments and returns a pointer to a function. The returned pointer is stored at the offset given in the relocation record.
Simple relocation resolver implementation:
#include <stdint.h>

void resolve_reloc(uintptr_t *offset, void *(*addend)(void))
{
    // addend is a pointer to the resolver function; store its result at offset
    *offset = (uintptr_t)addend();
}
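Extending the single-record resolver to a whole table could look like the sketch below. The struct irel_record layout, apply_irel_records, and the twice_* demo functions are hypothetical names made up for illustration. This is not glibc's actual data structure or startup code, only the same "call the resolver, store the result" step repeated for each record.

```c
#include <stddef.h>

typedef int (*impl_fn)(int);          /* type of the real implementations */
typedef impl_fn (*resolver_fn)(void); /* resolver: no arguments, returns an implementation */

/* Hypothetical simplified record: the slot to patch and the resolver to call. */
struct irel_record {
    impl_fn *slot;
    resolver_fn resolver;
};

/* Walk the table once, before any slot is used, and patch every slot. */
static void apply_irel_records(const struct irel_record *records, size_t count)
{
    for (size_t i = 0; i < count; i++)
        *records[i].slot = records[i].resolver();
}

/* Two stand-in implementations of the same operation. */
static int twice_add(int x) { return x + x; }
static int twice_mul(int x) { return x * 2; }

/* Stand-in resolver; a real one would test CPU features instead. */
static impl_fn resolve_twice(void)
{
    return sizeof(int) >= 4 ? twice_mul : twice_add;
}

/* Stands in for one table entry that callers jump through. */
static impl_fn twice_slot;

static const struct irel_record demo_records[] = {
    { &twice_slot, resolve_twice },
};
```

Calling apply_irel_records(demo_records, 1) fills twice_slot, after which twice_slot(21) calls whichever implementation the resolver picked.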
Take the example from the start of this answer. The last addend in the relocation table points to address 0x416060, which is the function strcpy_ifunc. See the disassembly output:
0000000000416060 <strcpy_ifunc>:
416060: f6 05 05 8d 28 00 10 testb $0x10,0x288d05(%rip) # 69ed6c <_dl_x86_cpu_features+0x4c>
416067: 75 27 jne 416090 <strcpy_ifunc+0x30>
416069: f6 05 c1 8c 28 00 02 testb $0x2,0x288cc1(%rip) # 69ed31 <_dl_x86_cpu_features+0x11>
416070: 75 0e jne 416080 <strcpy_ifunc+0x20>
416072: 48 c7 c0 70 dd 42 00 mov $0x42dd70,%rax
416079: c3 retq
41607a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
416080: 48 c7 c0 30 df 42 00 mov $0x42df30,%rax
416087: c3 retq
416088: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
41608f: 00
416090: 48 c7 c0 f0 0e 43 00 mov $0x430ef0,%rax
416097: c3 retq
416098: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
41609f: 00
strcpy_ifunc picks the best of all the strcpy implementations and returns a pointer to it. In my case it returns the address 0x430ef0, which is __strcpy_sse2_unaligned. This address is then stored at 0x69c018, which is .got.plt + 0x18.
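The address arithmetic can be checked with a few lines of C. The constants are copied from the readelf -S output shown earlier in this answer (address 0x69c000, size 0x20, entry size 8); the helper names in_got_plt and got_plt_slot_index are made up for this sketch.

```c
#include <stdint.h>

/* Values copied from the readelf -S output for .got.plt. */
#define GOT_PLT_ADDR   UINT64_C(0x69c000)
#define GOT_PLT_SIZE   UINT64_C(0x20)
#define GOT_ENTRY_SIZE UINT64_C(8)

/* Returns 1 if addr lies inside the .got.plt section, 0 otherwise. */
static int in_got_plt(uint64_t addr)
{
    return addr >= GOT_PLT_ADDR && addr - GOT_PLT_ADDR < GOT_PLT_SIZE;
}

/* Index of the 8-byte entry that contains addr (only meaningful if in_got_plt). */
static uint64_t got_plt_slot_index(uint64_t addr)
{
    return (addr - GOT_PLT_ADDR) / GOT_ENTRY_SIZE;
}
```

With these numbers, 0x69c018 is entry 3, the last of the four 8-byte entries in .got.plt, while the other listed offset, 0x69bfd8, lies below the start of the section.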
Usually the first thought about relocations is that all of this is handled by the dynamic interpreter (ld.so). But in this case the program is statically linked and there is no .interp section. Here the relocations are resolved in the function __libc_start_main, which is part of GLIBC. Besides resolving relocations, this function also takes care of passing the command-line arguments to your main and does some other setup.
Once I had figured that out, I had one last question: how does __libc_start_main access the relocation table saved in the ELF file? My first thought was that it somehow opens the running binary for reading and processes it. Of course this is totally wrong. If you look at the program headers of the executable, you will see something like this (readelf -l test):
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x0000000000098451 0x0000000000098451 R E 0x200000
...
The offset in this header is the offset from the first byte of the executable file. So the first entry in the program header table says: copy the first 0x98451 bytes of the test file into memory at 0x400000. But the ELF header is at offset 0x0. So along with the code segment, the ELF headers are also loaded into memory, as is the .rela.plt section at file offset 0x1d8, and __libc_start_main can easily access them.
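As a worked check of that mapping, the sketch below translates a file offset into a virtual address for one LOAD segment. The struct load_segment type and the function name file_offset_to_vaddr are hypothetical; the numbers in the usage note come from the readelf output in this answer.

```c
#include <stdint.h>

/* Hypothetical subset of a LOAD program header: only the fields used here. */
struct load_segment {
    uint64_t offset; /* file offset where the segment starts */
    uint64_t vaddr;  /* virtual address it is mapped at */
    uint64_t filesz; /* number of bytes taken from the file */
};

/* Returns 1 and stores the virtual address if file_off falls inside the
   segment's file image; returns 0 otherwise. */
static int file_offset_to_vaddr(const struct load_segment *seg,
                                uint64_t file_off, uint64_t *vaddr_out)
{
    if (file_off < seg->offset || file_off - seg->offset >= seg->filesz)
        return 0;
    *vaddr_out = seg->vaddr + (file_off - seg->offset);
    return 1;
}
```

For the segment above (offset 0x0, VirtAddr 0x400000, FileSiz 0x98451), file offset 0x1d8 of .rela.plt maps to 0x400000 + 0x1d8 = 0x4001d8, so the table is already in memory when the startup code runs.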
Upvotes: 5
Reputation: 21
You can take a look at the "System V Application Binary Interface AMD64 Architecture Processor Supplement" - I found it under https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
If you go to the relocation section (4.4) you'll find the documentation for this relocation type and also an explanation of the calculation method:
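Name                Value  Field      Calculation
R_X86_64_IRELATIVE  37     wordclass  indirect (B + A)
where B is the base address at which the object is loaded into memory and A is the addend.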
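The value 37 ties back to the readelf output in the question: the Info column there is 0x25, which is 37 in decimal. A small sketch of how the Info field splits into symbol index and type follows. The macro and function names are made up for this example, but the split is the same as the ELF64_R_SYM / ELF64_R_TYPE macros in <elf.h>. The usage note below assumes B = 0, which I believe holds for a statically linked non-PIE executable like the ones in this thread.

```c
#include <stdint.h>

/* Same split as ELF64_R_SYM / ELF64_R_TYPE: high 32 bits are the symbol
   index, low 32 bits are the relocation type. */
#define RELOC_SYM(info)  ((uint32_t)((uint64_t)(info) >> 32))
#define RELOC_TYPE(info) ((uint32_t)((uint64_t)(info) & 0xffffffffu))

/* Value from the ABI table quoted above. */
enum { RELOC_X86_64_IRELATIVE = 37 };

/* "indirect (B + A)": B + A is the address of the resolver function; the
   value stored at the relocation offset is whatever that resolver returns. */
static uint64_t irelative_resolver_address(uint64_t base, uint64_t addend)
{
    return base + addend;
}
```

For Info 0x25 the type is 37 and the symbol index is 0, which is why readelf prints no symbol name, only the addend. With B = 0, the addend 0x416060 is itself the resolver's address.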
Good luck - BTW, thank you for the great post at sploitfun ;-)
Upvotes: 2