Bala

Reputation: 367

What does R_X86_64_IRELATIV mean?

I built an executable for a simple program on x86, statically linking libc. The relocation table for that executable is empty, as expected:

$ readelf -r test
There are no relocations in this file.
$ 

But when I built an executable for the same program on x86_64, again statically linking libc, the relocation table is not empty:

$ readelf -r test

Relocation section '.rela.plt' at offset 0x1d8 contains 12 entries:

  Offset          Info           Type           Sym. Value    Sym. Name + Addend
0000006c2058  000000000025 R_X86_64_IRELATIV                    000000000042de70
0000006c2050  000000000025 R_X86_64_IRELATIV                    00000000004829d0
0000006c2048  000000000025 R_X86_64_IRELATIV                    000000000042dfe0
0000006c2040  000000000025 R_X86_64_IRELATIV                    000000000040a330
0000006c2038  000000000025 R_X86_64_IRELATIV                    0000000000432520
0000006c2030  000000000025 R_X86_64_IRELATIV                    0000000000409ef0
0000006c2028  000000000025 R_X86_64_IRELATIV                    0000000000445ca0
0000006c2020  000000000025 R_X86_64_IRELATIV                    0000000000437f40
0000006c2018  000000000025 R_X86_64_IRELATIV                    00000000004323b0
0000006c2010  000000000025 R_X86_64_IRELATIV                    0000000000430540
0000006c2008  000000000025 R_X86_64_IRELATIV                    0000000000430210
0000006c2000  000000000025 R_X86_64_IRELATIV                    0000000000432400
$

I googled the relocation type "R_X86_64_IRELATIV" but couldn't find any info about it. Can someone please tell me what it means?

I thought that debugging the executable with gdb might give me an answer. Instead it brought up a lot of questions :) Here is my bit of analysis:

The "Sym. Name + Addend" field in the table above lists the virtual addresses of some libc functions. When I ran objdump on the executable 'test', I found that virtual address 0x430210 contains the strcpy function. On loading, however, the corresponding PLT entry at location 0x6c2008 changes from 0x400326 (the virtual address of the next instruction, i.e. the one setting up the resolver) to 0x443cc0 (the virtual address of a libc function named __strcpy_sse2_unaligned). I don't know why it gets resolved to a different function instead of strcpy. I assume it's a different variant of strcpy.

Having done this analysis, I realized I had missed the basic point up front: how can the dynamic linker come into the picture when loading a static executable? I can't find a .interp section, so the dynamic linker is definitely not involved. Then I observed that a libc function, "__libc_csu_irel()", modifies the PLT entries, NOT the dynamic linker.

If my analysis makes sense to anyone, please let me know what it's all about. I would be happy to know the reasons behind it.

Thanks a lot!!!

Upvotes: 4

Views: 1794

Answers (3)

Peter Cordes

Reputation: 364180

I don't know why it gets resolved to a different function instead of strcpy. I assume it's a different variant of strcpy.

glibc uses the dynamic linker to select an optimal version of strcpy, strlen, memcpy, etc. for the host CPU at runtime.

The actual strcpy function is the dispatcher / selector that checks CPU features and sets things up so future calls go straight to the best version for your CPU. I wasn't sure how much of this mechanism still worked with static linking, but your output suggests it does.

For strcpy, I think __strcpy_sse2_unaligned is probably still optimal on modern CPUs (if there isn't an AVX2 version).

__strcpy_ssse3 uses SSSE3 palignr to do aligned loads and aligned stores, even if the src and dst are misaligned relative to each other. (It has 16 different loops, for all 16 possible relative alignments because palignr takes the shift count as an immediate, so it's kinda bloated.) It might be good on Core2, but later CPUs with more efficient unaligned loads/stores in hardware are probably best with the __strcpy_sse2_unaligned implementation.

Upvotes: 0

Qeek

Reputation: 1970

TL;DR

You are right. Those relocations are just there to find out which implementation of certain functions (not only libc ones) should be used. They are resolved before main runs, by the function __libc_start_main, which is linked into the binary.


I will try to explain how this relocation type works.

The example

I am using this code as reference

//test.c
#include <stdio.h>
#include <string.h>

int main(void)
{
    char tmp[10];
    char target[10];
    fgets(tmp, 10, stdin);
    strcpy(target, tmp);
}

compiled with GCC 7.3.1

gcc -O0 -g -no-pie -fno-pie -o test -static test.c

Shortened output of the relocation table (readelf -r test):

Relocation section '.rela.plt' at offset 0x1d8 contains 21 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
...
00000069bfd8  000000000025 R_X86_64_IRELATIV                    415fe0
00000069c018  000000000025 R_X86_64_IRELATIV                    416060

Shortened output of the section headers (readelf -S test):

[Nr] Name              Type             Address           Offset
     Size              EntSize          Flags  Link  Info  Align
...
[19] .got.plt          PROGBITS         000000000069c000  0009c000
     0000000000000020  0000000000000008  WA       0     0     8
...

It says that the .got.plt section is at address 0x69c000.

How the R_X86_64_IRELATIVE relocation is resolved

Every record in the relocation table contains two important pieces of information: the offset and the addend. In other words, the addend is a pointer to a function (also called an indirect function) which takes no arguments and returns a pointer to a function. The returned pointer is stored at the offset given in the relocation record.

Simple relocation resolver implementation:

#include <stdint.h>

void resolve_reloc(uintptr_t *offset, void *(*addend)(void))
{
    // addend is a pointer to the resolver function; store the address it returns
    *offset = (uintptr_t)addend();
}
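To show how that resolver step might be applied over a whole table, here is a minimal, portable sketch. It is not glibc's code: the struct irel_record type, the two add_* functions, the use_alt flag and got_slot are all hypothetical stand-ins I made up for a relocation record, two implementation variants, a CPU feature check and one .got.plt entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one IRELATIVE record: where to store the
   result (the "offset") and which resolver to call (the "addend"). */
struct irel_record {
    uintptr_t *slot;
    uintptr_t (*resolver)(void);
};

typedef int (*add_fn)(int, int);

/* Two hypothetical variants of the same function. */
static int add_generic(int a, int b) { return a + b; }
static int add_alt(int a, int b)     { return b + a; }

/* Hypothetical "CPU feature" flag; a real resolver would inspect CPU features. */
static int use_alt = 0;

/* The indirect function: takes no arguments, returns the chosen address. */
static uintptr_t add_resolver(void)
{
    return (uintptr_t)(use_alt ? add_alt : add_generic);
}

/* Plays the role of one .got.plt entry. */
static uintptr_t got_slot;

static const struct irel_record irel_table[] = {
    { &got_slot, add_resolver },
};

/* Walk the table once and store each resolver's result in its slot. */
static void apply_irel(const struct irel_record *begin,
                       const struct irel_record *end)
{
    for (const struct irel_record *r = begin; r != end; ++r) {
        assert(r->resolver != NULL);
        *r->slot = r->resolver();
    }
}

/* After the table is applied, calls go through the slot to the chosen variant. */
int call_add_through_slot(int a, int b)
{
    apply_irel(irel_table, irel_table + 1);
    return ((add_fn)got_slot)(a, b);
}
```

In a real binary the table would be applied once at startup rather than on every call; it is done inside call_add_through_slot here only to keep the sketch short.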

Back to the example from the start of this answer: the last addend in the relocation table points to address 0x416060, which is the function strcpy_ifunc. See the disassembly:

0000000000416060 <strcpy_ifunc>:
  416060:       f6 05 05 8d 28 00 10    testb  $0x10,0x288d05(%rip)        # 69ed6c <_dl_x86_cpu_features+0x4c>
  416067:       75 27                   jne    416090 <strcpy_ifunc+0x30>
  416069:       f6 05 c1 8c 28 00 02    testb  $0x2,0x288cc1(%rip)        # 69ed31 <_dl_x86_cpu_features+0x11>
  416070:       75 0e                   jne    416080 <strcpy_ifunc+0x20>
  416072:       48 c7 c0 70 dd 42 00    mov    $0x42dd70,%rax
  416079:       c3                      retq   
  41607a:       66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
  416080:       48 c7 c0 30 df 42 00    mov    $0x42df30,%rax
  416087:       c3                      retq   
  416088:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
  41608f:       00 
  416090:       48 c7 c0 f0 0e 43 00    mov    $0x430ef0,%rax
  416097:       c3                      retq   
  416098:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
  41609f:       00 

The strcpy_ifunc function picks the best of all the strcpy implementations and returns a pointer to it. In my case it returns address 0x430ef0, which is __strcpy_sse2_unaligned. This address is then stored at 0x69c018, which is .got.plt + 0x18.
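Read as C, the control flow of that disassembly would look roughly like the sketch below. The function name and the two parameters are my own hypothetical stand-ins for the two bytes of _dl_x86_cpu_features that the code tests. Only the 0x430ef0 target is named in this answer, so the other two addresses are left unnamed.

```c
#include <stdint.h>

/* Sketch of strcpy_ifunc's decision logic, taken from the disassembly.
   byte_0x4c and byte_0x11 are hypothetical names for the bytes at
   _dl_x86_cpu_features+0x4c and _dl_x86_cpu_features+0x11. */
uintptr_t strcpy_ifunc_sketch(uint8_t byte_0x4c, uint8_t byte_0x11)
{
    if (byte_0x4c & 0x10)   /* testb $0x10, ...+0x4c ; jne 416090 */
        return 0x430ef0;    /* __strcpy_sse2_unaligned */
    if (byte_0x11 & 0x02)   /* testb $0x2, ...+0x11 ; jne 416080 */
        return 0x42df30;    /* another strcpy variant */
    return 0x42dd70;        /* fallback variant */
}
```

Note that the first test wins even if the second bit is also set, which matches the order of the branches in the disassembly.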

Who resolves it, and when

The usual first thought with relocations is that the dynamic linker (ld.so) handles all of this. But here the program is statically linked and has no .interp section. In this case the relocations are resolved in the function __libc_start_main, which is part of glibc. Besides resolving relocations, this function also takes care of passing the command-line arguments to your main and does some other setup.

Access to the relocation table

Once I had figured that out, I had one last question: how does __libc_start_main access the relocation table stored in the ELF file? My first thought was that it somehow opens the running binary for reading and processes it. Of course this is totally wrong. If you look at the program headers of the executable you will see something like this (readelf -l test):

Type           Offset             VirtAddr           PhysAddr
               FileSiz            MemSiz              Flags  Align
LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
               0x0000000000098451 0x0000000000098451  R E    0x200000
...

The offset in this header is measured from the first byte of the executable file. So the first entry in the program header table says: copy the first 0x98451 bytes of the test file into memory. But offset 0x0 is where the ELF header sits. So along with the code segment, the ELF headers are also loaded into memory, and __libc_start_main can easily access them.
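As a rough check of that mapping, the arithmetic can be sketched in a few lines of C. The struct and function names are hypothetical; the numbers are the ones from the LOAD entry above and from the readelf -r output, which reports .rela.plt at file offset 0x1d8.

```c
#include <stdint.h>

/* Hypothetical stand-in for the fields of one LOAD program header. */
struct load_segment_sketch {
    uint64_t p_offset;  /* offset of the segment in the file */
    uint64_t p_vaddr;   /* virtual address it is mapped at */
    uint64_t p_filesz;  /* number of bytes taken from the file */
};

/* Translate a file offset to its virtual address, or return 0 when the
   offset is not covered by this segment. */
uint64_t file_offset_to_vaddr(const struct load_segment_sketch *seg,
                              uint64_t file_off)
{
    if (file_off < seg->p_offset ||
        file_off - seg->p_offset >= seg->p_filesz)
        return 0;
    return seg->p_vaddr + (file_off - seg->p_offset);
}

/* The first LOAD entry from the readelf -l output above. */
static const struct load_segment_sketch first_load = {
    0x0, 0x400000, 0x98451
};
```

With these values, file offset 0x1d8 lies inside the first LOAD segment and translates to 0x4001d8, so the relocation records are already in memory when the program starts.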

Upvotes: 5

Omer Boehm

Reputation: 21

You can take a look at the "System V Application Binary Interface AMD64 Architecture Processor Supplement" - I found it under https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf

If you go to the relocation section (4.4) you'll find the documentation for this relocation type and also an explanation of the calculation method:

R_X86_64_IRELATIVE 37 wordclass indirect (B + A)

where

  • wordclass specifies word64 for LP64 and specifies word32 for ILP32.
  • A Represents the addend used to compute the value of the relocatable field.
  • B Represents the base address at which a shared object has been loaded into memory during execution. Generally, a shared object is built with a 0 base virtual address, but the execution address will be different.
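To connect this with the readelf output, here is a small sketch that decodes the Info field and applies the B + A rule to the last record shown in the answer above (offset 0x69c018, info 0x25, addend 0x416060). The SKETCH_* macros repeat the definitions of ELF64_R_SYM and ELF64_R_TYPE from <elf.h> so the snippet does not depend on that header, and the struct and function names are hypothetical. Using B = 0 is my assumption for an executable linked at a fixed address (-no-pie).

```c
#include <stdint.h>

/* Same bit layout as ELF64_R_SYM / ELF64_R_TYPE in <elf.h>. */
#define SKETCH_R_SYM(info)   ((uint64_t)(info) >> 32)
#define SKETCH_R_TYPE(info)  ((uint64_t)(info) & 0xffffffffu)
#define SKETCH_R_X86_64_IRELATIVE 37u

/* Hypothetical stand-in for an Elf64_Rela record. */
struct rela_sketch {
    uint64_t r_offset;
    uint64_t r_info;
    int64_t  r_addend;
};

/* "indirect (B + A)": B + A is the address of the resolver to call;
   the resolver's return value is what gets stored at r_offset. */
uint64_t irelative_resolver_address(uint64_t base, const struct rela_sketch *r)
{
    return base + (uint64_t)r->r_addend;
}

/* Last record from the readelf -r output in the answer above. */
static const struct rela_sketch strcpy_record = {
    0x69c018, 0x25, 0x416060
};
```

Info 0x25 decodes to type 37 with symbol index 0, which is why readelf prints no symbol name, and with B = 0 the resolver address is simply the addend 0x416060.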

Good luck - BTW thank you for the great post at sploitfun ;-)

Upvotes: 2
