peachykeen

Reputation: 4411

Why are some relocations .text + addend instead of symbol's name + addend?

Why are some relocation entries in an ELF file symbol name + addend while others are section + addend? I am looking to clear up some confusion and gain a deeper understanding of ELFs. Below is my investigation.

I have a very simple C file, test.c:

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

static void func1(void)
{
    fprintf(stdout, "Inside func1\n");
}

// ... a couple other simple *static* functions

int main (void)
{
    func1();

    // ... call some other functions

    exit(EXIT_SUCCESS);
}

I then compile this into an object file with:

clang -O0 -Wall -g -c test.c -o test.o

If I look at the relocations with readelf -r test.o, I see that the entries referring to my static functions look like this (this one is taken from the .rela.debug_info section):

Offset            Info              Type         Symbol's Value    Symbol's Name + Addend
...
000000000000006f  0000000400000001  R_X86_64_64  0000000000000000 .text + b0
...

Why are these functions referred to as section + addend rather than symbol name + addend? I see entries for the functions in the .symtab using readelf -s test.o:

Num: Value            Size Type Bind  Vis     Ndx Name
  ...
  2: 00000000000000b0 31   FUNC LOCAL DEFAULT 2   func1
  ...

Additionally, when I disassemble the object file (via objdump -d), I see that the functions are there and weren't optimized into main or anything.

If I don't make the functions static and then look at the relocations, I see the same as before when the type is R_X86_64_64, but I also see entries that use the symbol name plus an addend with type R_X86_64_PC32. So for example in .rela.text:

Offset            Info              Type           Symbol's Value    Symbol's Name + Addend
...
00000000000000fe  0000001200000002  R_X86_64_PC32  0000000000000000  func1 + 1c
...
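For reference, the Info column in both tables packs two fields: the upper 32 bits are the symbol table index and the lower 32 bits are the relocation type. Here is a minimal sketch of that decoding, applied to the two Info values shown above. The helper names (reloc_sym_index, reloc_type, reloc_type_name, print_reloc_info) are made up for illustration; on Linux the ELF64_R_SYM and ELF64_R_TYPE macros from <elf.h> do the same split.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Upper 32 bits of r_info: index into .symtab (what ELF64_R_SYM extracts). */
uint32_t reloc_sym_index(uint64_t info)
{
    return (uint32_t)(info >> 32);
}

/* Lower 32 bits of r_info: relocation type (what ELF64_R_TYPE extracts). */
uint32_t reloc_type(uint64_t info)
{
    return (uint32_t)(info & 0xffffffffu);
}

/* Names for the two x86-64 relocation types that appear in the question. */
const char *reloc_type_name(uint32_t type)
{
    switch (type) {
    case 1:  return "R_X86_64_64";
    case 2:  return "R_X86_64_PC32";
    default: return "(other)";
    }
}

/* Hypothetical helper: print one Info value split into its two fields. */
void print_reloc_info(uint64_t info)
{
    uint32_t type = reloc_type(info);

    /* Only the two types from the question are expected here. */
    assert(type == 1 || type == 2);
    printf("info=%016llx  sym index=%u  type=%s\n",
           (unsigned long long)info,
           (unsigned)reloc_sym_index(info),
           reloc_type_name(type));
}
```

Calling print_reloc_info(0x0000000400000001ULL) reports symbol index 4 with type R_X86_64_64, and print_reloc_info(0x0000001200000002ULL) reports symbol index 18 with type R_X86_64_PC32. Looking up those indices in the readelf -s output shows which symbol (section or function) each relocation actually refers to.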

Please let me know if more examples/readelf output would be helpful. Thank you for taking the time to read this.

Upvotes: 3

Views: 864

Answers (2)

peachykeen

Reputation: 4411

Eli Bendersky also covers this in one of his blog posts. From the section titled "Extra credit: Why was the call relocation needed?":

In short, however, when ml_util_func is global, it may be overridden in the executable or another shared library, so when linking our shared library, the linker can't just assume the offset is known and hard-code it [12]. It makes all references to global symbols relocatable in order to allow the dynamic loader to decide how to resolve them. This is why declaring the function static makes a difference - since it's no longer global or exported, the linker can hard-code its offset in the code.

The full post should be read to get complete context, but I thought I would share it here as it presents better examples than in my question and reinforces the solution that Employed Russian gave.

Upvotes: 0

Employed Russian

Reputation: 213957

Why are these functions referred to as section + addend rather than symbol name + addend?

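The function names for static functions are not guaranteed to be present at link time, so the relocations cannot depend on them; they reference the section symbol plus an offset instead. You could remove the names with e.g. objcopy --strip-unneeded or objcopy --strip-symbol, and the result will still link.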
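To see why nothing is lost by dropping the name, here is a rough sketch of the arithmetic, using the numbers from the question (func1 at offset 0xb0 in .text). R_X86_64_64 is computed as S + A and R_X86_64_PC32 as S + A - P, where S is the symbol's final address, A the addend and P the address being patched. The function names and the load address used in the usage note are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* R_X86_64_64: S + A (symbol address plus addend). */
uint64_t reloc_abs64(uint64_t s, int64_t a)
{
    return s + (uint64_t)a;
}

/* R_X86_64_PC32: S + A - P, truncated to 32 bits (P = patched address). */
int32_t reloc_pc32(uint64_t s, int64_t a, uint64_t p)
{
    return (int32_t)(uint32_t)(s + (uint64_t)a - p);
}

/*
 * Compare the two ways of naming func1 for a given final address of .text:
 *   ".text + b0": S is the start of .text, A is 0xb0
 *   "func1 + 0":  S is func1's address (.text start + 0xb0), A is 0
 * Returns 1 when both forms produce the same relocated value.
 */
int section_and_symbol_forms_agree(uint64_t text_base)
{
    const uint64_t func1_offset = 0xb0;  /* func1's Value in the .symtab output */

    uint64_t via_section = reloc_abs64(text_base, (int64_t)func1_offset);
    uint64_t via_symbol  = reloc_abs64(text_base + func1_offset, 0);

    assert(via_section == via_symbol);
    return via_section == via_symbol;
}
```

For example, with .text assumed to end up at 0x401000, both forms resolve to 0x4010b0, so the linker only needs the section symbol and the addend to patch the reference correctly.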

I see entries for the functions in the .symtab using readelf -s test.o

I believe the only reason they are kept is to help debugging, and they are not used by the linker at all. But I have not verified this by looking at linker source, and so did not answer this related question.

Upvotes: 2
