Reputation: 4411
Why are some relocation entries in an ELF file symbol name + addend while others are section + addend? I am looking to clear up some confusion and gain a deeper understanding of ELFs. Below is my investigation.
I have a very simple C file, test.c:
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
static void func1(void)
{
    fprintf(stdout, "Inside func1\n");
}
// ... a couple other simple *static* functions
int main(void)
{
    func1();
    // ... call some other functions
    exit(EXIT_SUCCESS);
}
I then compile this into an object file with:
clang -O0 -Wall -g -c test.c -o test.o
If I look at the relocations with readelf -r test.o, I see that the entries referring to my static functions look like this (this one is picked from the .rela.debug_info section):
Offset            Info              Type         Symbol's Value    Symbol's Name + Addend
...
000000000000006f  0000000400000001  R_X86_64_64  0000000000000000  .text + b0
...
Why are these functions referred to as section + addend rather than symbol name + addend? I see entries for the functions in the .symtab using readelf -s test.o:
Num: Value             Size  Type  Bind   Vis      Ndx  Name
...
  2: 00000000000000b0    31  FUNC  LOCAL  DEFAULT    2  func1
...
Additionally, when I disassemble the object file (via objdump -d), I see that the functions are there and were not inlined into main or otherwise optimized away.
If I don't make the functions static and then look at the relocations, I see the same entries as before for type R_X86_64_64, but I also see entries of type R_X86_64_PC32 that use the symbol name plus an addend. For example, in .rela.text:
Offset            Info              Type           Symbol's Value    Symbol's Name + Addend
...
00000000000000fe  0000001200000002  R_X86_64_PC32  0000000000000000  func1 + 1c
...
Please let me know if more examples/readelf output would be helpful. Thank you for taking the time to read this.
Upvotes: 3
Views: 864
Reputation: 4411
Eli Bendersky also covers this in one of his blog posts. From the section titled "Extra credit: Why was the call relocation needed?":
In short, however, when ml_util_func is global, it may be overridden in the executable or another shared library, so when linking our shared library, the linker can't just assume the offset is known and hard-code it [12]. It makes all references to global symbols relocatable in order to allow the dynamic loader to decide how to resolve them. This is why declaring the function static makes a difference - since it's no longer global or exported, the linker can hard-code its offset in the code.
The full post should be read to get complete context, but I thought I would share it here as it presents better examples than in my question and reinforces the solution that Employed Russian gave.
Upvotes: 0
Reputation: 213957
Why are these functions referred to as section + addend rather than symbol name + addend?
The function names for static functions are not guaranteed to be present at link time. You could remove them with e.g. objcopy --strip-unneeded or objcopy --strip-symbol, and the result will still link.
I see entries for the functions in the .symtab using readelf -s test.o
I believe the only reason they are kept is to help debugging, and they are not used by the linker at all. But I have not verified this by looking at linker source, and so did not answer this related question.
Upvotes: 2