csteifel

Reputation: 2934

Why is there a relocation needed if calling function in same translation unit

I have two files: one is my library and the other is the main executable. Library:

static int internal1(int a, int b){
  return a + b;
}

namespace {
  int internal2(int a, int b){
    return a + b;
  }
}

void external2(int qq, int zz){

}

void external(int a, int b){
  external2(a, b);
  internal1(a, b);
  internal2(a, b);
}

Compiled with g++ -c -O0 -fPIC -o libtest.o libtest.cpp and g++ -shared -o libtest.so libtest.o

Main prog:

extern void external(int a, int b);

int main(){
  external(1, 2);
  return 0;
}

Compiled with g++ -O0 -L. -ltest -o tester tester.cpp

Now if I dump the relocation information for tester I get what I expect:

Relocation section '.rela.dyn' at offset 0x4d0 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000600a48  000100000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0

Relocation section '.rela.plt' at offset 0x4e8 contains 3 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000600a68  000300000007 R_X86_64_JUMP_SLO 0000000000000000 __libc_start_main + 0
000000600a70  000400000007 R_X86_64_JUMP_SLO 0000000000000000 _Z8externalii + 0
000000600a78  000a00000007 R_X86_64_JUMP_SLO 0000000000400578 __gxx_personality_v0 + 0

external is on the relocation list, since the dynamic linker has to find its address and fill it in.

However, what I don't understand is why external2 shows up on the relocation list when I dump the relocations of the shared object. Why is its address not simply filled in at link time, as it is for the functions with internal linkage?

Relocation section '.rela.dyn' at offset 0x460 contains 5 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
0000002007d8  000000000008 R_X86_64_RELATIVE                    00000000002007d8
000000200990  000200000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
000000200998  000300000006 R_X86_64_GLOB_DAT 0000000000000000 _Jv_RegisterClasses + 0
0000002009a0  000400000006 R_X86_64_GLOB_DAT 0000000000000000 __cxa_finalize + 0
0000002009d0  000500000001 R_X86_64_64       0000000000000000 __gxx_personality_v0 + 0

Relocation section '.rela.plt' at offset 0x4d8 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
0000002009c0  000400000007 R_X86_64_JUMP_SLO 0000000000000000 __cxa_finalize + 0
0000002009c8  000600000007 R_X86_64_JUMP_SLO 0000000000000646 _Z9external2ii + 0

The calls to internal1 and internal2 don't require relocation. Why does external2 being an external symbol mean that the call now has to go through a GOT/PLT lookup, even though the symbol is defined in the same translation unit? Why can't the compiler just emit a normal offset call like it does for the internal functions?

Upvotes: 5

Views: 550

Answers (1)

user4442671

Out of curiosity, let's see what the compiler does when you just compile (not even link) the same code with most optimizations enabled. This often exposes situations where the compiler's hands are tied. Also, compilers can afford to be very sloppy/lazy at -O0, so I avoid reading too much into its output.

Compiling libtest.cpp with -O3 -fPIC -c yields:

external2(int, int):
        ret
external(int, int):
        jmp     external2(int, int)@PLT

See on Godbolt: https://gcc.godbolt.org/z/CnRVEX

This is very interesting: GCC can obviously tell that external2() is a no-op, yet it still emits the call (here a tail call through the PLT) at -O3.

What can we conclude from this? That invoking external2() will not necessarily execute the code in this TU's version of external2(). But how is this possible? The ODR should allow us to assume that any external2() in the same binary is at worst equivalent to the one in this TU.

This is true at the C++ level, but Linux does not load C++ code. It loads ELF files, which play by a different set of rules. One of these rules is that you can use LD_PRELOAD to load symbols before the executable in order to intercept them. Because the symbol is external (default visibility) in PIC code, the toolchain has to treat it as an interposable symbol, i.e. one that may be replaced at load time, which prevents inlining (in my example) as well as direct local jumps (in yours).
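
If the in-library call is meant to bind locally regardless of interposition, one common approach is to route it through a function the dynamic linker can never replace. The following is only a minimal sketch, not the asker's code: external2_impl and helper_hidden are hypothetical names, and the functions return int (rather than void as in the question) purely so the effect is observable. It assumes GCC or Clang on an ELF target.

```cpp
// Sketch: keep the exported symbol interposable, but make calls from inside
// the library go through functions that cannot be interposed.

namespace {
  // Internal linkage: hypothetical helper holding the real implementation.
  // Calls to it are resolved at compile/link time, like internal1/internal2.
  int external2_impl(int qq, int zz) {
    return qq + zz;
  }
}

// Exported wrapper with default visibility. Other modules (or an LD_PRELOAD
// library) can still override this symbol for their own calls.
int external2(int qq, int zz) {
  return external2_impl(qq, zz);
}

// Hypothetical helper with external linkage but hidden ELF visibility:
// usable from other TUs of the same library, not exported from the .so.
__attribute__((visibility("hidden")))
int helper_hidden(int a, int b) {
  return a * b;
}

int external(int a, int b) {
  // Both calls target non-interposable functions, so the compiler is free
  // to emit direct calls (or inline them) instead of going through the PLT.
  return external2_impl(a, b) + helper_hidden(a, b);
}
```

Compiled with -fPIC as in the question, the calls to external2_impl and helper_hidden should need no PLT entry, while external2 itself stays exported and interposable. GCC's -fno-semantic-interposition option and the linker's -Bsymbolic-functions option are alternatives that relax interposition for the whole library rather than per function.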

Upvotes: 5
