drcxd

Reputation: 65

How does the linker resolve references to data objects in shared libraries at link time?

I am learning about linking and ran into a small question that I cannot answer.

Consider the following files:

main.c

#include "other.h"
extern int i;
int main() {
  ++i;
  inci();
  return 0;
}

other.c

int i = 0;
void inci() {
  ++i;
}

Then I compile these two files:

gcc -c main.c
gcc -shared -fpic other.c -o libother.so
gcc -o main main.o ./libother.so

Here is part of the disassembly of main.o:

   f:   8b 05 00 00 00 00       mov    0x0(%rip),%eax        # 15 <main+0x15>
  15:   83 c0 01                add    $0x1,%eax
  18:   89 05 00 00 00 00       mov    %eax,0x0(%rip)        # 1e <main+0x1e>
  1e:   b8 00 00 00 00          mov    $0x0,%eax
  23:   e8 00 00 00 00          call   28 <main+0x28>

Here is part of the disassembly of main:

    1148:   8b 05 ca 2e 00 00       mov    0x2eca(%rip),%eax        # 4018 <i@@Base>
    114e:   83 c0 01                add    $0x1,%eax
    1151:   89 05 c1 2e 00 00       mov    %eax,0x2ec1(%rip)        # 4018 <i@@Base>
    1157:   b8 00 00 00 00          mov    $0x0,%eax
    115c:   e8 cf fe ff ff          call   1030 <inci@plt>

In both listings, the first three instructions correspond to the C statement:

++i;

According to the assembly, it seems that the linker has already decided the run-time address of i, because it is using a PC-relative address to reference it directly, rather than using GOT. However, as far as I know, the shared library is only loaded into memory when the program that uses it loads. Thus, the executable main should have no knowledge of the address of i at link time. How, then, does the linker determine that i is located at 0x4018?

Also, what does the comment i@@Base mean?

Answers (1)

Employed Russian

Reputation: 213955

According to the assembly, it seems that the linker has already decided the run-time address of i, because it is using a PC-relative address to reference it directly, rather than using GOT.

Correct.
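
The PC-relative arithmetic can be checked by hand against the disassembly of main in the question. The sketch below is only an illustration: rip_target is a hypothetical helper name, and the instruction addresses, lengths, and displacements are read off that listing. On x86-64, a RIP-relative operand is resolved against the address of the next instruction, which is why the instruction length enters the sum.

```c
#include <stdint.h>

/* Target of a RIP-relative operand: address of the NEXT instruction
 * (current address + instruction length) plus the signed 32-bit
 * displacement encoded in the instruction.
 * rip_target is a hypothetical helper, not part of any toolchain. */
static uint64_t rip_target(uint64_t insn_addr, unsigned insn_len, int32_t disp)
{
    uint64_t next_insn = insn_addr + insn_len;
    /* Sign-extend the displacement, then add with 64-bit wrap-around. */
    return next_insn + (uint64_t)(int64_t)disp;
}
```

With the values from the listing, rip_target(0x1148, 6, 0x2eca) and rip_target(0x1151, 6, 0x2ec1) both give 0x4018, and rip_target(0x115c, 5, -0x131) gives 0x1030, the inci@plt entry (the call's encoded displacement 0xfffffecf is -0x131 as a signed 32-bit value).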

However, as far as I know, the shared library is only loaded into memory when the program that uses it loads.

Correct, except that the copy of i inside the shared library is never used, so its address doesn't matter.

What happens here is described pretty well in the Solaris documentation:

Suppose the link-editor is used to create a dynamic executable, and a reference to a data item is found to reside in one of the dependent shared objects. Space is allocated in the dynamic executable's .bss, equivalent in size to the data item found in the shared object. This space is also assigned the same symbolic name as defined in the shared object. Along with this data allocation, the link-editor generates a special copy relocation record that instructs the runtime linker to copy the data from the shared object to the allocated space within the dynamic executable.

Because the symbol assigned to this space is global, it is used to satisfy any references from any shared objects. The dynamic executable inherits the data item. Any other objects within the process that make reference to this item are bound to this copy. The original data from which the copy is made effectively becomes unused.
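
The mechanism described above can be mimicked in a single C file. This is a toy model, not real linker code: struct copy_reloc, apply_copy_reloc, lib_i, exe_i, bound_i, inci_model, and run_model are all hypothetical names, and the initial value 7 (instead of the 0 used in other.c) is an assumption chosen only so that the copy is visible.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for one copy relocation record (NOT a real ELF structure). */
struct copy_reloc {
    void       *dst;   /* space reserved in the executable's .bss */
    const void *src;   /* initialized data inside the shared object */
    size_t      size;  /* size of the data item */
};

/* What the runtime linker does for a copy relocation at startup. */
static void apply_copy_reloc(const struct copy_reloc *r)
{
    memcpy(r->dst, r->src, r->size);
}

static int lib_i = 7;          /* the library's original definition of i (assumed value) */
static int exe_i;              /* slot reserved for i in the executable */
static int *bound_i = &exe_i;  /* where every reference to i binds after symbol resolution */

/* Models the library's inci(), which reaches i indirectly (through its GOT). */
static void inci_model(void) { ++*bound_i; }

/* Models process startup followed by the body of main(). */
static int run_model(void)
{
    struct copy_reloc r = { &exe_i, &lib_i, sizeof exe_i };
    apply_copy_reloc(&r);  /* startup: copy the initial value into the executable */
    ++exe_i;               /* main's ++i: direct access to its own copy */
    inci_model();          /* library's ++i: lands on the same copy */
    return exe_i;
}
```

In this model run_model() returns 9, while lib_i stays at 7: both increments hit the executable's copy, and the library's original data is left unused, as the quoted text describes.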

You can observe this using readelf -Ws main:

Symbol table '.dynsym' contains 5 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
...
     2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND inci
     4: 0000000000404024     4 OBJECT  GLOBAL DEFAULT   25 i

Note that inci() is undefined (it's defined in libother.so), but i is defined in main as a global symbol. The copy relocation itself shows up in the output of readelf -Wr main:

Relocation section '.rela.dyn' at offset 0x4d8 contains 3 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
...
0000000000404024  0000000400000005 R_X86_64_COPY          0000000000404024 i + 0

Relocation section '.rela.plt' at offset 0x520 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000404018  0000000200000007 R_X86_64_JUMP_SLOT     0000000000000000 inci + 0
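
The Info column packs the symbol index and the relocation type into one 64-bit value. As a small sketch (r_sym and r_type are hypothetical helpers that perform the same split as the ELF64_R_SYM and ELF64_R_TYPE macros from <elf.h>, and the enum names are made up for this example), the two entries above decode as follows:

```c
#include <stdint.h>

/* ELF64 r_info layout: high 32 bits = symbol table index,
 * low 32 bits = relocation type. */
static uint32_t r_sym(uint64_t info)  { return (uint32_t)(info >> 32); }
static uint32_t r_type(uint64_t info) { return (uint32_t)(info & 0xffffffffu); }

/* x86-64 relocation type numbers for the two types shown above
 * (hypothetical enum names; values as printed by readelf). */
enum { RELOC_X86_64_COPY = 5, RELOC_X86_64_JUMP_SLOT = 7 };
```

For 0x0000000400000005 this gives symbol index 4 and type 5 (R_X86_64_COPY); for 0x0000000200000007 it gives symbol index 2 and type 7 (R_X86_64_JUMP_SLOT). Indexes 4 and 2 are the .dynsym entries for i and inci shown earlier.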
