Reputation: 65
I am learning about linking and found a small question that I could not understand.
Consider the following files:
main.c
#include "other.h"
extern int i;
int main() {
    ++i;
    inci();
    return 0;
}
other.c
int i = 0;
void inci() {
    ++i;
}
Then I compile these two files:
gcc -c main.c
gcc -shared -fpic other.c -o libother.so
gcc -o main main.o ./libother.so
Here is part of the disassembly of main.o:
f: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # 15 <main+0x15>
15: 83 c0 01 add $0x1,%eax
18: 89 05 00 00 00 00 mov %eax,0x0(%rip) # 1e <main+0x1e>
1e: b8 00 00 00 00 mov $0x0,%eax
23: e8 00 00 00 00 call 28 <main+0x28>
Here is part of the disassembly of main:
1148: 8b 05 ca 2e 00 00 mov 0x2eca(%rip),%eax # 4018 <i@@Base>
114e: 83 c0 01 add $0x1,%eax
1151: 89 05 c1 2e 00 00 mov %eax,0x2ec1(%rip) # 4018 <i@@Base>
1157: b8 00 00 00 00 mov $0x0,%eax
115c: e8 cf fe ff ff call 1030 <inci@plt>
They both correspond to C code:
++i;
According to the assembly, it seems that the linker has already decided the run-time address of i, because the code references it directly with a PC-relative address rather than going through the GOT. However, as far as I know, a shared library is only loaded into memory when a program that uses it is loaded. Thus, the executable main should have no knowledge of the address of i at link time. How, then, does the linker determine that i is located at 0x4018?
Also, what does the comment i@@Base mean?
Upvotes: 3
Views: 438
Reputation: 213955
According to the assembly, it seems that the linker has already decided the run-time address of i, because it is using a PC-relative address to reference it directly, rather than using GOT.
Correct.
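As a quick check of the listing in the question: the address shown in the disassembler's comment is the 32-bit displacement added to the address of the next instruction. A minimal sketch of that arithmetic follows (rip_target is a hypothetical helper written for this illustration, not part of any toolchain):

```c
#include <stdint.h>

/* Target of a RIP-relative operand or a relative call on x86-64:
 * the signed 32-bit displacement is added to the address of the
 * NEXT instruction, i.e. insn_addr + insn_len.
 * rip_target is a hypothetical name used only for this sketch. */
uint64_t rip_target(uint64_t insn_addr, unsigned insn_len, int32_t disp32)
{
    return insn_addr + insn_len + (int64_t)disp32;
}
```

With the values from the listing, rip_target(0x1148, 6, 0x2eca) and rip_target(0x1151, 6, 0x2ec1) both give 0x4018, and the call at 0x115c (5 bytes, displacement 0xfffffecf, which is -0x131 as a signed value) lands on 0x1030, the inci@plt stub.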
However, as far as I know, a shared library is only loaded into memory when a program that uses it is loaded.
Correct, except that the i variable in the shared library is never used, so its address doesn't matter.
What happens here is described pretty well in the Solaris documentation:
Suppose the link-editor is used to create a dynamic executable, and a reference to a data item is found to reside in one of the dependent shared objects. Space is allocated in the dynamic executable's .bss, equivalent in size to the data item found in the shared object. This space is also assigned the same symbolic name as defined in the shared object. Along with this data allocation, the link-editor generates a special copy relocation record that instructs the runtime linker to copy the data from the shared object to the allocated space within the dynamic executable.
Because the symbol assigned to this space is global, it is used to satisfy any references from any shared objects. The dynamic executable inherits the data item. Any other objects within the process that make reference to this item are bound to this copy. The original data from which the copy is made effectively becomes unused.
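The effect described in that passage can be imitated in a single C file. This is only a toy model under stated assumptions, not how the runtime linker is actually implemented: lib_i, exe_i, bound_i and the model_* functions are hypothetical names, and the initial value 7 is chosen (instead of the 0 used in the question) purely so that the copy is visible.

```c
#include <string.h>

/* Toy model of a copy relocation; every name here is hypothetical. */
static int lib_i = 7;  /* plays the initialized i inside libother.so   */
static int exe_i;      /* plays the space reserved in main's .bss      */
static int *bound_i;   /* the single address all references bind to    */

/* What the copy relocation asks the runtime linker to do at startup:
 * copy the library's initial value into the executable's space, then
 * bind every reference to i (from any object) to that copy. */
void model_startup(void)
{
    memcpy(&exe_i, &lib_i, sizeof exe_i);
    bound_i = &exe_i;
}

/* "++i" as executed in main (the executable). */
void model_main_increment(void) { ++*bound_i; }

/* "++i" as executed in inci() (the shared library). */
void model_inci(void) { ++*bound_i; }
```

After model_startup(), model_main_increment() and model_inci(), exe_i holds 9 while lib_i still holds 7: both increments hit the executable's copy, and the library's original is left unused.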
You can observe this using readelf -Ws main:
Symbol table '.dynsym' contains 5 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
   ...
     2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND inci
     4: 0000000000404024     4 OBJECT  GLOBAL DEFAULT   25 i
Note that inci() is undefined (it's defined in libother.so), but i is defined in main as a global symbol. The matching copy relocation appears in the output of readelf -Wr main:
Relocation section '.rela.dyn' at offset 0x4d8 contains 3 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
...
0000000000404024  0000000400000005 R_X86_64_COPY          0000000000404024 i + 0

Relocation section '.rela.plt' at offset 0x520 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000404018  0000000200000007 R_X86_64_JUMP_SLOT     0000000000000000 inci + 0
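The Info column ties each relocation back to the .dynsym output: in a 64-bit ELF relocation the upper 32 bits are the symbol index and the lower 32 bits are the relocation type. A small sketch of that split follows; rela_sym and rela_type are hypothetical helpers that mirror the ELF64_R_SYM and ELF64_R_TYPE macros from <elf.h>, redefined here so the snippet does not depend on that header, and the enum constants are local stand-ins for the x86-64 type numbers.

```c
#include <stdint.h>

/* Hypothetical helpers mirroring ELF64_R_SYM / ELF64_R_TYPE. */
uint32_t rela_sym(uint64_t info)  { return (uint32_t)(info >> 32); }
uint32_t rela_type(uint64_t info) { return (uint32_t)(info & 0xffffffffu); }

/* Local stand-ins for the x86-64 relocation type numbers. */
enum {
    X86_64_COPY      = 5, /* R_X86_64_COPY      */
    X86_64_JUMP_SLOT = 7  /* R_X86_64_JUMP_SLOT */
};
```

Applied to the output above, 0x0000000400000005 decodes to symbol 4 (i) with type 5 (the copy relocation), and 0x0000000200000007 decodes to symbol 2 (inci) with type 7 (the PLT jump slot).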
Upvotes: 3