What is the primary reason for using PLT & GOT tables for shared libraries?

Question

I'm reading Ian Lance Taylor's essay on Linkers: http://inai.de/documents/Linkers.pdf

When discussing shared objects around page 9, he mentions that since shared libraries can be loaded into a process at an unpredictable virtual address, a dynamic linker would need to process a large number of relocations once the address is known. This would slow down loading. To avoid having the dynamic linker process all of these relocations, the program linker changes function references into PC-relative calls into the PLT, and global/static variable references are turned into references into the GOT. Then the dynamic linker only needs to relocate the entries in the PLT/GOT at load time, and does not have to process relocations throughout the entire binary.

However, this focus on load-time optimization is confusing me, because there seems to be a much more glaring issue here, and speeding up loading is beside the point. The whole point of shared objects is that a single shared object loaded into physical memory can now be mapped into the virtual address space of each process that needs it. This can be done quickly by changing some page tables, and avoids loading a new copy of the library from disk.

So if the dynamic linker did any relocations in the main body of the shared library, these changes would appear in every other process that also has that shared library mapped, and they would break the library if it appears at a different virtual address.

And it's for this reason that we have a GOT and PLT. The program linker turns all references into position-independent references into the GOT and PLT. Then the dynamic linker relocates the entries in the GOT and PLT separately for each process. The main contents of the shared library are shared across processes, but the GOT and PLT are unique to each process and are not shared.

Is this understanding of the PLT and GOT correct? I've inferred some of the mechanisms here based on my understanding, but I don't see any other way that it could work.

Employed Russian · Accepted Answer

You appear to be missing or not understanding the concept of copy-on-write (CoW) pages.

Two processes could mmap the same file on disk into their distinct virtual addresses, and the OS can use a single physical page of RAM for both mappings (that is, the processes share a single physical memory page). But as soon as one process changes the memory, a copy is made for that process, and the changes do not appear in the other process (the physical memory pages are no longer shared).
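To make the CoW behaviour concrete, here is a minimal sketch in Python. It is only an illustration under some assumptions: the two private mappings live in one process rather than two, the temporary file stands in for a shared library on disk, and the function name `cow_demo` is made up for this example. `mmap.ACCESS_COPY` requests a private, copy-on-write mapping (the equivalent of `MAP_PRIVATE` on Unix).

```python
import mmap
import os
import tempfile


def cow_demo():
    """Write through one private mapping and report what the other one sees."""
    # A small temporary file stands in for the library file on disk.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"original")
        # Two copy-on-write mappings of the same file, standing in for
        # two processes that have mapped the same library.
        a = mmap.mmap(fd, 0, access=mmap.ACCESS_COPY)
        b = mmap.mmap(fd, 0, access=mmap.ACCESS_COPY)
        # "Relocate" through mapping a: only a's private copy is modified.
        a[0:8] = b"patched!"
        seen_by_a = a[:]
        seen_by_b = b[:]
        a.close()
        b.close()
        # The file itself is not modified either.
        os.lseek(fd, 0, os.SEEK_SET)
        on_disk = os.read(fd, 8)
        return seen_by_a, seen_by_b, on_disk
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(cow_demo())
```

The write through `a` is visible only through `a`; the second mapping and the file on disk still hold the original bytes.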

"So if the dynamic linker did any relocations in the main body of the shared library, these changes would appear in every other process that also has that shared library mapped,"

Not if the memory is CoW.

"And it's for this reason that we have a GOT and PLT"

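No, the reason is optimization, not correctness as your (mis)understanding implies. Relocating the library body in place would still work, but every page containing a relocated reference would have to be copied for each process. With a GOT and PLT, far fewer pages have to be copied, and the rest of the library stays shared.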
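A rough way to see the size of the difference is to count how many pages get written (and therefore copied) in each scheme. The following is a toy model, not a measurement: the 4 KiB page size, the 1 MiB code size, one reference every 512 bytes, 100 imported symbols, 8-byte GOT slots, the `GOT_BASE` address, and the names `dirty_pages` and `compare_dirty_pages` are all assumptions made up for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def dirty_pages(addresses, page_size=PAGE_SIZE):
    """Number of distinct pages that contain at least one written address."""
    return len({addr // page_size for addr in addresses})


def compare_dirty_pages():
    # Hypothetical library: 1 MiB of code with one symbol reference
    # every 512 bytes (2048 reference sites in total).
    reference_sites = range(0, 1024 * 1024, 512)
    # In-place relocation: every reference site in the code is patched.
    text_reloc_pages = dirty_pages(reference_sites)

    # GOT scheme: only the table is patched. Assume 100 distinct symbols,
    # 8 bytes per slot, in a contiguous table at a page-aligned address.
    GOT_BASE = 0x200000
    got_slots = [GOT_BASE + 8 * i for i in range(100)]
    got_pages = dirty_pages(got_slots)

    return text_reloc_pages, got_pages


if __name__ == "__main__":
    text_pages, got_pages = compare_dirty_pages()
    print(f"in-place relocation: {text_pages} pages copied per process")
    print(f"GOT-based:           {got_pages} page(s) copied per process")
```

Under these assumptions, in-place relocation dirties all 256 code pages in every process, while the GOT scheme dirties a single page and leaves the code pages shared.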
