Making changes to the dynamic library

Question

Suppose we have a C program that uses a shared library.

If you make changes to a shared library and rebuild it, all programs using that library will automatically receive those changes the next time they run. In the case of a static library, the changes become visible only after the program is recompiled with the new version of the library.

If we change the code of some shared library functions (without changing method signatures), add new functions, etc., the addresses of the functions will change.

How will a program that uses our shared library find these functions again, without recompiling and relinking, if their addresses have changed?

k1r1t0 · Accepted Answer

TL;DR

As explained below, you should not simply recompile the library in place while an already running binary is using it. However, I think you can use the dlopen(3) API to manage the needed shared library manually: once the library is recompiled, you only need to synchronize your already running code with the new shared library (just like the simplest RCU does).

You can also use the inotify(7) API to detect when your library has been modified and reload its symbols when it is.
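
A minimal sketch of that idea, assuming Linux with glibc. The names plugin and plugin_reload are hypothetical helpers invented for this illustration; only dlopen, dlsym, dlclose and dlerror are real API. The caller is assumed to guarantee that no thread is still executing code from the old copy when the reload happens, which is the synchronization mentioned above.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical holder for one loaded library and one resolved symbol. */
struct plugin {
    void *handle;   /* handle returned by dlopen(), or NULL */
    void *sym;      /* address returned by dlsym(), or NULL */
};

/*
 * Drop the currently loaded copy (if any), load `path` again and look up
 * `name` by its symbol name. Returns 0 on success, -1 on failure.
 * Assumption: nothing is running inside the old copy at this point.
 */
int plugin_reload(struct plugin *p, const char *path, const char *name)
{
    if (p->handle) {
        dlclose(p->handle);
        p->handle = NULL;
        p->sym = NULL;
    }

    void *h = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (!h) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    /* The address is found by name, so it may differ after every rebuild. */
    void *s = dlsym(h, name);
    if (!s) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(h);
        return -1;
    }

    p->handle = h;
    p->sym = s;
    return 0;
}
```

To call the function, cast p.sym to the matching function pointer type, for example double (*)(double). On older glibc versions the program may need to be linked with -ldl. Note that dlclose only unloads the library when nothing else still holds a reference to it.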
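
A rough sketch of the watching part, assuming Linux. The helper names watch_dir and changed are hypothetical; the inotify calls themselves are the real API. The sketch watches the directory rather than the file, on the assumption that a rebuild often replaces the file with a new one instead of rewriting it in place.

```c
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Start watching `dir` for files that were written and closed, or moved in.
 * Returns an inotify file descriptor, or -1 on error. */
int watch_dir(const char *dir)
{
    int fd = inotify_init1(IN_CLOEXEC);
    if (fd < 0) {
        perror("inotify_init1");
        return -1;
    }
    if (inotify_add_watch(fd, dir, IN_CLOSE_WRITE | IN_MOVED_TO) < 0) {
        perror("inotify_add_watch");
        close(fd);
        return -1;
    }
    return fd;
}

/* Block until one batch of events arrives.
 * Returns 1 if any event names `file`, 0 if none does, -1 on read error. */
int changed(int fd, const char *file)
{
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    ssize_t n = read(fd, buf, sizeof buf);
    if (n <= 0)
        return -1;

    int hit = 0;
    for (char *p = buf; p < buf + n; ) {
        const struct inotify_event *ev = (const struct inotify_event *)p;
        if (ev->len > 0 && strcmp(ev->name, file) == 0)
            hit = 1;
        p += sizeof *ev + ev->len;   /* events are variable-length */
    }
    return hit;
}
```

A reload loop would call changed(fd, "libfoo.so") (the library name here is a placeholder) and, when it returns 1, synchronize and reopen the library.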

A little bit of theory

Usually (especially on Linux) your program is run by the dynamic loader (on Linux it is usually located at /lib64/ld-linux-x86-64.so.2 for the x86-64 architecture and at /lib/ld-linux.so.* for the 32-bit version).

Once you run your program by calling one of the exec* family of functions (the shell does that for you), and assuming you use ELF (Executable and Linkable Format, which you usually do), the kernel will at some point (load_elf_binary on Linux) read the ELF section called .interp, which contains the path to the dynamic loader (in my case /lib/ld-linux-x86-64.so.2), and load that loader into memory. Afterwards, the kernel prepares the auxv (auxiliary vector); in particular, AT_ENTRY is set to the entry point of your program. The dynamic loader reads the AT_ENTRY value (the same value a program can query with getauxval(3)) to obtain the entry point of your program, so that it can later hand control over to it.
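
To make this concrete, here is a small sketch that reads the interpreter path out of an ELF file the same way a loader would find it, via the PT_INTERP program header that describes .interp. The function name read_interp is hypothetical, and the code assumes glibc's <link.h> and a file of the same ELF class (32 or 64 bit) as the program itself.

```c
#include <elf.h>
#include <link.h>      /* ElfW() selects Elf32_* or Elf64_* for this platform */
#include <stdio.h>
#include <string.h>
#include <sys/auxv.h>

/* Copy the interpreter path of the ELF file at `path` into `out`.
 * Returns 0 on success, -1 if the file cannot be read, is not ELF,
 * or has no PT_INTERP entry (for example a static binary). */
int read_interp(const char *path, char *out, size_t outlen)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    int rc = -1;
    ElfW(Ehdr) eh;
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0)
        goto done;

    for (unsigned i = 0; i < eh.e_phnum; i++) {
        ElfW(Phdr) ph;
        long off = (long)(eh.e_phoff + (size_t)i * eh.e_phentsize);
        if (fseek(f, off, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1)
            goto done;
        if (ph.p_type != PT_INTERP)
            continue;
        if (ph.p_filesz == 0 || ph.p_filesz > outlen)
            goto done;
        if (fseek(f, (long)ph.p_offset, SEEK_SET) != 0 ||
            fread(out, 1, ph.p_filesz, f) != ph.p_filesz)
            goto done;
        out[ph.p_filesz - 1] = '\0';   /* the stored path is NUL-terminated */
        rc = 0;
        break;
    }
done:
    fclose(f);
    return rc;
}

/* Entry point of the running program, as recorded in the auxiliary vector. */
unsigned long program_entry(void)
{
    return getauxval(AT_ENTRY);
}
```

Calling read_interp("/proc/self/exe", buf, sizeof buf) from a dynamically linked program should fill buf with the loader path for that system.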

Before the dynamic loader can hand over control, it must check whether your program depends on any shared libraries. If it does, the loader has to load them as well, otherwise your program will fail since it doesn't know where the functions are. This is usually done with the mmap(2), mprotect(2), etc. system calls, which map the library files so that their pages are shared rather than loaded again.
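
One way to see the result of this step from inside a running program is to ask the loader which objects it has mapped. The sketch below assumes glibc, where dl_iterate_phdr is available with _GNU_SOURCE; the names on_object and count_loaded_objects are hypothetical.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>

/* Called once per loaded object (main program, shared libraries, vDSO). */
static int on_object(struct dl_phdr_info *info, size_t size, void *data)
{
    (void)size;
    int *count = data;
    /* An empty name is what glibc typically reports for the main program. */
    const char *name = info->dlpi_name[0] ? info->dlpi_name : "(unnamed)";

    printf("%-40s base=%p segments=%d\n",
           name, (void *)info->dlpi_addr, (int)info->dlpi_phnum);
    (*count)++;
    return 0;   /* a non-zero return would stop the iteration */
}

/* Print every object the dynamic loader has mapped and return how many. */
int count_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(on_object, &count);
    return count;
}
```

The base addresses printed here are where each library was mapped in this particular process; they can differ between runs and after a rebuild, which is why callers find functions by symbol name rather than by a fixed address.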

Now the most interesting part. Because the dynamic loader maps those libraries with mmap(2), their pages can be brought in in two ways: lazily (on demand) or populated immediately. In the first case the kernel just prepares VMAs for the library, and the physical pages are loaded when a function from that library is called (that is, when the library's code or data is first accessed). In the second case all physical pages must be loaded right away. The advantage of this method is that the program can run faster, because it does not wait for each page while executing and takes no page faults once it is running; the cost is a longer startup time, because the pages are populated when the program starts. Note that the LD_BIND_NOW environment variable controls a related but different kind of laziness: whether the loader resolves function symbols on their first call or all at startup.
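
The difference between the two paging strategies can be observed with a small experiment. This is only a sketch under stated assumptions: it uses an anonymous mapping instead of a real library file, uses the Linux-specific MAP_POPULATE flag to force immediate population, and the function name resident_after_map is hypothetical. How many pages are reported as resident depends on the kernel and environment.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAX_PAGES 64

/* Map `npages` anonymous pages, either lazily or pre-faulted, and return
 * how many of them the kernel reports as resident. Returns -1 on error. */
long resident_after_map(size_t npages, int populate)
{
    if (npages == 0 || npages > MAX_PAGES)
        return -1;

    long psz = sysconf(_SC_PAGESIZE);
    if (psz <= 0)
        return -1;
    size_t len = npages * (size_t)psz;

    int flags = MAP_PRIVATE | MAP_ANONYMOUS | (populate ? MAP_POPULATE : 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    unsigned char vec[MAX_PAGES];
    long resident = -1;
    if (mincore(p, len, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;   /* low bit set: page is in memory */
    }

    munmap(p, len);
    return resident;
}
```

Comparing resident_after_map(16, 0) with resident_after_map(16, 1) shows the trade-off described above: the lazy mapping costs nothing up front but takes a page fault on first touch, while the populated one pays at mapping time.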

This means the code of a shared library may or may not be in memory yet. That might lead us to assume that we can modify the library before it is loaded into memory. No, we cannot: the library's ELF headers have already been read and its segments mapped, so once you recompile the library in place your program is likely to fail. The .text, .data and other sections can move to different offsets in the file, so the pages loaded afterwards are the wrong ones, which leads to very undesirable behavior.
