Some Name

Reputation: 9521

How does dynamic linker know which library to search for a symbol?

I'm experimenting with LD_PRELOAD/dlopen and ran into some confusion about symbol lookup. Consider the following 2 libraries:

  1. libshar

shared.h

int sum(int a, int b);

shared.c

int sum(int a, int b){
    return a + b;
}

  2. libshar2

shared.h

int sum(int a, int b);

shared.c

int sum(int a, int b){
    return a + b + 10000;
}

and executable bin_shared:

#include <dlfcn.h>
#include <stdio.h>
#include "shared.h"

int main(void){
    /* libshar2 is loaded at run time; libshar is a link-time dependency */
    void *handle = dlopen("/home/me/c/build/libshar2.so", RTLD_NOW | RTLD_GLOBAL);
    int s = sum(2, 3);
    printf("s = %d", s);
}

Linking the binary with libshar and libdl, I considered the following 2 cases:

  1. LD_PRELOAD is empty

The program prints 5.

Why does the dynamic linker decide to look up the sum function in libshar rather than libshar2? Both of them are loaded and contain the needed symbol:

0x7ffff73dc000     0x7ffff73dd000     0x1000        0x0 /home/me/c/build/libshar2.so
0x7ffff73dd000     0x7ffff75dc000   0x1ff000     0x1000 /home/me/c/build/libshar2.so
0x7ffff75dc000     0x7ffff75dd000     0x1000        0x0 /home/me/c/build/libshar2.so
0x7ffff75dd000     0x7ffff75de000     0x1000     0x1000 /home/me/c/build/libshar2.so
#...
0x7ffff7bd3000     0x7ffff7bd4000     0x1000        0x0 /home/me/c/build/libshar.so
0x7ffff7bd4000     0x7ffff7dd3000   0x1ff000     0x1000 /home/me/c/build/libshar.so
0x7ffff7dd3000     0x7ffff7dd4000     0x1000        0x0 /home/me/c/build/libshar.so
0x7ffff7dd4000     0x7ffff7dd5000     0x1000     0x1000 /home/me/c/build/libshar.so

  2. LD_PRELOAD = /path/to/libshar2.so

The program prints 10005. This is expected, but again I noticed that both libshar.so and libshar2.so are loaded:

0x7ffff79d1000     0x7ffff79d2000     0x1000        0x0 /home/me/c/build/libshar.so
0x7ffff79d2000     0x7ffff7bd1000   0x1ff000     0x1000 /home/me/c/build/libshar.so
0x7ffff7bd1000     0x7ffff7bd2000     0x1000        0x0 /home/me/c/build/libshar.so
0x7ffff7bd2000     0x7ffff7bd3000     0x1000     0x1000 /home/me/c/build/libshar.so
0x7ffff7bd3000     0x7ffff7bd4000     0x1000        0x0 /home/me/c/build/libshar2.so
0x7ffff7bd4000     0x7ffff7dd3000   0x1ff000     0x1000 /home/me/c/build/libshar2.so
0x7ffff7dd3000     0x7ffff7dd4000     0x1000        0x0 /home/me/c/build/libshar2.so
0x7ffff7dd4000     0x7ffff7dd5000     0x1000     0x1000 /home/me/c/build/libshar2.so

The LD_PRELOAD case seems to be explained in ld.so(8):

LD_PRELOAD

A list of additional, user-specified, ELF shared objects to be loaded before all others. The items of the list can be separated by spaces or colons. This can be used to selectively override functions in other shared objects. The objects are searched for using the rules given under DESCRIPTION.

Upvotes: 7

Views: 1766

Answers (2)

R.. GitHub STOP HELPING ICE

Reputation: 215173

dlopen can't (nor can anything else) change the definition of (global) symbols already present at the time of the call. It can only make available new ones that did not exist before.

The (sloppy) formalization of this is in the specification for dlopen:

Symbols introduced into the process image through calls to dlopen() may be used in relocation activities. Symbols so introduced may duplicate symbols already defined by the program or previous dlopen() operations. To resolve the ambiguities such a situation might present, the resolution of a symbol reference to symbol definition is based on a symbol resolution order. Two such resolution orders are defined: load order and dependency order. Load order establishes an ordering among symbol definitions, such that the first definition loaded (including definitions from the process image file and any dependent executable object files loaded with it) has priority over executable object files added later (by dlopen()). Load ordering is used in relocation processing. Dependency ordering uses a breadth-first order starting with a given executable object file, then all of its dependencies, then any dependents of those, iterating until all dependencies are satisfied. With the exception of the global symbol table handle obtained via a dlopen() operation with a null pointer as the file argument, dependency ordering is used by the dlsym() function. Load ordering is used in dlsym() operations upon the global symbol table handle.

Note that LD_PRELOAD is nonstandard functionality and thus not described here, but on implementations that offer it, LD_PRELOAD acts with load order after the main program but before any shared libraries loaded as dependencies.
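To make the dependency-order case from the quoted specification concrete, here is a minimal sketch of how the asker's program could reach libshar2's definition explicitly: calling dlsym() on the handle returned by dlopen() starts the search at that library instead of using the global load order. The helper name find_sum_in and the typedef sum_fn are made up for this illustration; only dlopen, dlsym, dlerror and dlclose are real APIs.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical typedef matching the question's `int sum(int, int)`. */
typedef int (*sum_fn)(int, int);

/* Hypothetical helper: look up `sum` in one specific library, bypassing the
   global load order. Returns NULL if the library or symbol cannot be found. */
sum_fn find_sum_in(const char *path) {
    /* dlopen(NULL, ...) would mean "the main program", which is not what
       this helper is for, so a NULL path is rejected. */
    assert(path != NULL);

    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return NULL;
    }

    /* dlsym on a specific handle uses dependency order: the search starts
       at the library behind `handle`, then walks its dependencies. */
    void *sym = dlsym(handle, "sum");
    if (sym == NULL) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return NULL;
    }

    /* The handle is deliberately left open so the pointer stays valid. */
    return (sum_fn)sym;
}
```

In the question's setup, find_sum_in("/home/me/c/build/libshar2.so") should return the definition that adds 10000, while a plain call to sum() in the main program still binds to libshar. On older glibc versions the program has to be linked with -ldl.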

Upvotes: 1

Employed Russian

Reputation: 213375

Why does the dynamic linker decide to lookup the sum function in the libshar, not libshar2?

Dynamic linkers on UNIX attempt to emulate what would have happened if you linked with archive libraries.

In the case of empty LD_PRELOAD, the symbol search order (when the symbol is referenced by the main binary; the rules get more complicated when it is referenced by a DSO) is: the main binary, then directly linked DSOs in the order they are listed on the link line, then dlopened DSOs in the order they were dlopened. Since libshar is directly linked, its sum is found before the one in the dlopened libshar2.

LD_PRELOAD = /path/to/libshar2.so. The program prints 10005. This is expected,

Non-empty LD_PRELOAD modifies the search order by inserting the libraries it lists after the main executable and before any directly linked DSOs.
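The two search orders can be modelled with a toy lookup that walks a list of modules and returns the first one defining the symbol. This is only an illustration of the "first definition in search order wins" rule, not how ld.so is implemented: struct module, resolve and the two scope arrays are invented for this sketch, and real linkers also deal with hashing, symbol versioning, visibility and RTLD_LOCAL, none of which is modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model: a module is a name plus the single symbol it defines (or NULL). */
struct module {
    const char *name;
    const char *defines;
};

/* Return the name of the first module in `scope` that defines `symbol`,
   or NULL if none does. The caller lists the modules in search order. */
const char *resolve(const struct module *scope, size_t n, const char *symbol) {
    assert(symbol != NULL);
    for (size_t i = 0; i < n; i++) {
        if (scope[i].defines != NULL && strcmp(scope[i].defines, symbol) == 0)
            return scope[i].name;
    }
    return NULL;
}

/* Case 1, LD_PRELOAD empty: main binary, directly linked DSO, dlopened DSO. */
const struct module scope_no_preload[] = {
    {"bin_shared",  NULL},
    {"libshar.so",  "sum"},   /* listed on the link line */
    {"libshar2.so", "sum"},   /* added later by dlopen() */
};

/* Case 2, LD_PRELOAD=libshar2.so: the preloaded DSO sits right after the
   main binary; the later dlopen() of the same library does not move it. */
const struct module scope_preload[] = {
    {"bin_shared",  NULL},
    {"libshar2.so", "sum"},   /* preloaded */
    {"libshar.so",  "sum"},   /* listed on the link line */
};
```

With these tables, resolve(scope_no_preload, 3, "sum") yields "libshar.so" and resolve(scope_preload, 3, "sum") yields "libshar2.so", which matches the 5 and 10005 the question observed.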

but again I noticed that both libshar.so and libshar2.so are loaded:

Why is that a surprise? The dynamic linker loads all libraries listed in LD_PRELOAD, and then all libraries that you directly linked against (as explained before).

Upvotes: 5
