Reputation: 19431
I just debugged a strange issue involving two libraries; let's call them libA.so and libB.so.
The application dlopens libA.so (EDIT: it doesn't; it is linked with the -l option), a thin library that then loads libB.so, which is the actual implementation.
dlopen is called with the RTLD_NOW option; no other options are passed.
Both libraries use the same logger module, whose state is stored in a global variable. Since the logger is linked statically into both libraries, the global variable has the same name in both.
When libB is loaded, the two global variables sit at the same address and conflict: the dynamic linker resolved libB's variable to the address of the one that was already loaded.
In case it matters, the variable is defined deep within a .cpp file; I'm not sure whether linking differs between C and C++.
The dlopen documentation says:
RTLD_GLOBAL
The symbols defined by this library will be made available for symbol resolution of subsequently loaded libraries.
RTLD_LOCAL
This is the converse of RTLD_GLOBAL, and the default if neither flag is specified. Symbols defined in this library are not made available to resolve references in subsequently loaded libraries.
So RTLD_LOCAL is supposed to be the default; that is, libA's symbols shouldn't be used when resolving libB's symbols. But that is exactly what happens. Why?
As a workaround I added the visibility("hidden") attribute to this global so that it is not exported, and raised a ticket to make all symbols hidden by default so that collisions like this can't happen in the future. But I'm still wondering why this happens when it shouldn't.
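For reference, the workaround looks roughly like the sketch below. LoggerState, g_logger_state and the two accessors are hypothetical names standing in for the real logger's global; the attribute syntax is the GCC/Clang one.

```cpp
// Hypothetical stand-in for the logger's global state (names are illustrative,
// not taken from the real logger module).
struct LoggerState {
    int level = 0;
};

// GCC/Clang: hidden visibility keeps the symbol out of the shared object's
// dynamic symbol table, so a same-named global in another library cannot be
// bound to this definition (or the other way round).
__attribute__((visibility("hidden"))) LoggerState g_logger_state;

// Code inside the library keeps using the variable as before.
int logger_level() { return g_logger_state.level; }
void set_logger_level(int level) { g_logger_state.level = level; }
```

Compiling with -fvisibility=hidden applies the same default to every symbol; anything that must stay exported then needs an explicit visibility("default").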
EDIT2:
Source example:
commonvar.h:
#pragma once
#include <iostream>
struct A
{
    A()
    {
        std::cout << "A inited. Address: " << this << "\n";
    }
    virtual ~A() {}
};
extern A object;
struct POD
{
    int x, y, z;
};
extern POD pod;
commonvar.cpp:
#include <string>
#include "commonvar.h"
A object;
POD pod = {1, 2, 3};
a.h:
#pragma once
extern "C" void foo();
a.cpp:
#include <iostream>
#include "commonvar.h"
using FnFoo = void (*)();
extern "C" void foo()
{
    std::cout << "A called.\n";
    std::cout << "A: Address of foo is: " << &object << "\n";
    std::cout << "A: Address of pod is: " << &pod << "\n";
    std::cout << "A: {" << pod.x << ", " << pod.y << ", " << pod.z << "}\n";
    pod.x = 42;
}
b.cpp:
#include <iostream>
#include <string>
#include "commonvar.h"
extern "C" void foo()
{
    std::cout << "B called.\n";
    std::cout << "B: Address of foo is: " << &object << "\n";
    std::cout << "B: Address of pod is: " << &pod << "\n";
    std::cout << "B: {" << pod.x << ", " << pod.y << ", " << pod.z << "}\n";
}
main.cpp:
#include <dlfcn.h>
#include <iostream>
#include <cassert>
#include "a.h"
using FnFoo = void (*)();
int main()
{
    std::cout << "Start of program.\n";
    foo();
    std::cout << "Loading B\n";
    void *b = dlopen("libb.so", RTLD_NOW);
    assert(b);
    FnFoo fnB;
    fnB = FnFoo(dlsym(b, "foo"));
    assert(fnB);
    fnB();
}
Build script:
#!/bin/bash
g++ -fPIC -c commonvar.cpp
ar rcs common.a commonvar.o
g++ -fPIC -shared a.cpp common.a -o liba.so
g++ -fPIC -shared b.cpp common.a -o libb.so
g++ main.cpp liba.so -ldl -o main
Dynamic symbols of main:
U __assert_fail
0000000000202010 B __bss_start
U __cxa_atexit
w __cxa_finalize
U dlopen
U dlsym
0000000000202010 D _edata
0000000000202138 B _end
0000000000000bc4 T _fini
U foo
w __gmon_start__
0000000000000860 T _init
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
U __libc_start_main
U _ZNSt8ios_base4InitC1Ev
U _ZNSt8ios_base4InitD1Ev
0000000000202020 B _ZSt4cout
U _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
Dynamic symbols of liba.so:
0000000000202064 B __bss_start
U __cxa_atexit
w __cxa_finalize
0000000000202064 D _edata
0000000000202080 B _end
0000000000000e6c T _fini
0000000000000bba T foo
w __gmon_start__
0000000000000a30 T _init
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
0000000000202070 B object
0000000000202058 D pod
U _ZdlPvm
0000000000000dca W _ZN1AC1Ev
0000000000000dca W _ZN1AC2Ev
0000000000000e40 W _ZN1AD0Ev
0000000000000e22 W _ZN1AD1Ev
0000000000000e22 W _ZN1AD2Ev
U _ZNSolsEi
U _ZNSolsEPKv
U _ZNSt8ios_base4InitC1Ev
U _ZNSt8ios_base4InitD1Ev
U _ZSt4cout
U _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
0000000000201dd0 V _ZTI1A
0000000000000ed5 V _ZTS1A
0000000000201db0 V _ZTV1A
U _ZTVN10__cxxabiv117__class_type_infoE
Dynamic symbols of libb.so:
$ nm -D libb.so
0000000000202064 B __bss_start
U __cxa_atexit
w __cxa_finalize
0000000000202064 D _edata
0000000000202080 B _end
0000000000000e60 T _fini
0000000000000bba T foo
w __gmon_start__
0000000000000a30 T _init
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
0000000000202070 B object
0000000000202058 D pod
U _ZdlPvm
0000000000000dbe W _ZN1AC1Ev
0000000000000dbe W _ZN1AC2Ev
0000000000000e34 W _ZN1AD0Ev
0000000000000e16 W _ZN1AD1Ev
0000000000000e16 W _ZN1AD2Ev
U _ZNSolsEi
U _ZNSolsEPKv
U _ZNSt8ios_base4InitC1Ev
U _ZNSt8ios_base4InitD1Ev
U _ZSt4cout
U _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
0000000000201dd0 V _ZTI1A
0000000000000ec9 V _ZTS1A
0000000000201db0 V _ZTV1A
U _ZTVN10__cxxabiv117__class_type_infoE
Output:
A inited. Address: 0x7efd6cf97070
Start of program.
A called.
A: Address of foo is: 0x7efd6cf97070
A: Address of pod is: 0x7efd6cf97058
A: {1, 2, 3}
Loading B
A inited. Address: 0x7efd6cf97070
B called.
B: Address of foo is: 0x7efd6cf97070
B: Address of pod is: 0x7efd6cf97058
B: {42, 2, 3}
As can be seen, the addresses of the variables collide (the lines labelled "Address of foo" actually print &object), while the functions do not: each library runs its own foo.
The C++ initialization is also peculiar. The aggregate pod is initialized only once: the call to foo() in A modifies it, and loading B does not reinitialize it. The constructor of object, however, runs again when libb.so is loaded.
Upvotes: 0
Views: 1604
Reputation: 701
A possible solution for this issue is the RTLD_DEEPBIND flag of dlopen (Linux-specific, not part of POSIX), which makes the loaded library try to resolve symbols against itself (and its own dependencies) before the ones in the global scope.
For this to work properly, the executable has to be built with -fPIE; otherwise some ODR assumptions made by libstdc++ are violated, which will likely cause a segfault. (Alternatively, if iostream is replaced with cstdio, it works without -fPIE.)
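A minimal sketch of such a call, assuming glibc on Linux; open_deepbind is a made-up helper name, and in the question's main.cpp it would take the place of the plain dlopen("libb.so", RTLD_NOW):

```cpp
#include <dlfcn.h>
#include <cstdio>

// Hypothetical helper: open a library so that its own definitions (and those
// of its dependencies) are searched before the global scope.
// RTLD_DEEPBIND is a glibc extension and is not available everywhere.
void* open_deepbind(const char* path) {
    void* handle = dlopen(path, RTLD_NOW | RTLD_DEEPBIND);
    if (handle == nullptr) {
        // dlerror() describes the most recent dl* failure.
        std::fprintf(stderr, "dlopen(%s) failed: %s\n", path, dlerror());
    }
    return handle;
}
```

The returned handle is used with dlsym exactly as before; only the lookup order for the library's own references changes.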
Upvotes: 0
Reputation: 213375
The key to answering this question is whether the main executable exports the same symbol in its dynamic symbol table. That is, what is the output from:
nm -D a.out | grep ' mangled_name_of_the_symbol'
If the output is empty, the two libraries should indeed use separate (their own) copies of the symbol. But if the output is not empty, then both libraries should reuse the symbol defined in the main binary (this happens because UNIX dynamic linking attempts to emulate what would have happened if everything was statically linked into the main binary -- UNIX support for shared libraries happened long after UNIX itself became popular, and in that context this design decision made sense).
Demonstration:
// main.c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>
int foo = 12;
int main()
{
    printf("main: &foo = %p, foo = %d\n", &foo, foo);
    void *h = dlopen("./foo.so", RTLD_NOW);
    assert (h != NULL);
    void (*fn)(void) = (void (*)()) dlsym(h, "fn");
    fn();
    return 0;
}
// foo.c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>
int foo = 42;
void fn()
{
    printf("foo: &foo = %p, foo = %d\n", &foo, foo);
    void *h = dlopen("./bar.so", RTLD_NOW);
    assert (h != NULL);
    void (*fn)(void) = (void (*)()) dlsym(h, "fn");
    fn();
}
// bar.c
#include <stdio.h>
int foo = 24;
void fn()
{
    printf("bar: &foo = %p, foo = %d\n", &foo, foo);
}
Build this with:
gcc -fPIC -shared -o foo.so foo.c && gcc -fPIC -shared -o bar.so bar.c &&
gcc main.c -ldl && ./a.out
Output:
main: &foo = 0x5618f1d61048, foo = 12
foo: &foo = 0x7faad6955040, foo = 42
bar: &foo = 0x7faad6950028, foo = 24
Now rebuild just the main binary with -rdynamic (which causes foo to be exported from it): gcc main.c -ldl -rdynamic. The output changes to:
main: &foo = 0x55ced88f1048, foo = 12
foo: &foo = 0x55ced88f1048, foo = 12
bar: &foo = 0x55ced88f1048, foo = 12
P.S. You can gain much insight into the behavior of the dynamic linker by running with:
LD_DEBUG=symbols,bindings ./a.out
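Besides nm -D and LD_DEBUG, a related check can be made from inside the process. The sketch below assumes glibc: dlsym(RTLD_DEFAULT, name) searches the global scope in the default order (the executable, its dependencies, and anything loaded with RTLD_GLOBAL), so a hit means a later-loaded library could bind to that definition. The helper name in_global_scope is invented for illustration.

```cpp
#include <dlfcn.h>

// Hypothetical helper: true if `name` can be resolved through the global
// lookup scope of the running process.
bool in_global_scope(const char* name) {
    dlerror();                  // clear any stale error state
    dlsym(RTLD_DEFAULT, name);  // search in the default (global) order
    return dlerror() == nullptr;  // no error means the symbol was found
}
```

In the demonstration above, in_global_scope("foo") would be expected to differ between the plain build and the -rdynamic build, though that has not been run here.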
Update:
It turns out I asked a wrong question ... Added source example.
If you look at the LD_DEBUG output, you'll see:
165089: symbol=object; lookup in file=./main [0]
165089: symbol=object; lookup in file=./liba.so [0]
165089: binding file ./liba.so [0] to ./liba.so [0]: normal symbol `object'
165089: symbol=object; lookup in file=./main [0]
165089: symbol=object; lookup in file=./liba.so [0]
165089: binding file ./libb.so [0] to ./liba.so [0]: normal symbol `object'
What this means: liba.so is in the global search list (by virtue of having been directly linked to by main). This is approximately equivalent to having done dlopen("./liba.so", RTLD_GLOBAL).
It should not be a surprise then that the symbols in it are available for subsequently loaded shared libraries to bind to, which is exactly what the dynamic loader does.
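To confirm at run time which object a variable was actually bound to, dladdr can map an address back to the file that contains it. This is only a sketch, assuming glibc (dladdr started as a GNU extension); owning_object is a made-up helper name.

```cpp
#include <dlfcn.h>

// Hypothetical helper: returns the path of the loaded object whose mapping
// contains `addr`, or nullptr if no loaded object matches.
const char* owning_object(const void* addr) {
    Dl_info info;
    if (dladdr(addr, &info) == 0) {
        return nullptr;  // address is not inside any loaded object
    }
    return info.dli_fname;
}
```

Printing owning_object(&pod) from foo() in b.cpp would presumably name liba.so rather than libb.so, consistent with the binding lines in the LD_DEBUG output.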
Upvotes: 1