Reputation: 126
Recently I tried to build a complex C++ application. The build succeeded, but when I launched the app it crashed with an illegal memory access. It turned out to be a null pointer dereferenced with an offset.
I started investigating the cause. The crash happens while dynamically loading a library (libXcursor.so) with dlopen. The cause is a null link_map->l_versions pointer, which is not checked during relocation inside the elf_machine_rela function (particularly in the RESOLVE_MAP macro). It happens in the expression map->l_versions[ndx]:
ElfW(Half) ndx = version[ELFW(R_SYM) (r->r_info)] & 0x7fff;
elf_machine_rel (map, r, &symtab[ELFW(R_SYM) (r->r_info)],
                 &map->l_versions[ndx],
                 (void *) (l_addr + r->r_offset), skip_ifunc);
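The "null pointer with offset" symptom follows from that expression: if l_versions is NULL, then &map->l_versions[ndx] is simply ndx times the element size, which is a small non-zero address. Below is a minimal sketch of that arithmetic. FoundVersion is a made-up stand-in for glibc's internal element type (the real layout may differ), and element_address and version_index are hypothetical helper names.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the element type of l_versions.
// The real structure is internal to glibc and may be laid out differently.
struct FoundVersion {
    const char*   name;
    std::uint32_t hash;
    int           hidden;
    const char*   filename;
};

// Same masking as in the snippet above: the top bit of the raw
// version index is dropped before it is used as an array index.
std::uint16_t version_index(std::uint16_t raw) {
    return raw & 0x7fff;
}

// Address that &versions[ndx] would evaluate to, computed with plain
// integer arithmetic so that nothing is actually dereferenced.
std::uintptr_t element_address(std::uintptr_t versions_base, std::uint16_t ndx) {
    return versions_base + static_cast<std::uintptr_t>(ndx) * sizeof(FoundVersion);
}
```

With versions_base equal to 0, element_address returns ndx * sizeof(FoundVersion), so a fault address that is a small multiple of the element size is consistent with a null l_versions.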
The aforementioned link_map describes a dependency of the library being loaded. As far as I can see, this dependency was not completely initialized. What is stranger is that it is not actually a dependency at all: it is the libgthread DSO, and libXcursor.so does not depend on it.
The link_map of libgthread is passed for relocation from dl_open_worker:
_dl_relocate_object (l, l->l_scope, reloc_mode, 0);
It gets collected into the list of dependencies this way:
l = new;
do
  {
    if (! l->l_real->l_relocated)
      maps[nmaps++] = l;
    l = l->l_next;
  }
while (l != NULL);
Here new is the link_map found for libXcursor.so itself.
This piece of code looks very strange to me, because at the same time the other dependencies are initialized with this code:
/* Load that object's dependencies. */
_dl_map_object_deps (new, NULL, 0, 0,
                     mode & (__RTLD_DLOPEN | RTLD_DEEPBIND | __RTLD_AUDIT));
/* So far, so good. Now check the versions. */
for (unsigned int i = 0; i < new->l_searchlist.r_nlist; ++i)
  if (new->l_searchlist.r_list[i]->l_real->l_versions == NULL)
    (void) _dl_check_map_versions (new->l_searchlist.r_list[i]->l_real,
                                   0, 0);
So the question is: why is a different list of dependencies passed for relocation, and why are no additional checks made in such an important place (the system loader)? And, ideally, where should the crash be fixed?
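To make the mismatch concrete, here is a toy model of the two loops quoted above. It is only a sketch: Map is a heavily simplified, hypothetical stand-in for struct link_map (l_real is ignored), versions_checked models "l_versions is not NULL", and all function names are invented for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for glibc's struct link_map.
struct Map {
    std::string name;
    bool relocated = false;          // models l_relocated
    bool versions_checked = false;   // models l_versions != NULL
    Map* next = nullptr;             // models l_next
    std::vector<Map*> searchlist;    // models l_searchlist.r_list
};

// Models the version-check loop: only objects in the search list
// of the newly opened object are visited.
void check_versions(Map& root) {
    for (Map* m : root.searchlist)
        m->versions_checked = true;
}

// Models the relocation loop: every not yet relocated object that is
// reachable through next is collected, search list member or not.
std::vector<Map*> collect_for_relocation(Map& root) {
    std::vector<Map*> maps;
    for (Map* l = &root; l != nullptr; l = l->next)
        if (!l->relocated)
            maps.push_back(l);
    return maps;
}

// Names of objects that would be relocated without a version check.
std::vector<std::string> unchecked_candidates(Map& root) {
    check_versions(root);
    std::vector<std::string> names;
    for (Map* m : collect_for_relocation(root))
        if (!m->versions_checked)
            names.push_back(m->name);
    return names;
}
```

In this model, any object that is still linked after the new object through next but is absent from its search list ends up in the result of unchecked_candidates, which matches the situation described for libgthread.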
The libdl in use is /lib64/libdl-2.27.so from glibc-2.27-38.fc28.x86_64.
It seems the problem was fixed here, here, there and there. It no longer reproduces. A small app can't reproduce the original crash with version 2.27 either.
Upvotes: 3
Views: 588
Reputation: 126
Trying a newer glibc gave the following result.
It seems the problem was fixed here, here, there and there. It is not reproduced.
A little further analysis showed that the culprit of the crash was the library libqt5_shim.so. The app in question is the latest version of Chromium, and libqt5_shim.so is its freshly built dependency, which in turn depends on Qt5Core, Qt5Gui and Qt5Widgets. For some reason it was built expecting Qt version 5.15, but the installed system Qt is only version 5.11.
So the loading process looked like this:
file=libQt5Core.so.5 [0]; needed by /mnt/t/develop/chromium/chromium/src/out/default/libqt5_shim.s
file=libQt5Core.so.5 [0]; generating link map
file=libpcre2-16.so.0 [0]; needed by /lib64/libQt5Core.so.5 [0]
...
file=libgthread-2.0.so.0 [0]; needed by /lib64/libQt5Core.so.5 [0]
/lib64/libQt5Core.so.5: error: version lookup error: version `Qt_5.15' not found (required by /mnt/t/develop/chromium/chromium/src/out/default/libqt5_shim.so) (fatal)
file=/mnt/t/develop/chromium/chromium/src/out/default/libqt5_shim.so [0]; destroying link map
file=/lib64/libQt5Gui.so.5 [0]; destroying link map
file=/lib64/libQt5Core.so.5 [0]; destroying link map
.... ; destroying link map
file=/lib64/libgthread-2.0.so.0 [0]; destroying link map
.... further destroying
file=/mnt/t/develop/chromium/chromium/src/out/default/libqt6_shim.so [0]; dynamically loaded by /mnt/t/develop/chromium/chromium/src/out/default/libui_qt.so [0]
file=/mnt/t/develop/chromium/chromium/src/out/default/libqt6_shim.so [0]; generating link map
After that, I suppose, some dangling references were left in the loader's link map, so further work was incorrect: libdl was finding the wrong dependencies. As for a reproducible example, it also crashes if Qt5Core is of an inappropriate version, but for a different reason: it hits a segmentation fault while initializing libpthread. Here is the code:
#include <dlfcn.h>
#include <iostream>

int main() {
    // libqt5_shim.so pulls in Qt5Core, Qt5Gui and Qt5Widgets.
    void* res = dlopen("libqt5_shim.so", RTLD_LAZY);
    std::cout << "Open libQt5Core res=" << res << std::endl;
    // res = dlopen("libgdk-3.so.0", RTLD_LAZY);
    // std::cout << "Open libgdk res=" << res << std::endl;
    res = dlopen("libXcursor.so.1", RTLD_LAZY);
    std::cout << "Open libXcursor res=" << res << std::endl;
}
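The reproducer only prints the returned handle. A small helper that also reports dlerror() may make a failed load easier to diagnose. This is a sketch, not part of the original test: try_open is a hypothetical name, and I am assuming that the text returned by dlerror() would resemble the "version lookup error" line in the log above. On older glibc versions such as 2.27 it has to be linked with -ldl.

```cpp
#include <dlfcn.h>
#include <string>

// Tries to dlopen() a library. Returns an empty string on success,
// otherwise the loader's error text. On success the handle is kept
// open on purpose, as in the reproducer above.
std::string try_open(const char* name) {
    dlerror();                                // clear any stale error state
    void* handle = dlopen(name, RTLD_LAZY);
    if (handle != nullptr)
        return std::string();
    const char* err = dlerror();              // may be NULL if no error is recorded
    return err != nullptr ? std::string(err) : std::string("unknown dlopen failure");
}
```

Calling try_open("libqt5_shim.so") before try_open("libXcursor.so.1") and printing both results mirrors the sequence of the reproducer while showing why the first load failed.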
Deleting the system Qt5Core partially solves the problem for a while. Neither crash reproduces with glibc-2.32 from a chrooted Fedora 33 installation.
Upvotes: 2