Michael WS

Reputation: 2617

core dump from pyinstaller binary

Is there any way to use gdb to analyse a core dump created by a pyinstaller binary? I am packaging Python/C++ files into one binary, and gdb cannot read the symbols from either python or the binary.

I have tried the following commands and get only question marks from gdb:

gdb $(which python) -c core 
gdb my_binary -c core  

Upvotes: 2

Views: 1944

Answers (3)

Thomas Waldmann

Reputation: 11

This has since been fixed: https://github.com/pyinstaller/pyinstaller/pull/2178/files

The bootloaders were also recompiled and committed to the repo quite recently, so it should work NOW.

Upvotes: 1

Eirik Fuller

Reputation: 1514

Using the python binary with a core file from a binary generated by pyinstaller is incorrect. Output from the file command will confirm this:

core: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from './build_id/build/build_id/build_id', 

(I omitted the end of that output line for brevity).

The core file used with that file command came from an attempt to use pyinstaller on build_id.py, hence the occurrence of the name build_id within the pathname.
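As a rough illustration of that check, the quoted name can be pulled out of the file output with a few lines of Python. This is only a sketch: the helper names executable_from_file_output and core_executable are made up for this example, and it assumes the output contains a from '...' field like the line above. As far as I know, that field holds the recorded command line, so it may also include arguments.

```python
import re
import subprocess


def executable_from_file_output(text):
    """Return the string quoted after "from '" in `file` output, or None."""
    match = re.search(r"from '([^']*)'", text)
    return match.group(1) if match else None


def core_executable(core_path):
    """Run `file` on a core dump and return the name it reports (hypothetical helper)."""
    result = subprocess.run(
        ["file", core_path], capture_output=True, text=True, check=True
    )
    return executable_from_file_output(result.stdout)
```

Calling core_executable("core") in the directory holding the core file should return the name of the program that dumped it, which is the binary to hand to gdb.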

Assuming my_binary represents the result of your attempt to use pyinstaller, it is the correct binary to use with a core file from that attempt. I have compared build IDs from my core file against all of the mapped files, and they all match. Here is output from a verbose invocation of my build_id.py on my core file:

0000000000400000 63116679c3030438046175bc610d21cbd50fbac0 /home/eirik/git/pyinstaller/build_id/build/build_id/build_id
00007f042f80d000 377b0152081112c82460680fe99ec01aa090cd81 /lib/x86_64-linux-gnu/libc-2.24.so
00007f042fbab000 adcc4a5e27d5de8f0bc3c6021b50ba2c35ec9a8e /lib/x86_64-linux-gnu/libz.so.1.2.8
00007f042fdc6000 4e43c23036c6bfd2d4dab183e458e29d87234adc /lib/x86_64-linux-gnu/libdl-2.24.so
00007f042ffca000 a731640ef1cd73c1d727c2a9521b10cafec33c15 /lib/x86_64-linux-gnu/ld-2.24.so
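For readers who want to check build IDs themselves, here is a minimal sketch of reading the GNU build ID from a file on disk. It is not the build_id.py used above (which also handles core files); the function names are invented for this example, and it assumes a 64-bit little-endian ELF file whose build ID sits in a PT_NOTE segment.

```python
import struct

PT_NOTE = 4            # program header type for note segments
NT_GNU_BUILD_ID = 3    # note type that carries the GNU build ID


def build_id_from_notes(data):
    """Return the hex build ID found in raw note data, or None."""
    offset = 0
    while offset + 12 <= len(data):
        namesz, descsz, note_type = struct.unpack_from("<III", data, offset)
        offset += 12
        name = data[offset:offset + namesz]
        offset += (namesz + 3) & ~3      # name is padded to 4 bytes
        desc = data[offset:offset + descsz]
        offset += (descsz + 3) & ~3      # descriptor is padded to 4 bytes
        if note_type == NT_GNU_BUILD_ID and name == b"GNU\0":
            return desc.hex()
    return None


def read_build_id(path):
    """Return the build ID of a 64-bit little-endian ELF file, or None."""
    with open(path, "rb") as f:
        image = f.read()
    if len(image) < 64 or image[:4] != b"\x7fELF" or image[4] != 2 or image[5] != 1:
        raise ValueError("not a 64-bit little-endian ELF file")
    (phoff,) = struct.unpack_from("<Q", image, 32)          # e_phoff
    phentsize, phnum = struct.unpack_from("<HH", image, 54)  # e_phentsize, e_phnum
    for i in range(phnum):
        # First six fields of an ELF64 program header.
        p_type, _flags, p_offset, _vaddr, _paddr, p_filesz = struct.unpack_from(
            "<IIQQQQ", image, phoff + i * phentsize
        )
        if p_type == PT_NOTE:
            build_id = build_id_from_notes(image[p_offset:p_offset + p_filesz])
            if build_id is not None:
                return build_id
    return None
```

Running read_build_id on one of the pathnames listed above should give the same hex string that the file command prints as BuildID[sha1] for that file.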

Output from the file command on any one of those pathnames reports the same build ID seen in the output line for that file. Furthermore, the build ID reported for my build_id binary matches the build ID for the following file, provided by pyinstaller:

PyInstaller/bootloader/Linux-64bit/run

It appears that pyinstaller uses objcopy to build a binary using that run file as a prefix. In addition to reporting the build ID of that run file, the file command also tells me it's stripped, which is consistent with the backtrace I get from gdb:

Core was generated by `./build_id/build/build_id/build_id'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  strlen () at ../sysdeps/x86_64/strlen.S:106
106 ../sysdeps/x86_64/strlen.S: No such file or directory.
(gdb) backtrace
#0  strlen () at ../sysdeps/x86_64/strlen.S:106
#1  0x00007f042f8424d9 in __add_to_environ (name=0x405da8 "LD_LIBRARY_PATH_ORIG", value=0x0, combined=0x0, replace=1) at setenv.c:131
#2  0x0000000000404688 in ?? ()
#3  0x0000000000402db3 in ?? ()
#4  0x00007f042f82d2b1 in __libc_start_main (main=0x401950, argc=1, argv=0x7fffc57143c8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
    stack_end=0x7fffc57143b8) at ../csu/libc-start.c:291
#5  0x000000000040197e in ?? ()
(gdb) x/i $pc
=> 0x7f042f88d496 <strlen+38>:      movdqu (%rax),%xmm4
(gdb) i r rax
rax            0x0  0
(gdb) 

Something in the bootloader (that run file) is calling __add_to_environ (perhaps via setenv, which has a jmpq to __add_to_environ), which then passes a null pointer to strlen (presumably that null pointer originated within run). All of the occurrences of ?? in that backtrace are from the run file (their addresses all begin with 0x00000000004).

If I run the binary with LD_LIBRARY_PATH=/tmp, I no longer get a core dump, so I'm guessing that null pointer is the return value of getenv("LD_LIBRARY_PATH"). Also, I see this in bootloader/src/pyi_utils.c (in set_dynamic_library_path):

    orig_path = pyi_getenv(env_var);
    pyi_setenv(env_var_orig, orig_path);

Short of relying on debug symbols from libc, the best hope of making sense of a crash like this might be to build the bootloader (the run file) from source, set aside its debug symbols, and load those into gdb together with an executable built from that bootloader and a core file from that executable. In that case it's important to use the bootloader built from source rather than the one distributed with pyinstaller, both in gdb and when building the binary, unless the build ID of the distributed run happens to match the one built from source. If a stripped bootloader isn't required, it might be easier to just not strip the run file, though that might turn out to be more work than loading the debug symbols separately.
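A sketch of what that workflow might look like, written as small Python helpers that only assemble the command lines. The file names run.unstripped and run.debug are hypothetical, and I have not verified this against an actual bootloader build. It relies on objcopy's --only-keep-debug option and on gdb's -s (symbol file), -e (executable) and -c (core) options.

```python
import subprocess


def split_debug_command(unstripped, debug_file):
    """objcopy command that copies only the debug information into debug_file."""
    return ["objcopy", "--only-keep-debug", unstripped, debug_file]


def gdb_command(symbols, executable, core):
    """gdb command that takes symbols, executable and core from separate files."""
    return ["gdb", "-s", symbols, "-e", executable, "-c", core]


def run(command):
    """Run one of the assembled commands, raising if it exits with an error."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical names: an unstripped bootloader built from source,
    # the pyinstaller output built from that bootloader, and its core file.
    run(split_debug_command("run.unstripped", "run.debug"))
    run(gdb_command("run.debug", "my_binary", "core"))
```

Whether gdb then resolves the ?? frames depends on my_binary really having been built from the same bootloader that produced run.debug, which is where the build ID comparison comes in.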

It might make sense for pyinstaller to include debug symbols for each included bootloader (for all I know, it does that and I just haven't found them yet, but offhand I doubt that). If you want gdb to use debug symbols for libc, part of that might involve installing a separate package. On my Debian system that package is libc6-dbg. I briefly looked into building the pyinstaller bootloader with debug symbols, but I haven't finished deciphering waf yet.

Upvotes: 1

ks1322

Reputation: 35706

I have tried and receive only question marks from gdb.

There could be two reasons for this:

  1. You probably mismatched the core dump and the binary that generated it. Try one of these commands to find the exact path to the binary: file core or gdb -c core. By the way, it should be the path to the python binary.
  2. gdb can't find symbols and is therefore unable to show meaningful function names in the stack trace. Try installing debug symbols for python on your OS with apt-get or yum/dnf.

Upvotes: 0
