Reputation: 1321
I have a program in which the main thread creates lots of threads. It crashed, and I'm debugging the core file. The crash happened in one of the child threads. To find the cause, I need to know whether the main thread was still alive. Is there any way to find out which thread was the initial one?
Upvotes: 4
Views: 1820
Reputation: 2817
As a general approach for UNIX-based systems, the accepted answer works as expected.
On Linux (and OSes that chose a similar POSIX threads implementation strategy), identifying the main thread can be much more straightforward. Typically, the file name of a core dump contains the PID of the faulting process (e.g. core.<pid>) unless the core pattern (/proc/sys/kernel/core_pattern) was changed. Because the main thread's LWP ID equals the PID on Linux, you can reliably determine the main thread using thread find <pid>:
$ gdb executable core.24533
[...]
(gdb) thread find 24533
Thread 7 has target id 'Thread 0x7f8ae2169740 (LWP 24533)'
(gdb) thread 7
[Switching to thread 7 (Thread 0x7f8ae2169740 (LWP 24533))]
#0 0x00007f8ae1d40017 in pthread_join (threadid=140234458433280, thread_return=0x0) at pthread_join.c:90
90 lll_wait_tid (pd->tid);
(gdb) bt
#0 0x00007f8ae1d40017 in pthread_join (threadid=140234458433280, thread_return=0x0) at pthread_join.c:90
#1 0x00007f8ae1ae40f7 in __gthread_join (__value_ptr=0x0, __threadid=<optimized out>)
at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/x86_64-redhat-linux/bits/gthr-default.h:668
#2 std::thread::join (this=this@entry=0x5595aac42990) at ../../../../../libstdc++-v3/src/c++11/thread.cc:107
#3 0x00005595a9681468 in operator() (t=..., __closure=<optimized out>) at segv.cxx:31
#4 for_each<__gnu_cxx::__normal_iterator<std::thread*, std::vector<std::thread> >, ThreadPool::wait()::__lambda1> (__last=..., __first=..., __f=...)
at /usr/include/c++/4.8.2/bits/stl_algo.h:4417
#5 wait (this=0x7ffcac67d860) at segv.cxx:32
#6 main (argc=<optimized out>, argv=<optimized out>) at segv.cxx:75
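When many core files are lying around, the PID can be pulled out of the file name with a small script. The following is a minimal sketch that assumes the core.<pid> naming shown above; the helper name pid_from_core_name is hypothetical, and any custom core pattern would need a different regular expression.

```python
import os
import re


def pid_from_core_name(path):
    """Return the PID encoded in a 'core.<pid>' file name, or None.

    Assumes the 'core.<pid>' naming convention; a customized
    /proc/sys/kernel/core_pattern will not match.
    """
    match = re.fullmatch(r"core\.(\d+)", os.path.basename(path))
    return int(match.group(1)) if match else None
```

The returned number is the value to pass to thread find in GDB.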
If the file name is missing the PID, it can be recovered from the core dump itself. The PID is stored in a note section (PT_NOTE). Both NT_PRSTATUS and NT_PRPSINFO contain the PID. With multiple threads, NT_PRSTATUS exists once for each individual thread, including the main thread, and the order is unspecified; NT_PRPSINFO, on the other hand, exists only once.
The definition for Linux x86_64 (pr_pid is our field of interest):
struct elf_prpsinfo
{
    char pr_state;                  /* numeric process state */
    char pr_sname;                  /* char for pr_state */
    char pr_zomb;                   /* zombie */
    char pr_nice;                   /* nice val */
    unsigned long pr_flag;          /* flags */
    __kernel_uid_t pr_uid;
    __kernel_gid_t pr_gid;
    pid_t pr_pid, pr_ppid, pr_pgrp, pr_sid;
    /* Lots missing */
    char pr_fname[16];              /* filename of executable */
    char pr_psargs[ELF_PRARGSZ];    /* initial part of arg list */
};
eu-readelf -n (provided by elfutils) can be used to extract the PID from NT_PRPSINFO:
$ eu-readelf -n core
[...]
CORE 136 PRPSINFO
state: 2, sname: D, zomb: 0, nice: 0, flag: 0x0000000040402504
uid: 0, gid: 0, pid: 24533, ppid: 17322, pgrp: 24533, sid: 17299
^^^^^
fname: segv, psargs: ./segv 2
[...]
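If elfutils is not installed, the same field can be read with a few lines of Python. This is only a sketch under stated assumptions: a little-endian ELF64 core from Linux x86_64, where the struct above places pr_pid at byte offset 24 of the 136-byte NT_PRPSINFO descriptor (4 chars, 4 bytes of padding, an 8-byte pr_flag, then two 4-byte IDs). The function name pid_from_core is hypothetical, and cores with more than 65534 program headers (PN_XNUM) are not handled.

```python
import struct

PT_NOTE = 4          # program header type for note segments
NT_PRPSINFO = 3      # note type carrying struct elf_prpsinfo
PR_PID_OFFSET = 24   # assumption: Linux x86_64 layout of elf_prpsinfo


def _read_at(f, offset, fmt):
    """Seek to offset and unpack fmt, failing loudly on a short read."""
    f.seek(offset)
    size = struct.calcsize(fmt)
    buf = f.read(size)
    if len(buf) != size:
        raise ValueError("truncated core file")
    return struct.unpack(fmt, buf)


def _align4(n):
    """Note names and descriptors are padded to 4-byte boundaries."""
    return (n + 3) & ~3


def pid_from_core(f):
    """Return pr_pid from the NT_PRPSINFO note of a seekable binary file."""
    ident = _read_at(f, 0, "<6s")[0]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    (e_phoff,) = _read_at(f, 0x20, "<Q")
    e_phentsize, e_phnum = _read_at(f, 0x36, "<HH")

    for i in range(e_phnum):
        ph = e_phoff + i * e_phentsize
        (p_type,) = _read_at(f, ph, "<I")
        if p_type != PT_NOTE:
            continue
        (p_offset,) = _read_at(f, ph + 8, "<Q")
        (p_filesz,) = _read_at(f, ph + 32, "<Q")

        # Walk the note entries: header, padded name, padded descriptor.
        pos, end = p_offset, p_offset + p_filesz
        while pos + 12 <= end:
            namesz, descsz, ntype = _read_at(f, pos, "<III")
            name_off = pos + 12
            desc_off = name_off + _align4(namesz)
            f.seek(name_off)
            name = f.read(namesz).rstrip(b"\0")
            if name == b"CORE" and ntype == NT_PRPSINFO:
                return _read_at(f, desc_off + PR_PID_OFFSET, "<i")[0]
            pos = desc_off + _align4(descsz)

    raise ValueError("no NT_PRPSINFO note found")
```

A call such as pid_from_core(open("core", "rb")) should then yield the same PID that eu-readelf prints. Seeking within the file rather than reading it whole keeps memory use small for multi-gigabyte cores.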
Upvotes: 2
Reputation: 213385
Is there any way to find out which thread was the initial one?
When there are hundreds of threads, I use the following technique to look through them:
(gdb) shell rm gdb.txt
(gdb) set logging on # GDB output will go to gdb.txt
(gdb) thread apply all where
Now load gdb.txt into your editor or pager of choice, look for main, etc.
Upvotes: 2