user1289

Reputation: 1321

Find the main thread while debugging core file

I have a program in which the main thread creates many threads. It crashed, and I'm debugging the core file. The crash happened in one of the child threads. To find the cause, I need to know whether the main thread was still alive. Is there any way to find out which thread was the initial one?

Upvotes: 4

Views: 1820

Answers (2)

horstr

Reputation: 2817

As a general approach for UNIX-based systems, the accepted answer works as expected.

On Linux (and OSes that chose a similar POSIX threads implementation strategy), identifying the main thread can be much more straightforward, because the main thread's LWP ID is the same as the process ID. Typically, the file name of a core dump contains the PID of the faulting process (e.g. core.<pid>) unless the core pattern (/proc/sys/kernel/core_pattern) was changed. With that, you can reliably determine the main thread using thread find <pid>:

$ gdb executable core.24533
[...]
(gdb) thread find 24533
Thread 7 has target id 'Thread 0x7f8ae2169740 (LWP 24533)'
(gdb) thread 7
[Switching to thread 7 (Thread 0x7f8ae2169740 (LWP 24533))]
#0  0x00007f8ae1d40017 in pthread_join (threadid=140234458433280, thread_return=0x0) at pthread_join.c:90
90      lll_wait_tid (pd->tid);
(gdb) bt
#0  0x00007f8ae1d40017 in pthread_join (threadid=140234458433280, thread_return=0x0) at pthread_join.c:90
#1  0x00007f8ae1ae40f7 in __gthread_join (__value_ptr=0x0, __threadid=<optimized out>)
    at /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/x86_64-redhat-linux/bits/gthr-default.h:668
#2  std::thread::join (this=this@entry=0x5595aac42990) at ../../../../../libstdc++-v3/src/c++11/thread.cc:107
#3  0x00005595a9681468 in operator() (t=..., __closure=<optimized out>) at segv.cxx:31
#4  for_each<__gnu_cxx::__normal_iterator<std::thread*, std::vector<std::thread> >, ThreadPool::wait()::__lambda1> (__last=..., __first=..., __f=...)
    at /usr/include/c++/4.8.2/bits/stl_algo.h:4417
#5  wait (this=0x7ffcac67d860) at segv.cxx:32
#6  main (argc=<optimized out>, argv=<optimized out>) at segv.cxx:75

If the file name is missing the PID, it can be recovered from the core dump itself. The PID is stored in a note segment (PT_NOTE). Both NT_PRSTATUS and NT_PRPSINFO contain the PID. With multiple threads, an NT_PRSTATUS note exists for each individual thread, including the main thread, and their order is unspecified; NT_PRPSINFO, on the other hand, exists only once.

The definition on Linux x86_64 (pr_pid is the field of interest):

struct elf_prpsinfo
{
        char    pr_state;       /* numeric process state */
        char    pr_sname;       /* char for pr_state */
        char    pr_zomb;        /* zombie */
        char    pr_nice;        /* nice val */
        unsigned long pr_flag;  /* flags */
        __kernel_uid_t  pr_uid;
        __kernel_gid_t  pr_gid;
        pid_t   pr_pid, pr_ppid, pr_pgrp, pr_sid;
        /* Lots missing */
        char    pr_fname[16];   /* filename of executable */
        char    pr_psargs[ELF_PRARGSZ]; /* initial part of arg list */
};

eu-readelf -n (provided by elfutils) can be used to extract the PID from NT_PRPSINFO:

$ eu-readelf -n core
[...]
  CORE                 136  PRPSINFO
    state: 2, sname: D, zomb: 0, nice: 0, flag: 0x0000000040402504
    uid: 0, gid: 0, pid: 24533, ppid: 17322, pgrp: 24533, sid: 17299
                         ^^^^^
    fname: segv, psargs: ./segv 2 
[...]
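If elfutils is not at hand, the same lookup can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not a general ELF reader: it assumes a 64-bit little-endian core, the x86_64 elf_prpsinfo layout shown above (which puts pr_pid at byte offset 24 of the 136-byte descriptor), and fewer than 65535 program headers (the PN_XNUM extension is not handled). The names pid_from_core, PR_PID_OFFSET and _align4 are made up for this illustration.

```python
import struct

PT_NOTE = 4          # program header type of a note segment
NT_PRPSINFO = 3      # note type of the process-wide info record
PR_PID_OFFSET = 24   # assumed offset of pr_pid in the x86_64 elf_prpsinfo


def _align4(n):
    """Round n up to the next multiple of 4 (note fields are 4-byte aligned)."""
    return (n + 3) & ~3


def pid_from_core(data):
    """Return pr_pid from the first CORE/NT_PRPSINFO note, or None if absent."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a 64-bit little-endian ELF file")
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from("<I", data, base)
        if p_type != PT_NOTE:
            continue
        (p_offset,) = struct.unpack_from("<Q", data, base + 8)
        (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
        pos, end = p_offset, p_offset + p_filesz
        # Each note: namesz, descsz, type, then padded name and padded desc.
        while pos + 12 <= end:
            namesz, descsz, ntype = struct.unpack_from("<III", data, pos)
            name_off = pos + 12
            desc_off = name_off + _align4(namesz)
            name = data[name_off:name_off + namesz].rstrip(b"\0")
            if name == b"CORE" and ntype == NT_PRPSINFO:
                return struct.unpack_from("<i", data, desc_off + PR_PID_OFFSET)[0]
            pos = desc_off + _align4(descsz)
    return None


# Example use (reads the whole file; the path "core" is a placeholder):
#     with open("core", "rb") as f:
#         print(pid_from_core(f.read()))
```

The returned value can then be passed to thread find in GDB, exactly as with a PID taken from the file name.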

Upvotes: 2

Employed Russian

Reputation: 213385

Is there any way to find out which thread was the initial one?

When there are hundreds of threads, I use the following technique to look through them:

(gdb) shell rm gdb.txt
(gdb) set logging on   # GDB output will go to gdb.txt
(gdb) thread apply all where

Now load gdb.txt into your editor or pager of choice, look for main, etc.
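If scrolling through the log by hand is too tedious, the search can be scripted. The following is a rough sketch, not part of GDB: it assumes the usual "Thread N (Thread 0x... (LWP ...)):" header lines and "#K ... function (args)" frame lines that thread apply all where prints, and the helper name threads_with_frame is made up for this example. Note that a main thread that has already left main (e.g. via pthread_exit) will not show a main frame, so an empty result does not by itself prove anything more than that.

```python
import re

# Header printed before each thread's backtrace, e.g.
# "Thread 7 (Thread 0x7f8ae2169740 (LWP 24533)):"
THREAD_RE = re.compile(r"^Thread (\d+) \(.*\):?\s*$")
# Frame line, with or without a leading "0x... in " address part.
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-fA-F]+ in )?(\S+) \(")


def threads_with_frame(log_text, function="main"):
    """Return GDB thread numbers whose backtrace contains a frame in `function`."""
    found = []
    current = None
    for line in log_text.splitlines():
        header = THREAD_RE.match(line)
        if header:
            current = int(header.group(1))
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None and frame.group(1) == function:
            if current not in found:
                found.append(current)
    return found


# Example use, assuming the log was written to gdb.txt as above:
#     with open("gdb.txt") as f:
#         print(threads_with_frame(f.read()))
```

The simple frame pattern only recognises function names without spaces, which is enough for main but will skip some templated C++ frames.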

Upvotes: 2
