Reputation: 7864
In Java, debugging a hung application is easy: you can take a memory dump of the application and use the Eclipse JVM dump analyser to see the status of the threads and where each thread was blocked.
Does something like this exist for C++?
Upvotes: 4
Views: 18488
Reputation: 1014
We can use the gdb commands below to debug a deadlock.
Attach to the running process that is hung or deadlocked with:
gdb -p <PID>
Once you have attached to the process, you can list all of its threads (LWPs) with:
(gdb) info threads
Id Target Id Frame
16 Thread 0xfff06111f0 (LWP 2791) "abc.d" 0x000000fff0f0104c in select () from /lib64/libc.so.6
15 Thread 0xffefdf01f0 (LWP 2792) "abc.d" 0x000000fff0f0104c in select () from /lib64/libc.so.6
14 Thread 0xffef5bb1f0 (LWP 2793) "abc.d" 0x000000fff26feb4c in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
13 Thread 0xffeed351f0 (LWP 2794) "abc.d" 0x000000fff2703924 in nanosleep () from /lib64/libpthread.so.0
12 Thread 0xffee5351f0 (LWP 2795) "abc.d" 0x000000fff26fe76c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
11 Thread 0xffec8a71f0 (LWP 2796) "abc.d" 0x000000fff26fe76c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
10 Thread 0xffd7cd11f0 (LWP 2797) "abc.d" 0x000000fff0f0104c in select () from /lib64/libc.so.6
9 Thread 0xffd74d11f0 (LWP 2798) "abc.d" 0x000000fff0f0104c in select () from /lib64/libc.so.6
8 Thread 0xffd6cd11f0 (LWP 2801) "abc.d" 0x000000fff27022f4 in __lll_lock_wait () from /lib64/libpthread.so.0
7 Thread 0xffd64d11f0 (LWP 2802) "abc.d" 0x000000fff0f0104c in select () from /lib64/libc.so.6
6 Thread 0xffd5cd11f0 (LWP 2803) "abc.d" 0x000000fff0f0104c in select () from /lib64/libc.so.6
5 Thread 0xffd54d11f0 (LWP 2804) "abc.d" 0x000000fff0f0104c in select () from /lib64/libc.so.6
4 Thread 0xffd4cd11f0 (LWP 2805) "abc.d" 0x000000fff0f0104c in select () from /lib64/libc.so.6
3 Thread 0xffc7fff1f0 (LWP 2928) "abc.d" 0x000000fff0f0104c in select () from /lib64/libc.so.6
2 Thread 0xffc77ff1f0 (LWP 2929) "abc.d" 0x000000fff0f0104c in select () from /lib64/libc.so.6
1 Thread 0xfff0a62000 (LWP 2744) "abc.d" 0x000000fff0f19b9c in __lll_lock_wait_private () from /lib64/libc.so.6
Threads 1 and 8 are both waiting on a lock (in __lll_lock_wait_private and __lll_lock_wait respectively). Switch to each of these threads in turn and print its backtrace:
(gdb) thread 1
(gdb) bt
The output of the above command will be as below:
(gdb) thread 1
[Switching to thread 1 (Thread 0xfff0a62000 (LWP 2744))]
0 0x000000fff0f19b9c in __lll_lock_wait_private () from /lib64/libc.so.6
(gdb) bt
0 0x000000fff0f19b9c in __lll_lock_wait_private () from
/lib64/libc.so.6
1 0x000000fff0ea3238 in malloc () from /lib64/libc.so.6
2 0x000000fff115df0c in operator new(unsigned long) () from
/lib64/libstdc++.so.6
3 0x000000fff11ceddc in std::string::_Rep::_S_create(unsigned long,
unsigned long, std::allocator<char> const&) () from
/lib64/libstdc++.so.6
4 0x000000fff11d165c in char* std::string::_S_construct<char
const*>(char const*, char const*, std::allocator<char> const&,
std::forward_iterator_tag) () from /lib64/libstdc++.so.6
5 0x000000fff11d1760 in std::basic_string<char,
std::char_traits<char>, std::allocator<char> >::basic_string(char
const*, std::allocator<char> const&) () from /lib64/libstdc++.so.6
6 0x000000fff1eeac1c in getTime() () from
/usr/sbin/dir/sharedobj/liblibLite.so
7 0x000000fff1eeb18c in Logging::logBegin() () from
/usr/sbin/dir/sharedobj/liblibLite.so
8 0x000000fff1f324f8 in sigsegv_handler(int, siginfo_t*, void*) ()
from /usr/sbin/dir/sharedobj/liblibLite.so
9 signal handler called
10 0x000000fff0e9f530 in malloc_consolidate () from /lib64/libc.so.6
11 0x000000fff0ea0160 in _int_free () from /lib64/libc.so.6
12 0x000000fff115b184 in operator delete(void*) () from
/lib64/libstdc++.so.6
13 0x000000fff115b1f4 in operator delete[](void*) () from
/lib64/libstdc++.so.6
14 0x000000fff20cfd60 in pstream::~pstream() () from
/usr/sbin/dir/sharedobj/libconnV2.so
15 0x000000fff208ffd8 in ifaceSocket::dispatchMsg(pstream&) () from
/usr/sbin/dir/sharedobj/libsockIf.so
16 0x000000fff207d5a4 in
socketInterface::socket_callback(ConnectionEvent, char*, int) () from
/usr/sbin/dir/sharedobj/libsockIf.so
17 0x000000fff208f43c in ifaceSocket::Callback(ConnectionEvent, char*,
int)
() from /usr/sbin/dir/sharedobj/libsockIf.so
18 0x000000fff20c4674 in ConnectionOS::ProcessReadEvent() () from
/usr/sbin/dir/sharedobj/libconnV2.so
19 0x000000fff20cc808 in ConnectionOSManager::ProcessConns(fd_set*,
fd_set*)
() from /usr/sbin/dir/sharedobj/libconnV2.so
20 0x000000fff20cf3bc in SocketsManager::ProcessFds(bool) () from
/usr/sbin/dir/sharedobj/libconnV2.so
21 0x000000fff1e54aa8 in EventReactorBase::IO() () from
/usr/sbin/dir/sharedobj/libthreadlib.so
22 0x000000fff1e5406c in EventReactorBase::React() () from
/usr/sbin/dir/sharedobj/libthreadlib.so
23 0x000000fff1e50508 in Task::Run() () from
/usr/sbin/dir/sharedobj/libthreadlib.so
24 0x000000fff1e50584 in startTask(void*) () from
/usr/sbin/dir/sharedobj/libthreadlib.so
25 0x00000000104a421c in TaskMgr::Start() ()
26 0x00000000100ddddc in main ()
(gdb) info reg
From the r8 field, get the very first address:
(gdb) print *((int*)(0x0000000019ff3d30))
$1 = 2 // Locks
(gdb) print *((int*)(0x0000000019ff3d30)+1)
$2 = 0 // Count
(gdb) print *((int*)(0x0000000019ff3d30)+2)
$3 = 2744 // Owner PID
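The backtrace above appears to show a signal handler (sigsegv_handler) calling logging code that allocates a std::string while the interrupted thread was already inside free(), so the thread seems to be waiting for an allocator lock that it holds itself. Here is a minimal sketch of a handler that avoids allocating, assuming a POSIX system. The names format_signal_message, safe_sigsegv_handler and install_safe_handler are hypothetical and are not taken from the application above.

```cpp
#include <csignal>
#include <cstddef>
#include <unistd.h>

// Hypothetical helper: builds "caught signal <n>\n" into buf without using
// malloc, stdio or std::string. Returns the number of bytes written, or 0
// if buf is too small.
std::size_t format_signal_message(int sig, char* buf, std::size_t cap) {
    static const char prefix[] = "caught signal ";
    const std::size_t prefix_len = sizeof(prefix) - 1;
    char digits[12];
    std::size_t n = 0;
    unsigned v = sig < 0 ? 0u : static_cast<unsigned>(sig);
    do {  // collect decimal digits in reverse order
        digits[n++] = static_cast<char>('0' + v % 10);
        v /= 10;
    } while (v != 0);
    if (prefix_len + n + 1 > cap) return 0;
    std::size_t out = 0;
    for (std::size_t i = 0; i < prefix_len; ++i) buf[out++] = prefix[i];
    while (n > 0) buf[out++] = digits[--n];
    buf[out++] = '\n';
    return out;
}

// Hypothetical handler: only calls functions that POSIX lists as
// async-signal-safe (write and _exit), so it cannot block on the heap lock.
extern "C" void safe_sigsegv_handler(int sig) {
    char buf[64];
    const std::size_t len = format_signal_message(sig, buf, sizeof(buf));
    ssize_t ignored = write(STDERR_FILENO, buf, len);
    (void)ignored;
    _exit(128 + sig);
}

// Installs the handler for SIGSEGV; returns true on success.
bool install_safe_handler() {
    struct sigaction sa = {};
    sa.sa_handler = safe_sigsegv_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}
```

Calling install_safe_handler() once at startup would replace a handler that logs through allocating code. If richer logging is needed, one option is to have the handler only record the signal number and leave the formatting to a normal thread.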
Upvotes: 5
Reputation: 439
For native Windows applications, WinDbg is my tool of choice. If possible I debug a deadlocked process live; failing that, a full process memory dump will usually get you there.
My approach is to draw a wait graph documenting the relationships between threads and resources. I usually start by running the command !locks to identify which threads are holding any critical sections in the deadlocked process.
I then start drawing the wait graph by selecting the critical section with the highest contention count (if there is a deadlock there will be a cycle in the graph, so it doesn't really matter where you start). Find the owning thread and select it in the debugger. (The ~ command lets you associate thread ids with the thread numbers used by the debugger; use ~<threadnumber>s to select the thread and kbn to display its stack.) If the process is deadlocked, chances are the thread will be performing some sort of blocking operation, e.g. look for calls to RtlEnterCriticalSection or WaitForSingleObject et al. In a deadlock situation these calls usually let you identify another resource that is being waited for. Add this information to the wait graph and continue until you get back to where you started.
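The wait graph can also be written down as data once the owners have been read out of the debugger. The following is a minimal sketch, assuming each blocked thread waits on exactly one resource with a single owner (a thread in WaitForMultipleObjects would need more than one edge). WaitGraph and find_cycle are hypothetical names used for illustration only.

```cpp
#include <map>
#include <set>
#include <vector>

// Each entry means "thread <key> is blocked on a resource owned by thread <value>".
// Threads that are not blocked have no entry.
using WaitGraph = std::map<int, int>;

// Follows the "waits for" edges from start. Returns the threads that form a
// cycle, or an empty vector if the chain ends at a thread that is not blocked.
std::vector<int> find_cycle(const WaitGraph& graph, int start) {
    std::vector<int> path;
    std::set<int> seen;
    int current = start;
    while (seen.insert(current).second) {
        path.push_back(current);
        auto it = graph.find(current);
        if (it == graph.end()) return {};  // chain ends at a runnable thread
        current = it->second;
    }
    // current was visited before: keep only the part of the path from its
    // first occurrence onwards, which is the cycle itself.
    std::vector<int> cycle;
    bool in_cycle = false;
    for (int t : path) {
        if (t == current) in_cycle = true;
        if (in_cycle) cycle.push_back(t);
    }
    return cycle;
}
```

For example, with edges 1 -> 8, 8 -> 3 and 3 -> 1, starting from thread 1 returns the three threads taking part in the deadlock. Starting from a thread that merely waits on one of them returns only the threads inside the cycle.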
If your wait graph crosses process boundaries you might find you need to find who owns a kernel object in another process (this is why I debug live if I can). The sysinternals Process Explorer tool is useful for this purpose.
Once you have identified the participants in a deadlock, you need to put your thinking cap on to figure out where to go next. This could mean changing the order of resource acquisition (as someone has pointed out), but there isn't really a general method: you will need extra information about the design of the application to understand how to remove the cyclic dependency in the wait graph.
There are circumstances where a cycle may not be the cause of the problem. For example, your system may be waiting for user input that will never come (hands up anyone who has seen a call to MessageBox in a process running as a service).
There is of course more to it than this, but I hope this sets you off in the right direction.
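When the fix does turn out to be a matter of acquisition order, standard C++ can take two mutexes together without the caller choosing an order. This is a minimal sketch, assuming the two resources are plain std::mutex objects. Account and transfer are hypothetical names, not part of any application discussed here.

```cpp
#include <mutex>

// Hypothetical resource guarded by its own mutex.
struct Account {
    std::mutex m;
    long balance = 0;
};

// std::lock acquires both mutexes using a deadlock-avoidance algorithm, so
// transfer(a, b) and transfer(b, a) running on different threads cannot end
// up each holding one mutex while waiting for the other.
void transfer(Account& from, Account& to, long amount) {
    if (&from == &to) return;  // locking the same std::mutex twice is undefined behaviour
    std::lock(from.m, to.m);
    // adopt_lock: the guards take over the already-held locks and release them on scope exit.
    std::lock_guard<std::mutex> guard_from(from.m, std::adopt_lock);
    std::lock_guard<std::mutex> guard_to(to.m, std::adopt_lock);
    from.balance -= amount;
    to.balance += amount;
}
```

This only helps when both locks are taken at the same point in the code. If the second lock is acquired deep inside a callback, the design question raised above still has to be answered first.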
Upvotes: 2
Reputation: 76531
The magic invocation in gdb is:
thread apply all bt
That runs the bt (backtrace) command for all threads. Unless you have completely stripped your program, you should be able to see the names of each function.
This works both for live and post-mortem (i.e. running gdb against a core) debugging.
Upvotes: 5
Reputation: 13769
Of course, strategically placed cout statements (or other output alternatives) are always an option but often far from ideal.
If compiling with g++, compile with -g and use gdb. You can attach to a running process (and the source code) or simply run the program in the debugger to start with. Then look at the stack.
In Windows, just pause execution of your program and look at the stack.
Upvotes: 0
Reputation: 41374
You can do the exact same thing with C++; force a core dump and look into it after.
Or, if you're using MSVC, you can simply attach the debugger to the application while it's running. Hit "break all" and poke around through the threads.
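One way to force a core dump at the moment of the hang, rather than by hand, is a watchdog that aborts the process when a worker stops reporting progress. This is a minimal sketch, assuming core dumps are enabled on the system (for example with ulimit -c unlimited on Linux). Heartbeat and abort_if_stalled are hypothetical names, not an existing library API.

```cpp
#include <atomic>
#include <chrono>
#include <cstdlib>

using Clock = std::chrono::steady_clock;

// Hypothetical helper: worker threads call beat() regularly; a watchdog
// thread polls stalled() to see whether progress has stopped.
class Heartbeat {
public:
    Heartbeat() : last_(Clock::now().time_since_epoch().count()) {}

    // Records "now" as the time of the most recent progress.
    void beat() { last_.store(Clock::now().time_since_epoch().count()); }

    // True if the last beat is older than limit, as seen at time now.
    bool stalled(Clock::time_point now, Clock::duration limit) const {
        const Clock::time_point last{Clock::duration{last_.load()}};
        return now - last > limit;
    }

private:
    std::atomic<Clock::rep> last_;
};

// std::abort() raises SIGABRT, which by default terminates the process and
// produces a core dump when the system is configured to write one.
void abort_if_stalled(const Heartbeat& hb, Clock::duration limit) {
    if (hb.stalled(Clock::now(), limit)) std::abort();
}
```

A watchdog thread would call abort_if_stalled in a loop with a sleep between checks, and the resulting core can then be opened in the debugger to inspect every thread's stack.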
Upvotes: 5
Reputation: 74654
Upvotes: -1
Reputation: 7864
I have not done this, but I think you can use gdb to generate a core dump of your application at the moment it is hung.
You can then debug this core with gdb itself and see which threads are blocked where.
The above is possible on Linux platforms. I am not sure whether Cygwin on Windows can be used for the same purpose.
Upvotes: 0