Niranjan

Reputation: 111

Debugging utilities for Linux process hang issues?

I have a daemon process that handles configuration management. All other processes must interact with this daemon to function. When I execute a large action, after a few hours the daemon becomes unresponsive for 2 to 3 hours, and then it works normally again.

Which debugging utilities help with Linux process hangs, and how can I find out at what point the process hangs?

Upvotes: 8

Views: 22713

Answers (3)

M-Razavi

Reputation: 3467

There are a number of different ways to detect that the process has hung:

  1. Listening on a UNIX domain socket, to handle status requests. An external application can then inquire as to whether the application is still ok. If it gets no response within some timeout period, then it can be assumed that the application being queried has deadlocked or is dead.

  2. Periodically touching a file with a preselected path. An external application can look at the timestamp of the file, and if it is stale, it can assume that the application is dead or deadlocked.

  3. You can call the alarm syscall repeatedly and let the resulting signal terminate the process (use sigaction accordingly). As long as the program keeps re-arming the alarm before it expires (i.e. as long as it is making progress), it keeps running. Once it stops doing so, the signal fires.
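Options 2 and 3 can be sketched roughly as follows. This is only a minimal illustration in Python, not the daemon's actual code: the function names, the heartbeat path and the timeout values are all made up for the example.

```python
import os
import signal
import time
from pathlib import Path


def touch_heartbeat(path):
    """Daemon side (option 2): call this from the main loop to show progress."""
    Path(path).touch()


def is_stale(path, max_age_seconds, now=None):
    """Monitor side (option 2): True if the heartbeat file is missing or too old."""
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return True
    if now is None:
        now = time.time()
    return now - mtime > max_age_seconds


def kick_watchdog(timeout_seconds):
    """Option 3: (re)arm SIGALRM; passing 0 cancels a pending alarm.

    With the default SIGALRM disposition the process is terminated if this is
    not called again within timeout_seconds. Returns the seconds that were
    left on the previously armed alarm (0 if none was pending).
    """
    return signal.alarm(timeout_seconds)
```

The daemon would call touch_heartbeat (or kick_watchdog) once per iteration of its main loop, and a separate monitor would poll is_stale with a threshold comfortably larger than one iteration.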

You can seamlessly restart your process as it dies with fork and waitpid as described in this answer. It does not cost any significant resources, since the OS will share the memory pages.
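A rough sketch of such a fork/waitpid restart loop, again in Python and under assumptions: run_supervised, worker and max_restarts are hypothetical names, and a real supervisor would probably add a back-off delay and logging.

```python
import os


def run_supervised(worker, max_restarts=3):
    """Run worker() in a forked child and restart it when it dies abnormally.

    Returns the list of observed exit codes (negative values mean the child
    was killed by that signal number).
    """
    exit_codes = []
    for _ in range(max_restarts + 1):
        pid = os.fork()
        if pid == 0:
            # Child: run the worker, then leave without running parent cleanup.
            status = 1
            try:
                worker()
                status = 0
            finally:
                os._exit(status)
        # Parent: block until this child terminates.
        _, wait_status = os.waitpid(pid, 0)
        if os.WIFSIGNALED(wait_status):
            code = -os.WTERMSIG(wait_status)
        else:
            code = os.WEXITSTATUS(wait_status)
        exit_codes.append(code)
        if code == 0:
            break  # clean exit, no restart needed
    return exit_codes
```

Because fork uses copy-on-write pages, the parent and child share memory until one of them writes to it, which is why the supervisor itself stays cheap.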

Upvotes: 1

Peter Tillemans

Reputation: 35341

  • strace can show the last system calls and their result
  • lsof can show open files
  • the system log can be very effective when log messages are written to track progress. It lets you box the problem into smaller areas. Also correlate your log messages with messages from other systems; this often turns up interesting results
  • wireshark, if the apps use sockets, to make the chatter on the wire visible
  • ps ax and top can show whether your app is in a busy loop (running all the time), sleeping, or blocked in I/O, and how much CPU and memory it is using

Each of these may give a little bit of information which together build up a picture of the issue.
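The state column that ps and top display comes from /proc/<pid>/stat on Linux, so a script can sample it while the daemon is hung. The helper names below are hypothetical, and the sketch assumes a Linux /proc filesystem.

```python
# One-letter process states as reported in /proc/<pid>/stat (most common ones).
STATE_NAMES = {
    "R": "running",
    "S": "sleeping (interruptible)",
    "D": "uninterruptible sleep (usually blocked in I/O)",
    "T": "stopped",
    "Z": "zombie",
}


def parse_state(stat_line):
    """Extract the state letter from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain spaces
    or ')', so split after the last ')' rather than on whitespace.
    """
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def read_state(pid):
    """Read and parse the current state of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_state(f.read())
```

A daemon that sits in D for hours points towards I/O, one in S towards waiting on a lock, socket or timer, and one in R towards a busy loop.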

When using gdb, it might be useful to trigger a core dump while the app is blocked. Then you have a static snapshot that you can analyze with post-mortem debugging at your leisure. You can have these dumps triggered by a script. That way you quickly build up a set of snapshots that can be used to test your theories.
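One way to script such snapshots is the gcore utility that ships with gdb, which writes a core file for a running process without killing it. The sketch below assumes gcore is installed and that you are allowed to trace the target; the function names, output directory and interval are made up.

```python
import subprocess
import time


def gcore_command(pid, prefix):
    """Build the gcore invocation; gcore writes the dump to '<prefix>.<pid>'."""
    return ["gcore", "-o", prefix, str(pid)]


def take_snapshots(pid, out_dir, count=3, interval_seconds=60):
    """Dump `count` core files of a live process, spaced interval_seconds apart."""
    for i in range(count):
        prefix = f"{out_dir}/core-{i}"
        subprocess.run(gcore_command(pid, prefix), check=True)
        if i < count - 1:
            time.sleep(interval_seconds)
```

Each resulting file can then be opened later with gdb together with the executable for post-mortem inspection.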

Upvotes: 11

Yann Ramin

Reputation: 33187

One option is to use gdb and its attach command to attach to the running process while it is hung. You will need to load a file containing the symbols of the executable in question (using gdb's file command).
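The same attach can be done non-interactively, which is handy when the hang happens overnight. This sketch builds a batch-mode gdb command that loads symbols, attaches, prints a backtrace of every thread and detaches again; the function names and the executable path are assumptions, and attaching typically requires running as the same user as the daemon or as root.

```python
import subprocess


def gdb_backtrace_command(pid, executable):
    """Build a batch-mode gdb invocation that dumps all thread backtraces."""
    return [
        "gdb", "-batch",
        "-ex", f"file {executable}",   # load symbols for the executable
        "-ex", f"attach {pid}",        # attach to the hung process
        "-ex", "thread apply all bt",  # backtrace of every thread
        "-ex", "detach",               # let the process continue afterwards
    ]


def dump_backtraces(pid, executable):
    """Run gdb and return its output as text."""
    result = subprocess.run(
        gdb_backtrace_command(pid, executable),
        capture_output=True, text=True,
    )
    return result.stdout
```

The backtraces show which function each thread is sitting in, which is usually the quickest answer to "at what point does it hang".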

Upvotes: 1
