Reputation: 706
I have a problem with POSIX processes that I can't get around.
I have a process which forks several children (the process tree can be complex, not just one level deep). It also keeps track of the active children's PIDs. At some point the parent receives a signal (SIGINT, let's say).
In the signal handler for SIGINT, it iterates over the list of child processes and sends the same signal to them in order to prevent zombies. Now, the problem is that the parent and the children have the same signal handler, as it's installed before forking. Here is the pseudocode:
    signal_handler( signal )
        foreach child in children
            kill( child, signal )
            waitpid( child, status )

        // Releasing system resources, etc.
        clean_up()

        // Restore signal handlers.
        set_signal_handlers_to_default()

        // Send back the expected "I exited after XY signal" to the parent by
        // executing the default signal handler again.
        kill( getpid(), signal )
With this implementation the execution stops on the waitpid. If I remove the waitpid, the children keep running.
My guess is that until a signal handler has returned, the signals sent from it are not dispatched to the children. But then why do the children keep running when I omit the waitpid?
Thanks a lot in advance!
Upvotes: 2
Views: 7430
Reputation: 16898
What you describe should work and indeed it does, with the following testcase:
    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>   /* waitpid */
    #include <unistd.h>

    #define NCHILDREN 3

    /* PIDs of the children forked by this process. */
    pid_t child [NCHILDREN];
    struct sigaction sa, old;

    static void
    handler (int ignore)
    {
      int i;

      /* Kill the children. */
      for (i = 0; i < NCHILDREN; ++i)
        {
          if (child [i] > 0)
            {
              kill (child [i], SIGUSR1);
              waitpid (child [i], 0, 0);
            }
        }

      /* Restore the default handler. */
      sigaction (SIGUSR1, &old, 0);

      /* Kill self. */
      kill (getpid (), SIGUSR1);
    }

    int
    main (void)
    {
      int i;

      /* Install the signal handler. */
      sa.sa_handler = handler;
      sigemptyset (&sa.sa_mask);
      sa.sa_flags = 0;
      sigaction (SIGUSR1, &sa, &old);

      /* Spawn the children. */
      for (i = 0; i < NCHILDREN; ++i)
        {
          if ((child [i] = fork ()) == 0)
            {
              /* Each of the children: clear the array, wait for a signal
                 and exit. */
              while (i >= 0)
                child [i--] = -1;
              pause ();
              return 0;
            }
        }

      /* Wait to be interrupted by a signal. */
      pause ();
      return 0;
    }
If you see the parent hanging in waitpid, it means the child has not exited. Try attaching a debugger to see where the child is blocked or, easier, run the program under strace(1). How do you clean up your pid array? Make sure the children are not trying to call waitpid with a pid parameter that is <= 0. Make sure the children are not blocking or ignoring the signal.
Upvotes: 6