Reputation: 639
I have a Perl app that's been running largely untroubled on a RH system for a few years. In one place, I have to run a system command that can take take many minutes to complete, so I do this in a child process. The overall structure is like this:
$SIG{CHLD} = 'IGNORE'; # Ignore dead children, to avoid zombie processes
my $child = fork();
if ($child) { # Parent; return OK
$self->status_ok();
} else { # Child; run system command
# do a bunch of data retrieval, etc.
my $output;
my @command = # generate system command here
use IPC::System::Simple 'capture';
eval { $output = capture(@command); };
$self->log->error("Error running @command: $@") if $@;
# success: log $output, carry on
}
We recently changed some of our infrastructure, although not in ways that I expected would have any influence on this. (Still running on RH, still using nginx, etc.) However, now we find that almost every instance of running this code fails, logging 'Error running {command}: failed to start: "No child processes" at /path/to/code.pl'.
I've looked around and can't figure out what the right solution is for this. There was a suggestion to change $SIG{CHLD}
from 'IGNORE' to 'DEFAULT', but then I have to worry about zombie processes.
What is causing the "No child processes" error, and how do we fix this?
Upvotes: 4
Views: 1288
Reputation: 385565
There was a suggestion to change $SIG{CHLD} from 'IGNORE' to 'DEFAULT', but then I have to worry about zombie processes.
This isn't true.
A zombie process is a process that has ended, but hasn't been reaped by its parent yet. A parent reaps its children using wait
(2), waitpid
(2) or similar. capture
waits for its child to end, so it doesn't leave any zombie behind.
In fact, the error you are getting is from waitpid
. capture
is waiting for the child to end to reap it and collect its error code, but the you instructed the OS to clean up the child as soon as it completes, leaving waitpid
with no child to reap and no error code to collect.
To fix this problem, simply place local $SIG{CHLD} = 'DEFAULT';
before the call to capture
.
Upvotes: 2