Reputation: 9
I am developing a program that performs various tasks using fork(). When I start the program, everything works fine. After some time (about 1 day), however, I get flooded with <defunct> processes, more than 600 or 700 of them, even though the maximum number of forks is set to 500. This is the code:
int numforks = 0;
int maxf = 100;
// READ FROM FILE ...
while (fgets(nutt,2048,fp))
{
    fflush(stdout);
    if (!(fork()))
    {
        some_time_intensive_function();
        exit(0);
    }
    else
    {
        numforks++;
        if (numforks >= maxf)
        {
            wait(NULL);
            numforks--;
        }
    }
}
// DON'T EXIT PROGRAM TILL ALL FORKS ARE FINISHED
while(numforks>0)
{
    wait(NULL);
    numforks--;
}
// CLOSE READ FILE ...
This program keeps 500 forks open at all times, like a thread pool.
I don't really understand what <defunct> processes are, but I have heard that they are not caused by errors in the child processes (such as a SEG FAULT occurring), but rather by the parent process not waiting correctly.
I want to get rid of the <defunct> processes. Any ideas on how to solve this?
I repeat, this only happens after some time, 1-2 days.
Thank you.
Upvotes: 0
Views: 1056
Reputation: 1
(I assume you are running on Linux, or some other POSIX system like MacOSX)
Beware of orphan processes.
Read Advanced Linux Programming which has several chapters related to your issue.
You had better keep the result of fork (in some pid_t variable or field) and handle all three cases (>0: fork was successful and you are in the parent; ==0: you are in the child process; <0: fork failed!). You should also probably call waitpid(2) appropriately. In the child process it is reasonable to call exit(3) (or execve(2)...)
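The three fork cases and a matching waitpid(2) call can be sketched roughly as follows. This is only a minimal illustration: spawn_and_wait and sample_work are hypothetical names, not something from the question, and the child here simply exits with the value returned by its work function.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the real work done in the child. */
int sample_work(void)
{
    return 7; /* becomes the child's exit code */
}

/* Fork one child, run work() in it, and wait for that specific child.
 * Returns the child's exit code, or -1 on any failure. */
int spawn_and_wait(int (*work)(void))
{
    fflush(stdout);              /* avoid duplicating buffered output in the child */
    pid_t pid = fork();

    if (pid < 0) {               /* <0: fork failed, no child was created */
        perror("fork");
        return -1;
    }
    if (pid == 0) {              /* ==0: we are in the child process */
        _exit(work());           /* never fall back into the parent's code */
    }

    /* >0: we are in the parent and pid identifies the child */
    int status;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);   /* retry if interrupted by a signal */

    if (r == -1) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling spawn_and_wait(sample_work) from main would run the work in a child and hand its exit code back to the parent. The sketch uses _exit(2) in the child so that stdio buffers inherited from the parent are not flushed a second time; exit(3) is also reasonable if you flush before forking.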
Perhaps you should handle the SIGCHLD signal. Read signal(7) carefully.
(you don't show enough of your program, and an entire book is needed to explain all that)
As a rule of thumb, you don't want to have many runnable processes. On a typical laptop or desktop computer, you should not have more than a dozen runnable processes. Use top(1) or ps(1) to list your processes (notably to see how many of them you have). Perhaps use (at least during debugging) the bash ulimit builtin (it calls setrlimit(2) from inside your shell) in your terminal, e.g. ulimit -u 50 to limit the number of processes to 50.
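The same limit can also be set from inside the program with setrlimit(2). The following is a minimal sketch, assuming a system that provides RLIMIT_NPROC (Linux, the BSDs and Mac OS X do); limit_processes is a hypothetical helper name.

```c
#include <sys/resource.h>

/* Lower (or raise, up to the hard limit) the soft limit on the number of
 * processes for the current user. Returns 0 on success, -1 on failure. */
int limit_processes(rlim_t max_procs)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NPROC, &rl) != 0)
        return -1;

    /* An unprivileged process cannot raise the soft limit above the hard one. */
    if (max_procs > rl.rlim_max)
        max_procs = rl.rlim_max;

    rl.rlim_cur = max_procs;
    return setrlimit(RLIMIT_NPROC, &rl);
}
```

Calling limit_processes(50) early in main would roughly correspond to ulimit -u 50. Note that the limit counts all processes of the user, not only the children of this program, so fork will start failing once that total is reached.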
If coding in genuine C++11, you should consider using frameworks like Qt or POCO (both provide support for processes).
You should care about inter-process communication (perhaps with pipe(7)-s or socket(7)-s and some event loop, see poll(2) ...) and synchronization issues. Perhaps look into MPI or 0mq.
(you probably need to read a lot more)
Perhaps strace(1) might be helpful to debug your issues.
Don't forget to check every system call. See syscalls(2) & errno(3).
Upvotes: 1
Reputation: 29017
I think you have two problems:
Firstly, wait can return for reasons other than a child process having terminated (for example, it returns -1 when it is interrupted by a signal), and if it does, it will leave a defunct process. I think you need to pass in a non-null pointer and inspect the returned wait status. Only decrement numforks if appropriate.
Secondly, numforks doesn't (effectively) limit the total number of child processes. If the parent process launches two processes, they will inherit numforks values of 0 and 1 respectively. If those child processes carry on around the loop instead of exiting, they will go on to launch 500 and 499 more subprocesses of their own.
I think you need exit(0) (or break) after your time_consuming_process().
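As a rough sketch of both points, the loop below only decrements numforks when a child has really been reaped, and the child always exits instead of continuing the loop. reap_one_child and run_pool are hypothetical names, and the child's work is replaced by an immediate _exit(0) as a placeholder.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Wait until one child has really terminated.
 * Returns 1 if a child was reaped, 0 if there are no children left,
 * and -1 on any other error. */
int reap_one_child(void)
{
    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, 0);
        if (pid > 0) {
            if (WIFSIGNALED(status))
                fprintf(stderr, "child %ld killed by signal %d\n",
                        (long)pid, WTERMSIG(status));
            return 1;
        }
        if (errno == EINTR)
            continue;                     /* interrupted: nothing was reaped */
        return errno == ECHILD ? 0 : -1;
    }
}

/* Run ntasks children with at most maxf alive at once.
 * Returns the number of children that were reaped. */
int run_pool(int ntasks, int maxf)
{
    int numforks = 0, reaped = 0;

    for (int i = 0; i < ntasks; i++) {
        if (numforks >= maxf) {
            if (reap_one_child() != 1)
                break;                    /* do not exceed maxf on failure */
            numforks--;
            reaped++;
        }
        fflush(stdout);
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            break;
        }
        if (pid == 0)
            _exit(0);                     /* placeholder for the real work */
        numforks++;
    }

    /* Don't return until every remaining child has been reaped. */
    while (numforks > 0 && reap_one_child() == 1) {
        numforks--;
        reaped++;
    }
    return reaped;
}
```

With this structure, a wait call that fails or is interrupted never causes the counter to drift away from the real number of live children, which is what would otherwise let defunct processes accumulate over time.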
Upvotes: 2