Reputation: 1399
I have a web page that sends a request to a CGI script, which has to tar and compress huge directories that might be over 50G. In my CGI script I fork a process to do the tarring, while the parent sends a response message back to the web page. However, the parent (even in the absence of waitpid($pid, 0)) waits for the tar process to finish before sending the message, which is then displayed via an alert box. Is there a way to send the message right away, so the user knows the tarring has been initiated, while the process continues in the background? I don't want to use exec() because I need to catch any errors while creating the tarball.
Also, will there be any memory issues while I try to tar such big directories?
$SIG{CHLD} = 'IGNORE';
$|++;                          # unbuffer STDOUT
$pid = fork();
if ($pid) {
    sendResponse($data);
    exit;
}
elsif ($pid == 0) {
    setsid();                  # needs: use POSIX qw(setsid);
    $gpid = fork();
    if (! $gpid) {
        qx(tar up big directories over 50G...);
        exit;                  # grandchild must not fall through
    }
    exit;                      # middle child exits, orphaning the tar job
}
Upvotes: 1
Views: 1302
Reputation: 161
Your initial "status" response to the client is likely being buffered somewhere. If the CGI executor is designed properly, an unbuffered output stream or request to flush a stream should result in all packets being sent to the client (where they might get buffered by whatever is receiving, but that is beyond your control).
See the Perl FAQ entry "How do I flush/unbuffer an output filehandle? Why must I do this?" for how to unbuffer your initial response.
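For example, a minimal sketch of unbuffering STDOUT before printing the response (the header and message text here are placeholders, not taken from your script):

```perl
use strict;
use warnings;

# Unbuffer STDOUT so the response is pushed out immediately,
# rather than sitting in perl's output buffer until the script exits.
$| = 1;                        # classic idiom; equivalently:
# use IO::Handle;
# STDOUT->autoflush(1);

print "Content-Type: text/plain\r\n\r\n";
print "Archive job started\n"; # reaches the web server right away
```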
Please note that the CGI environment is also likely calling waitpid() or similar, which is why you aren't seeing results until after the child process exits. If you want a child process to persist after the request or the client has gone away, you will need to dissociate the child process from its parent. See "Complete Dissociation of Child from Parent" in perlipc.
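A sketch along the lines of perlipc's recipe — the log path and the tar invocation are placeholders, not anything from your existing code:

```perl
use strict;
use warnings;
use POSIX qw(setsid);

# Double-fork so the tar job survives the CGI request.
defined(my $pid = fork()) or die "fork: $!";
if ($pid) {
    # Parent: respond to the client and finish the request.
    print "Archive job started\n";
    exit 0;
}

# First child: start a new session, detaching from the controlling terminal.
setsid() or die "setsid: $!";

defined(my $gpid = fork()) or die "second fork: $!";
exit 0 if $gpid;    # first child exits; grandchild is now dissociated

# Grandchild: let go of the CGI's filehandles, so the web server
# doesn't keep the connection open waiting for them to close.
chdir '/';
open STDIN,  '<',  '/dev/null'       or die "reopen STDIN: $!";
open STDOUT, '>',  '/dev/null'       or die "reopen STDOUT: $!";
open STDERR, '>>', '/tmp/archive.log' or die "reopen STDERR: $!";

# system() (rather than exec) lets us check the exit status for errors.
system('tar', 'czf', '/tmp/archive.tar.gz', '/path/to/huge/dir') == 0
    or warn "tar failed: exit status $?";
exit 0;
```

Reopening STDOUT is the important part: many web servers consider the request finished only once the CGI's standard output is closed.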
Since you will be dealing with extremely large amounts of data (50G), this will take quite some time. It is very likely the client or server may hit some kind of timeout while waiting for tar to finish. To avoid this, you can ensure both client and server have insanely long timeouts, or periodically send data between the two to keep the connection/request alive. You may want to consider having the tar process disassociate and having the client periodically poll the server for the result.
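One simple way to support polling (purely illustrative — the status-file path, the status.cgi idea, and the tar command line are assumptions, not part of your existing code): have the background process record its outcome in a file, and serve that file from a second CGI script the page polls.

```perl
use strict;
use warnings;

# --- in the dissociated background process -------------------------
my $status_file = '/tmp/archive-status.txt';    # hypothetical path

my $rc = system('tar', 'czf', '/tmp/archive.tar.gz', '/path/to/huge/dir');

open my $fh, '>', $status_file or die "open $status_file: $!";
print $fh ($rc == 0 ? "done\n" : "failed: exit status $?\n");
close $fh;

# --- in a separate status.cgi the web page polls -------------------
# print "Content-Type: text/plain\r\n\r\n";
# if (open my $in, '<', $status_file) { print <$in>; }
# else                                { print "running\n"; }
```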
Finally, assuming tar is not writing to an in-memory buffer, it should be relatively memory-efficient on modern hardware. However, it can consume a lot of CPU and I/O. I would definitely be cautious about the security of the service and the potential for (inadvertent) denial of service.
Upvotes: 2