Reputation: 639
I am downloading a big XML file (around 43 MB) in PHP through cURL, then processing the file, validating the data and inserting it into the database. The problem is that the load balancer stops sending a response to the user after 5 minutes, while the PHP script takes approximately 20 minutes. I thought of running two PHP scripts in parallel: one creates an empty flag file on the server, downloads and processes the XML file, and deletes the flag file at the end; the other runs every 15 seconds and checks whether the flag file still exists. I am having trouble running these two scripts in parallel. This is my code:
$(document).ready(function () {
    $(document).on("click", ".clickMe", function () {
        var download = $.ajax({
            async: true,
            url: "/staff/import.php",
            type: "post",
            data: { getFile: true },
            dataType: "json",
            success: function (data) {
            }
        });
        var serverStatus = true;
        while (serverStatus === true) {
            var checkDownload = $.ajax({
                async: false,
                url: "/staff/checkDownload.php",
                type: "post",
                dataType: "json",
                data: { checkDownload: true },
                success: function (returndata) {
                    if (returndata === false) {
                        serverStatus = false;
                    }
                }
            });
        }
    });
});
PHP Curl Download:
<?php
session_write_close();
touch(getcwd() . "/downloading");
$curl = curl_init();
$post = array("uploadfile"=>"@" . getcwd() . "/tmp.xml");
curl_setopt($curl, CURLOPT_URL, "sftp://<host>/bigFile.xml");
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($curl, CURLOPT_USERAGENT,"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)");
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_POSTFIELDS, $post);
curl_setopt($curl, CURLOPT_PROTOCOLS, CURLPROTO_SFTP);
curl_setopt($curl, CURLOPT_USERPWD, "<userName>:<password>");
file_put_contents(getcwd() . "/tmp.xml", curl_exec($curl));
curl_close($curl);
unlink(getcwd() . "/downloading");
// process xml file
// insert into database
PHP check download:
<?php
$return = file_exists(getcwd() . "/downloading");
echo json_encode($return);
In order to make the two PHP scripts run in parallel, I read that I need to release the session lock with session_write_close();, but one script still waits until the other finishes. Can anyone shed some light on whether I am doing something wrong in my code (JavaScript or PHP), or does anyone know of another approach to do this?
Thanks
Upvotes: 0
Views: 183
Reputation: 73211
I would prefer a solution based on a database and nohup php.

If I understand you correctly, your "bigger script" is the curl request. So take this script:
touch(getcwd() . "/downloading");
$curl = curl_init();
$post = array("uploadfile"=>"@" . getcwd() . "/tmp.xml");
curl_setopt($curl, CURLOPT_URL, "sftp://<host>/bigFile.xml");
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($curl, CURLOPT_USERAGENT,"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)");
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_POSTFIELDS, $post);
curl_setopt($curl, CURLOPT_PROTOCOLS, CURLPROTO_SFTP);
curl_setopt($curl, CURLOPT_USERPWD, "<userName>:<password>");
file_put_contents(getcwd() . "/tmp.xml", curl_exec($curl));
curl_close($curl);
unlink(getcwd() . "/downloading");
// process xml file
// insert into database
SIDENOTE: You need to set full paths to all of the called files when running on the command line.

Add an INSERT statement at the beginning of the file so the run is recorded in your database, like
INSERT INTO checkrun (started, done) VALUES (NOW(), 0);
After curl_close(), add a statement like
UPDATE checkrun set done = 1 WHERE id = (SELECT max(id) FROM checkrun);
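For illustration, the tracking around the download could look like this minimal sketch; the PDO connection details are placeholders and the checkrun columns (id, started, done) are assumed. Using lastInsertId() instead of max(id) avoids racing if two runs ever overlap:

<?php
// Sketch only: DSN, credentials and table layout are assumptions.
$db = new PDO("mysql:host=localhost;dbname=mydb", "user", "password");

// Record that a run has started.
$db->exec("INSERT INTO checkrun (started, done) VALUES (NOW(), 0)");
$runId = $db->lastInsertId(); // safer than max(id) if runs overlap

// ... curl download, XML processing, database inserts ...

// Mark this run as finished.
$stmt = $db->prepare("UPDATE checkrun SET done = 1 WHERE id = ?");
$stmt->execute([$runId]);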
Now save this to another file and put it in a folder anywhere on your server, let's take /user/curlfile/curlrequest.php.
Your first ajax call will now go to a file in your webroot; this file should contain an exec statement, like this:
exec('nohup php /user/curlfile/curlrequest.php');
Your file is now running in the background, which means it doesn't block the rest of your work (or at least far less). nohup usually creates a logfile (nohup.out); if you don't want this, add

>/dev/null 2>&1 &

after your nohup command. The trailing & is what actually detaches the process, so exec() returns immediately instead of waiting for the script to finish.
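The file behind the first ajax call can then be as small as this sketch (the path is the example one from above):

<?php
// import.php - trigger sketch. The trailing & detaches the process so
// exec() returns immediately; the redirects discard nohup's output.
exec('nohup php /user/curlfile/curlrequest.php >/dev/null 2>&1 &');

// Respond right away; the download keeps running in the background.
echo json_encode(true);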
With your second ajax call, you can run a script that simply checks the checkrun table for the max(id): if done = 0, it has to keep polling; if done = 1, your curl request is finished and you can do whatever you want. I would use setTimeout() to check every 15 seconds, or whatever interval you want, rather than a synchronous while loop, which blocks the browser between checks.
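The polling endpoint could then look like this sketch (same assumed PDO connection as above):

<?php
// checkDownload.php - returns true once the latest run is marked done.
$db = new PDO("mysql:host=localhost;dbname=mydb", "user", "password");
$done = $db->query("SELECT done FROM checkrun ORDER BY id DESC LIMIT 1")->fetchColumn();
echo json_encode((bool) $done);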
NOTE:
This kind of database check will only work if you have one curl request at a time. If you have more, I'd create a random string in the page that makes the ajax calls and pass this string to your files. You can do this by using getopt:
$options = getopt("f:");
var_dump($options);
in your curl file, and run your exec command like
exec('nohup php /user/curlfile/curlrequest.php -f "randomString"');
Now you can simply check for done with WHERE requestId = 'randomString'.
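Put together, the per-request variant could look like this sketch (the requestId column is an assumption on top of the table above):

<?php
// curlrequest.php with a per-run identifier passed via -f.
$options = getopt("f:");
$requestId = isset($options["f"]) ? $options["f"] : "";

$db = new PDO("mysql:host=localhost;dbname=mydb", "user", "password");

// Record the start of this particular request.
$stmt = $db->prepare("INSERT INTO checkrun (started, done, requestId) VALUES (NOW(), 0, ?)");
$stmt->execute([$requestId]);

// ... download and processing ...

// Mark exactly this request as finished.
$stmt = $db->prepare("UPDATE checkrun SET done = 1 WHERE requestId = ?");
$stmt->execute([$requestId]);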
I hope I didn't forget anything, but this should do the job with as little pain as possible.
Upvotes: 3