Honey

Reputation: 29

Increase speed of my script

I have a script that reads the links from a some.txt file and reports whether my website's backlink is present on each of them. The problem is that it is very slow, and I want to increase its speed. Is there any way to do that?

<?php
ini_set('max_execution_time', 3000);

$source = file_get_contents("your-backlinks.txt");
$needle = "http://www.submitage.com";
$new    = explode("\n", $source);

$found    = array();
$notfound = array();

foreach ($new as $check) {
    // fetch each linking page and look for the backlink
    $a = file_get_contents(trim($check));
    if (strpos($a, $needle) !== false) {
        $found[] = $check;
    } else {
        $notfound[] = $check;
    }
}

echo "Matches that were found: \n ".implode("\n", $found)."\n";
echo "Matches that were not found \n".implode("\n", $notfound);
?>

Upvotes: 1

Views: 291

Answers (2)

John Dvorak

Reputation: 27277

Your biggest bottleneck is that you are executing the HTTP requests in sequence rather than in parallel; curl is able to perform multiple requests in parallel. Here's an example from the documentation, heavily adapted to use a loop and actually collect the results. I cannot promise it's correct; I only promise I've followed the documentation correctly:

$mh = curl_multi_init();
$handles = array();

foreach ($new as $check) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, trim($check));
    curl_setopt($ch, CURLOPT_HEADER, 0);
    // needed so curl_multi_getcontent() returns the body instead of printing it
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_multi_add_handle($mh, $ch);
    $handles[$check] = $ch;
}

// verbatim from the demo
$active = null;
//execute the handles
do {
    $mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

while ($active && $mrc == CURLM_OK) {
    if (curl_multi_select($mh) != -1) {
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
}
// end of verbatim code

foreach ($handles as $check => $ch) {
    $a = curl_multi_getcontent($ch);
    // same check as in the original script
    if (strpos($a, $needle) !== false) {
        $found[] = $check;
    } else {
        $notfound[] = $check;
    }
    curl_multi_remove_handle($mh, $ch);
}
curl_multi_close($mh);

Upvotes: 2

Kaivosukeltaja

Reputation: 15735

You won't be able to squeeze any more speed out of the operation by optimizing the PHP, except maybe some faux-multithreading solution.

However, you could create a queue system that would allow you to run the check as a background task. Instead of checking the URLs as you iterate through them, add them to a queue. Then write a cron script that grabs unchecked URLs from the queue one by one, checks whether they contain a reference to your domain, and saves the result.
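
A minimal sketch of that idea, assuming a MySQL table named url_queue with columns url, checked and found, and PDO credentials that you would replace with your own; each cron run picks up a small batch of unchecked URLs and stores the result of the same strpos() check:

<?php
// cron script (run e.g. every few minutes): check a batch of queued URLs
// assumed table: CREATE TABLE url_queue (
//   url VARCHAR(255) PRIMARY KEY, checked TINYINT DEFAULT 0, found TINYINT DEFAULT 0
// );
$needle = "http://www.submitage.com";
$pdo = new PDO('mysql:host=localhost;dbname=backlinks', 'user', 'password');

// grab a small batch so each cron run stays short
$urls = $pdo->query("SELECT url FROM url_queue WHERE checked = 0 LIMIT 20")
            ->fetchAll(PDO::FETCH_COLUMN);

$update = $pdo->prepare("UPDATE url_queue SET checked = 1, found = :found WHERE url = :url");

foreach ($urls as $url) {
    $page  = file_get_contents(trim($url));
    $found = ($page !== false && strpos($page, $needle) !== false) ? 1 : 0;
    $update->execute(array(':found' => $found, ':url' => $url));
}
?>

Filling the queue is then just a matter of inserting each line of your-backlinks.txt into the table once, and you can read the found column whenever you need the report.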

Upvotes: 0
