Sourabh

Reputation: 1765

curl unable to download webpages

I am trying to open the homepages of websites and extract the title and description from their HTML markup using cURL with PHP. This works to an extent, but there are many websites I am unable to open. My code is here:

function curl_download($Url){
     if (!function_exists('curl_init')){
        die('Sorry cURL is not installed!');
    }
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $Url); 
    curl_setopt($ch, CURLOPT_HEADER, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); 
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $output = curl_exec($ch);
    curl_close($ch); 
    return $output;
}
// $url is any url
$source=curl_download($url);
$d=new DOMDocument();
$d->loadHTML($source);
$title=$d->getElementsByTagName("title")->item(0)->textContent;
$domx = new DOMXPath($d);
$desc=$domx->query("//meta[@name='description']")->item(0);
$description=$desc->getAttribute('content');
?>

This code works fine for most websites, but there are many that it cannot open at all. What could be the reason?

When I tried getting the headers of those websites with the get_headers function, it worked fine, but the same sites cannot be opened using cURL. Two of these websites are blogger.com and live.com.

Upvotes: 2

Views: 731

Answers (1)

Ross Smith II

Reputation: 12189

Replace:

$output = curl_exec($ch);

with

curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0); 
curl_setopt($ch, CURLOPT_SSLVERSION, 3);
$output = curl_exec($ch);
if ($output === false) {
   echo curl_error($ch);
}

and see why cURL is failing.

It's a good idea to always check the result of function calls to see if they succeeded or not, and to report when they fail. While a function may work 99.999% of the time, you need to report the times it fails, and why, so the underlying cause can be identified and fixed, if possible.
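
As a rough sketch of that advice applied to the question's code: the helper names fetch_html and extract_title_and_description are made up for illustration, and treating an HTTP status of 400 or above as a failure is an assumption rather than something stated in the question. Each step checks its result and reports why it failed instead of passing a bad value on to the next step.

```php
<?php
// Hypothetical helpers for illustration; the names are not from the original post.

// Fetch a URL and return the response body, or throw with cURL's own error message.
function fetch_html(string $url): string
{
    $ch = curl_init();
    if ($ch === false) {
        throw new RuntimeException('curl_init() failed');
    }
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);

    $body = curl_exec($ch);
    if ($body === false) {
        // Capture the reason before closing the handle.
        $message = curl_error($ch);
        $code = curl_errno($ch);
        curl_close($ch);
        throw new RuntimeException("cURL error $code: $message");
    }

    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    // Assumption: any status of 400 or above counts as a failed download.
    if ($status >= 400) {
        throw new RuntimeException("HTTP status $status for $url");
    }
    return $body;
}

// Parse HTML and return the title and meta description; a missing element yields null.
function extract_title_and_description(string $html): array
{
    if ($html === '') {
        throw new InvalidArgumentException('Empty HTML document');
    }
    $doc = new DOMDocument();
    // Real-world markup is rarely valid, so collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        throw new RuntimeException('Could not parse HTML');
    }

    $titleNode = $doc->getElementsByTagName('title')->item(0);
    $xpath = new DOMXPath($doc);
    $descNode = $xpath->query("//meta[@name='description']")->item(0);

    return [
        'title' => $titleNode !== null ? trim($titleNode->textContent) : null,
        'description' => $descNode instanceof DOMElement
            ? $descNode->getAttribute('content')
            : null,
    ];
}

// Possible usage ($url is any URL, as in the question):
// try {
//     $info = extract_title_and_description(fetch_html($url));
// } catch (Exception $e) {
//     echo 'Failed: ', $e->getMessage(), "\n";
// }
```

CURLOPT_HEADER is left off in this sketch so that only the body reaches the parser; with it enabled, the response headers are prepended to the string that curl_exec returns.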

Upvotes: 3
