Raphael Jeger
Raphael Jeger

Reputation: 5232

Get http-statuscode without body using cURL?

I want to parse a lot of URLs to only get their status codes.

So what I did is:

$handle = curl_init($url -> loc);
             curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
             curl_setopt($handle, CURLOPT_HEADER  , true);  // we want headers
             curl_setopt($handle, CURLOPT_NOBODY  , true);
             curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
             $response = curl_exec($handle);
             $httpCode = curl_getinfo($handle, CURLINFO_HTTP_CODE);
             curl_close($handle);

But as soon as the "nobody"-option is set to true, the returned status codes are incorrect (google.com returns 302, other sites return 303).

Setting this option to false is not possible because of the performance loss.

Any ideas?

Upvotes: 2

Views: 1995

Answers (3)

Joyce Babu
Joyce Babu

Reputation: 20654

The default HTTP request method for curl is GET. If you want only the response headers, you can use the HTTP method HEAD.

curl_setopt($handle, CURLOPT_CUSTOMREQUEST, 'HEAD');

According to @Dai's answer, the NOBODY is already using the HEAD method. So the above method will not work.

Another option would be to use fsockopen to open a connection, write the headers using fwrite. Read the response using fgets until the first occurrence of \r\n\r\n to get the complete header. Since you need only the status code, you just need to read the first 13 characters.

<?php
$fp = fsockopen("www.google.com", 80, $errno, $errstr, 30);
if ($fp) {
    $out = "GET / HTTP/1.1\r\n";
    $out .= "Host: www.google.com\r\n";
    $out .= "Accept-Encoding: gzip, deflate, sdch\r\n";
    $out .= "Accept-Language: en-GB,en-US;q=0.8,en;q=0.6\r\n";
    $out .= "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36\r\n";
    $out .= "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8\r\n";
    $out .= "Connection: Close\r\n\r\n";
    fwrite($fp, $out);
    $tmp = explode(' ', fgets($fp, 13));
    echo $tmp[1];
    fclose($fp);
}

Upvotes: 2

Dai
Dai

Reputation: 155205

cURL's nobody option has it use the HEAD HTTP verb, I'd wager the majority of non-static web applications I the wild don't handle this verb correctly, hence the problems you're seeing with different results. I suggest making a normal GET request and discarding the response.

Upvotes: 1

user557846
user557846

Reputation:

i suggest get_headers() instead:

<?php
$url = 'http://www.example.com';

print_r(get_headers($url));

print_r(get_headers($url, 1));
?>

Upvotes: 0

Related Questions