Reputation: 3580
I wrote a small script to check whether a URL exists. I am using get_headers to retrieve the headers. The problem is that for some URLs, for example https://forum.obviousidea.com, the response is 403 Bad Behavior, while the page opens fine in a browser.
Example output:
$headers = get_headers('https://forum.obviousidea.com', 1);
print_r($headers);
Array
(
[0] => HTTP/1.1 403 Bad Behavior
[Server] => nginx/1.6.2
[Date] => Tue, 04 Jun 2019 21:56:27 GMT
[Content-Type] => text/html; charset=ISO-8859-1
[Content-Length] => 913
[Connection] => close
[Set-Cookie] => Array
(
[0] => bb_lastvisit=1559685385; expires=Wed, 03-Jun-2020 21:56:25 GMT; Max-Age=31536000; path=/; secure
[1] => bb_lastactivity=0; expires=Wed, 03-Jun-2020 21:56:25 GMT; Max-Age=31536000; path=/; secure
[2] => PHPSESSID=cqtkdcfpm0k2s8hl4cup6epa37; path=/
)
[Expires] => Thu, 19 Nov 1981 08:52:00 GMT
[Cache-Control] => private
[Pragma] => private
[Status] => 403 Bad Behavior
)
How can I get the right status code using get_headers?
Note: with the user agent suggested in the answer, this website now works.
But this URL, for example, still doesn't work: https://filezilla-project.org/download.php?type=client
Upvotes: 2
Views: 324
Reputation: 661
You may have changed the user_agent setting in php.ini or with ini_set.
Check it, or set the user agent explicitly as in the example below:
ini_set('user_agent', '');
$headers = get_headers('https://forum.obviousidea.com');
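As an alternative to changing the global setting, the user agent can be set per request through a stream context, which get_headers accepts as its third argument in PHP 7.1 and later. The sketch below is only an illustration: the helper names browserLikeContext and statusCodeFromLine, the default User-Agent string, and the 10-second timeout are all assumptions, not values the site is known to require.

```php
<?php
// Hypothetical helper: builds a stream context that sends an explicit
// User-Agent. The default string is only an example value.
function browserLikeContext(string $userAgent = 'Mozilla/5.0 (compatible; LinkChecker/1.0)')
{
    return stream_context_create([
        'http' => [ // the "http" options also apply to https:// URLs
            'method' => 'GET',
            'user_agent' => $userAgent,
            'timeout' => 10, // seconds, an arbitrary choice
        ],
    ]);
}

// Hypothetical helper: extracts the numeric code from a status line such as
// "HTTP/1.1 403 Bad Behavior". Returns null if the line does not match.
function statusCodeFromLine(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}
```

A call would then look like get_headers($url, 1, browserLikeContext()), with the status line available in $headers[0] and passed to statusCodeFromLine. Whether a given site accepts the request still depends on its own filtering rules.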
I prefer to use the cURL function below:
/**
 * Fetch the response headers of a URL with cURL.
 *
 * @param string $url
 * @param array $headers Extra request headers as name => value pairs
 * @return array
 * @throws Exception
 */
function curlGetHeaders(string $url, array $headers = [])
{
    $url = trim($url);
    if (!filter_var($url, FILTER_VALIDATE_URL)) {
        throw new Exception("$url is not a valid URL", 422);
    }
    $curl = curl_init();
    curl_setopt_array($curl, [
        // Pass the full URL, query string included
        CURLOPT_URL => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_NOBODY => true,
        CURLOPT_HEADER => true,
        CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
        CURLOPT_CUSTOMREQUEST => "GET",
    ]);
    if (!empty($headers)) {
        $curlHeaders = [];
        foreach ($headers as $key => $header) {
            $curlHeaders[] = "$key: $header";
        }
        curl_setopt($curl, CURLOPT_HTTPHEADER, $curlHeaders);
    }
    $response = curl_exec($curl);
    if ($response === false) {
        // Report transport errors instead of silently discarding them
        $error = curl_error($curl);
        curl_close($curl);
        throw new Exception("cURL request failed: $error");
    }
    $responseCode = curl_getinfo($curl, CURLINFO_RESPONSE_CODE);
    curl_close($curl);
    $data = explode("\r\n", rtrim($response));
    $headers = [];
    $headers['Response-Code'] = $responseCode;
    $headers['Status'] = array_shift($data);
    foreach ($data as $part) {
        $middle = explode(":", $part, 2);
        // Lines without a colon get an empty value
        $headers[trim($middle[0])] = trim($middle[1] ?? '');
    }
    return $headers;
}
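One thing to watch with the parsing loop above: a header that appears more than once, such as the three Set-Cookie lines in the question's output, ends up with only its last value, because each line overwrites the previous array entry. A minimal sketch of a parser that keeps repeated headers as a list follows. The name parseHeaderBlock is hypothetical, and the status line is stored under key 0 (as get_headers does) so it cannot collide with a real header called Status.

```php
<?php
// Hypothetical helper, not part of cURL: parses a raw header block and
// collects repeated names (e.g. Set-Cookie) into a list instead of overwriting.
function parseHeaderBlock(string $raw): array
{
    $lines = preg_split('/\r\n|\n/', trim($raw));
    // Key 0 holds the status line, mirroring get_headers()
    $parsed = [0 => array_shift($lines)];
    foreach ($lines as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip blank or malformed lines
        }
        list($name, $value) = explode(':', $line, 2);
        $name = trim($name);
        $value = trim($value);
        if (!array_key_exists($name, $parsed)) {
            $parsed[$name] = $value;           // first occurrence
        } elseif (is_array($parsed[$name])) {
            $parsed[$name][] = $value;         // third or later occurrence
        } else {
            $parsed[$name] = [$parsed[$name], $value]; // second occurrence
        }
    }
    return $parsed;
}
```

This only handles a single header block; if redirects are followed, the response contains one block per hop and would need to be split first.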
Upvotes: 1