giò

Reputation: 3580

Requesting page returns 403 Bad Behavior

I wrote a small script to check whether a URL exists, using get_headers to retrieve the headers. For some URLs, for example https://forum.obviousidea.com, the response is 403 Bad Behavior, even though the page opens fine in a browser.

Example output:

// The second argument (1) requests the associative format shown in the output below.
$headers = get_headers('https://forum.obviousidea.com', 1);
print_r($headers);

(
    [0] => HTTP/1.1 403 Bad Behavior
    [Server] => nginx/1.6.2
    [Date] => Tue, 04 Jun 2019 21:56:27 GMT
    [Content-Type] => text/html; charset=ISO-8859-1
    [Content-Length] => 913
    [Connection] => close
    [Set-Cookie] => Array
        (
            [0] => bb_lastvisit=1559685385; expires=Wed, 03-Jun-2020 21:56:25 GMT; Max-Age=31536000; path=/; secure
            [1] => bb_lastactivity=0; expires=Wed, 03-Jun-2020 21:56:25 GMT; Max-Age=31536000; path=/; secure
            [2] => PHPSESSID=cqtkdcfpm0k2s8hl4cup6epa37; path=/
        )

    [Expires] => Thu, 19 Nov 1981 08:52:00 GMT
    [Cache-Control] => private
    [Pragma] => private
    [Status] => 403 Bad Behavior
)

How can I get the correct status code using get_headers?

Note: with the user agent suggested in the answer, this website now works.

However, this URL, for example, still doesn't work: https://filezilla-project.org/download.php?type=client

Upvotes: 2

Views: 324

Answers (1)

Mehdi Daalvand

Reputation: 661

You may have changed the user_agent setting in php.ini or via ini_set.

Check it, or set the user agent as in the example below:

ini_set('user_agent', '');
$headers = get_headers('https://forum.obviousidea.com');
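
Rather than changing the user agent globally with ini_set, the header can also be set for a single call through a stream context. The sketch below assumes PHP 7.1 or later (where get_headers accepts a context argument). headersWithUserAgent and parseStatusCode are hypothetical helper names, and the User-Agent string in the usage comment is only an illustrative value, not one the site is known to require.

```php
<?php
// Hypothetical helper: extracts the numeric code from a status line
// such as "HTTP/1.1 403 Bad Behavior". Returns null if the line does not match.
function parseStatusCode(string $statusLine): ?int
{
    return preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $m)
        ? (int) $m[1]
        : null;
}

// Hypothetical helper: calls get_headers() with a per-request User-Agent
// instead of the global user_agent ini setting.
function headersWithUserAgent(string $url, string $userAgent): array
{
    $context = stream_context_create([
        'http' => [
            'method'     => 'GET',
            'user_agent' => $userAgent,
            'timeout'    => 10, // seconds; an assumed value
        ],
    ]);
    // Associative format; the context argument needs PHP 7.1+.
    $headers = get_headers($url, true, $context);

    return $headers === false ? [] : $headers;
}

// Usage (performs a network request, so it is left commented out):
// $headers = headersWithUserAgent(
//     'https://forum.obviousidea.com',
//     'Mozilla/5.0 (compatible; LinkChecker/1.0)' // illustrative value
// );
// echo parseStatusCode($headers[0] ?? '');
```

Some sites reject requests that carry no User-Agent at all, which is PHP's default for its HTTP stream wrapper, so sending any plausible value may be enough. Others apply further checks, in which case this alone will not help.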

I prefer to use the cURL function below:

/**
 * @param string $url
 * @param array  $headers
 * @return array
 * @throws Exception
 */
function curlGetHeaders(string $url, array $headers = [])
{
    $url = trim($url);
    if (!filter_var($url, FILTER_VALIDATE_URL)) {
        throw new Exception("$url is not a valid URL", 422);
    }
    $curl = curl_init();
    curl_setopt_array($curl, [
        // Pass the full URL, query string included, so the server sees the same request a browser would send.
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_NOBODY         => true,
        CURLOPT_HEADER         => true,
        CURLOPT_HTTP_VERSION   => CURL_HTTP_VERSION_1_1,
        CURLOPT_CUSTOMREQUEST  => "GET",
    ]);

    if (!empty($headers)) {
        foreach($headers as $key => $header) {
            $curlHeaders[] = "$key:$header";
        }
        curl_setopt($curl, CURLOPT_HTTPHEADER, $curlHeaders);
    }


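    $response = curl_exec($curl);
    if ($response === false) {
        // Surface transport errors instead of silently discarding them.
        $error = curl_error($curl);
        curl_close($curl);
        throw new Exception("cURL request failed: $error");
    }
    $response     = rtrim($response);
    $responseCode = curl_getinfo($curl, CURLINFO_RESPONSE_CODE);
    curl_close($curl);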
    $headers                  = [];
    $data                     = explode("\r\n", $response);
    $headers['Response-Code'] = $responseCode;
    $headers['Status']        = $data[0];
    array_shift($data);
    foreach ($data as $part) {
        $middle = explode(":", $part, 2);
        // Lines without a colon get an empty value; this avoids passing null to trim().
        $headers[trim($middle[0])] = trim($middle[1] ?? '');
    }

    return $headers;
}
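
Note that the parsing loop in the function above keeps only the last value when a header repeats, as Set-Cookie does in the question's output. A minimal sketch of a parser that collects repeated headers into an array, similar to what get_headers returns in its associative format, could look like this. parseHeaderBlock is a hypothetical name, and the sketch assumes a single header block (no redirects followed).

```php
<?php
// Hypothetical helper: turns a raw header block into an array.
// The status line is stored under key 0; repeated headers become arrays.
function parseHeaderBlock(string $raw): array
{
    $lines   = preg_split('/\r\n|\n/', trim($raw));
    $headers = [0 => array_shift($lines)];

    foreach ($lines as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip blank or malformed lines
        }
        [$name, $value] = explode(':', $line, 2);
        $name  = trim($name);
        $value = trim($value);

        if (!isset($headers[$name])) {
            $headers[$name] = $value;                    // first occurrence
        } elseif (is_array($headers[$name])) {
            $headers[$name][] = $value;                  // third or later occurrence
        } else {
            $headers[$name] = [$headers[$name], $value]; // second occurrence
        }
    }

    return $headers;
}
```

Header names are kept exactly as received. Since HTTP header names are case-insensitive, a caller comparing names may want to normalize case first.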

Upvotes: 1
