Kokox

Reputation: 519

PHP - Check if the final URL exists

I know there are ways to check whether a URL returns a 404. I have been using the following function and it has worked fine, but now I need to check a URL on a domain that redirects me to a subdomain based on the language of my region.

function page_404($url) {
    $handle = curl_init($url);
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($handle, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($handle, CURLOPT_SSL_VERIFYHOST, false);

    /* Get the HTML or whatever is linked in $url. */
    $response = curl_exec($handle);

    /* Check for 404 (file not found). */
    $httpCode = curl_getinfo($handle, CURLINFO_HTTP_CODE);
    curl_close($handle);

    /* If the document has loaded successfully without any redirection or error */
    if ($httpCode >= 200 && $httpCode < 300) {
        echo $httpCode."<br/>";
        return false;
    } else {
        echo $httpCode."<br/>";
        return true;
    }
}

For example:

https://example.com/video/123456

I'm redirected to the following URL:

https://es.example.com/video/123456

That means the first response has HTTP status 301. My function treats the redirect as a failure and reports that the video does not exist, when in fact it does exist; the domain simply redirected me to that subdomain.

If I change the condition $httpCode < 300 to $httpCode < 303, it works.

The problem is that when this site receives an invalid URL, it redirects me to its home page, so instead of a 404 I get a 301 or 303.

What can I do? I hope I explained this well.

Upvotes: 3

Views: 2704

Answers (2)

Barmar

Reputation: 780798

You can tell cURL to follow all redirects, and return the result from the final redirection. Use:

curl_setopt($handle, CURLOPT_FOLLOWLOCATION, true);
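As a minimal sketch of how that option could fit into the check, the code below follows redirects and reports both the final status code and the final URL. The names final_status() and same_path() are hypothetical and not part of the question's code. Comparing paths is only an assumed heuristic for the case the question describes, where an invalid URL is redirected to the site's home page and still ends in a 200.

```php
<?php
// Sketch: follow redirects and report where the request ended up.
// final_status() is a hypothetical name, not from the original question.
function final_status(string $url, int $maxRedirects = 10): array {
    $handle = curl_init($url);
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($handle, CURLOPT_FOLLOWLOCATION, true);     // follow Location headers
    curl_setopt($handle, CURLOPT_MAXREDIRS, $maxRedirects); // stop on redirect loops
    curl_exec($handle);
    return [
        'code' => curl_getinfo($handle, CURLINFO_HTTP_CODE),     // status of the last response
        'url'  => curl_getinfo($handle, CURLINFO_EFFECTIVE_URL), // last URL requested
    ];
}

// Hypothetical heuristic: the final URL counts as "the same page" only if
// its path matches the requested path (the host may differ, e.g. es.example.com).
function same_path(string $requested, string $final): bool {
    $a = parse_url($requested, PHP_URL_PATH) ?: '/';
    $b = parse_url($final, PHP_URL_PATH) ?: '/';
    return rtrim($a, '/') === rtrim($b, '/');
}
```

With these, a page could be treated as existing when the final code is in the 2xx range and same_path($url, $result['url']) is true. Whether the path comparison is appropriate depends on how the target site builds its redirects.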

Upvotes: 2

Nick Coons

Reputation: 3692

You would want to make this recursive, since you can redirect to a page that redirects to a page that ... well, you get the idea. And you want to know if the final page exists. And you have no idea up front how many redirects it will take to get there.

You would want a conditional after:

if ($httpCode >= 200 && $httpCode < 300) {

Something like this:

} elseif ($httpCode >= 301 && $httpCode <= 302) {

(This assumes the redirect codes are 301 and 302. There are others that I'm not including, such as 303, 307, and 308, so adjust this accordingly.) Then, inside this branch, grab the URL you're being redirected to and have the function call itself with that URL. It will do this for each redirect.

However, if you do it this way, you may want to add a second parameter so you know how many times you've called this, something like:

function page_404($url, $iteration = 1)

So when you call it later on, you do so this way:

page_404($url, $iteration + 1);

Then, at the very beginning, do a check to make sure you don't end up in an infinite redirect:

if($iteration > 10) {
    echo "Too many redirects";
    return true; // assumption: treat too many redirects as "not found"; use whatever error value suits you
}

Most browsers will puke if they encounter a URL that redirects 10 or 15 times, so this is probably a fairly safe number, and a safe behavior. Otherwise, you could end up redirecting forever if you hit a misconfigured URL.
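Putting those pieces together, a rough sketch of the recursive version might look like the following. The curl_fetch() helper and the optional $fetch parameter are assumptions added here so the redirect logic can be exercised without a network connection; the list of redirect codes and the choice to return true after too many redirects are also assumptions, so adjust them as needed.

```php
<?php
// Hypothetical helper: perform one request without following redirects.
// Returns [HTTP status code, redirect target or null].
function curl_fetch(string $url): array {
    $handle = curl_init($url);
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($handle, CURLOPT_FOLLOWLOCATION, false);
    curl_exec($handle);
    $code = curl_getinfo($handle, CURLINFO_HTTP_CODE);
    $next = curl_getinfo($handle, CURLINFO_REDIRECT_URL); // URL cURL would have followed
    return [$code, $next ?: null];
}

// Returns true when the final page does not exist, false when it does.
function page_404(string $url, int $iteration = 1, ?callable $fetch = null): bool {
    if ($iteration > 10) {
        return true; // assumption: too many redirects counts as "not found"
    }
    $fetch = $fetch ?? 'curl_fetch';
    [$code, $next] = $fetch($url);

    if ($code >= 200 && $code < 300) {
        return false; // final page loaded successfully
    }
    if (in_array($code, [301, 302, 303, 307, 308], true) && $next !== null) {
        // Follow the redirect and count the hop.
        return page_404($next, $iteration + 1, $fetch);
    }
    return true; // 4xx, 5xx, or a redirect without a target
}
```

Calling page_404('https://example.com/video/123456') uses the cURL-based fetcher by default. Note that this still cannot tell a valid page from an invalid URL that the site redirects to its home page, since both end in a 2xx response.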

Upvotes: 0
