Rehmat

Reputation: 5071

URL Validation in PHP

This topic has been discussed a lot here on Stack Overflow, but none of the answers I found produce the result I need. Before inserting a URL into the database, I want to check that the value really is a URL. PHP's built-in FILTER_VALIDATE_URL filter accepts even a value like httpp://exampl, but I need the value to pass only if it has a real domain such as example.net or example.com. Here is an example:

Case 1:

$url = "http://example";
if (filter_var($url, FILTER_VALIDATE_URL) !== false) {
    return true;
}

This returns true, but the domain isn't valid.

Case 2:

$url = "http://google.com";
if (filter_var($url, FILTER_VALIDATE_URL) !== false) {
    return true;
}

Returns true and that's okay.

Is there any solution for case 1? Please help.

P.S.: I tried cURL and it works, but the response is too slow (more than 5 seconds). Any solid solution would be greatly appreciated.

Upvotes: 1

Views: 1737

Answers (2)

Pedro Lobito

Reputation: 98921

I've coded a quick script that may help you achieve what you need:

<?php
//error_reporting(E_ALL);
//ini_set('display_errors', 1);
$url = "http://www.google.com";


if(validateUrl($url)){
    echo "VALID";
}else{
    echo "INVALID";
}

function validateUrl($url){

//first we validate the url using a regex

if (!preg_match('%^(?:(?:https?)://)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\x{00a1}-\x{ffff}0-9]-*)*[a-z\x{00a1}-\x{ffff}0-9]+)(?:\.(?:[a-z\x{00a1}-\x{ffff}0-9]-*)*[a-z\x{00a1}-\x{ffff}0-9]+)*(?:\.(?:[a-z\x{00a1}-\x{ffff}]{2,}))\.?)(?::\d{2,5})?(?:[/?#]\S*)?$%uiS', $url)) {

    return false;
}


//if the URL is well-formed, we request only its headers with cURL and expect a final 200 status code

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HEADER, true);    // we want headers
curl_setopt($ch, CURLOPT_NOBODY, true);    // we don't need body (faster)
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION,1); // we follow redirections
curl_setopt($ch, CURLOPT_TIMEOUT,10);
$output = curl_exec($ch);
$httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);


// CURLINFO_HTTP_CODE is an integer; it is 0 if no response was received
return $httpcode === 200;


}

Upvotes: 3

Toby Allen

Reputation: 11213

http://example is a valid URL if you have a computer called example on your local network.
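
That is why FILTER_VALIDATE_URL accepts it: the filter checks syntax only, and a single-label host is syntactically fine. If intranet names should be rejected anyway, one option is an extra check on the host part. The following is only a rough sketch: the function name hasDottedHost is made up for illustration, and the rule "at least one dot, last label of two or more ASCII letters" is an assumption that also rejects IP addresses and Unicode TLDs. It says nothing about whether the domain actually exists.

```php
<?php
// Hypothetical helper (not part of PHP): accepts only http(s) URLs whose
// host has at least two dot-separated labels and an alphabetic last label.
function hasDottedHost(string $url): bool
{
    // Syntax check first; filter_var returns false for malformed input.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    $scheme = parse_url($url, PHP_URL_SCHEME);
    $host   = parse_url($url, PHP_URL_HOST);

    // Rejects scheme typos such as "httpp".
    if (!in_array(strtolower((string) $scheme), ['http', 'https'], true)) {
        return false;
    }
    if (!is_string($host)) {
        return false;
    }

    // Assumption: a "real" domain contains a dot and ends in 2+ letters.
    $pattern = '/^(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}$/i';

    return preg_match($pattern, $host) === 1;
}

var_dump(hasDottedHost('http://example'));    // bool(false)
var_dump(hasDottedHost('http://google.com')); // bool(true)
var_dump(hasDottedHost('httpp://exampl'));    // bool(false)
```

This runs without any network access, so it is fast, but it is purely a shape check on the host name.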

The only solution for what you want (especially considering that there are lots of new top level domains) is to connect and see if you get 200 OK.

cURL is probably the best solution here.

This Super User question might help with getting just the response code from a URL.

However, you will never get 100% accuracy.
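
If a full HTTP request is too slow, a DNS lookup is a cheaper (and weaker) signal: it can tell you that the host name resolves, not that a web server answers with 200. A minimal sketch using PHP's checkdnsrr(); the helper name hostResolves and its injectable $lookup parameter are assumptions added for illustration, so the logic can be tried without network access.

```php
<?php
// Hypothetical helper: true if the URL's host has an A or AAAA record.
// $lookup is injectable for testing; by default it calls checkdnsrr().
function hostResolves(string $url, ?callable $lookup = null): bool
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false; // no host component to look up
    }

    $lookup = $lookup ?? static function (string $h, string $type): bool {
        return checkdnsrr($h, $type);
    };

    // A trailing dot marks the name as fully qualified, which is meant to
    // keep the resolver from appending local search domains.
    $fqdn = rtrim($host, '.') . '.';

    return $lookup($fqdn, 'A') || $lookup($fqdn, 'AAAA');
}
```

Calling hostResolves($url) with a single argument performs a real DNS query, so the result depends on the resolver of the machine it runs on. Passing a closure as the second argument replaces the lookup, which is handy in unit tests.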

Upvotes: 1
