Reputation: 5117
I have a function for getting a proper URL: example.com to http://example.com, www.example.org to https://example.org, etc.
function startsWith($haystack, $needle) {
    return !strncmp($haystack, $needle, strlen($needle));
}
function properUrl($url) {
    $urls = array();
    if (startsWith($url, "https://") || startsWith($url, "http://")) {
        $urls[] = $url;
    } else if (startsWith($url, "www.")) {
        $url = substr($url, 4);
        $urls[] = "http://$url";
        $urls[] = "http://www.$url";
        $urls[] = "https://$url";
        $urls[] = "https://www.$url";
    } else {
        $urls[] = "http://$url";
        $urls[] = "http://www.$url";
        $urls[] = "https://$url";
        $urls[] = "https://www.$url";
    }
    foreach ($urls as $u) {
        if (@file_get_contents($u)) {
            $url = $u;
            break;
        }
    }
    return $url;
}
What is a quicker alternative to file_get_contents? I only want to find the proper URL, not read the whole page. Thanks.
Upvotes: 0
Views: 105
Reputation: 4529
Use PHP's parse_url():
http://php.net/manual/en/function.parse-url.php
Example:
<?php
$url = '//www.example.com/path?googleguy=googley';
// Prior to 5.4.7 this would show the path as "//www.example.com/path"
var_dump(parse_url($url));
?>
will give you:
array(3) {
["host"]=>
string(15) "www.example.com"
["path"]=>
string(5) "/path"
["query"]=>
string(17) "googleguy=googley"
}
while:
<?php
$url = 'http://username:password@hostname/path?arg=value#anchor';
print_r(parse_url($url));
echo parse_url($url, PHP_URL_PATH);
?>
will give you:
Array
(
[scheme] => http
[host] => hostname
[user] => username
[pass] => password
[path] => /path
[query] => arg=value
[fragment] => anchor
)
/path
As you can see, it is easy to check the array's keys for the parts you need and build the rest of your URL from there. That saves a lot of string comparison.
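To make that concrete, here is a minimal sketch of how the candidate list from the question could be built with parse_url() instead of prefix checks. The function name candidateUrls() is made up for illustration. It assumes that input without a scheme should be treated as a host, which is why "//" is prepended before parsing: on its own, parse_url() would report "example.com" as a path. Fragments and user credentials are ignored here.

```php
<?php
// Hypothetical helper: list the URLs worth probing for a user-supplied string.
function candidateUrls($input) {
    $input = trim($input);

    // If a scheme is already present, the input is the only candidate.
    $scheme = parse_url($input, PHP_URL_SCHEME);
    if (is_string($scheme)) {
        return array($input);
    }

    // Assumption: schemeless input is a host. Prefix "//" so parse_url()
    // reports it under "host" rather than "path".
    $parts = parse_url('//' . ltrim($input, '/'));
    if ($parts === false || !isset($parts['host'])) {
        return array();
    }

    // Strip a leading "www." so both variants can be generated uniformly.
    $host = preg_replace('/^www\./i', '', $parts['host']);
    if (isset($parts['port'])) {
        $host .= ':' . $parts['port'];
    }
    $rest = isset($parts['path']) ? $parts['path'] : '';
    if (isset($parts['query'])) {
        $rest .= '?' . $parts['query'];
    }

    // Same order as the question: http, http+www, https, https+www.
    $urls = array();
    foreach (array('http', 'https') as $s) {
        foreach (array('', 'www.') as $prefix) {
            $urls[] = "$s://$prefix$host$rest";
        }
    }
    return $urls;
}
```

Calling candidateUrls('www.example.org') would return the four http/https variants with and without "www.", which you can then probe one by one.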
To check whether the URL exists, look at just the response headers instead of fetching the entire file (which is slow). PHP's get_headers() will do that for you:
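$file = 'http://www.domain.com/somefile.jpg';
$file_headers = @get_headers($file);
// get_headers() returns false on failure (e.g. the host cannot be resolved),
// and the status line may read HTTP/1.0 as well as HTTP/1.1.
if ($file_headers === false || strpos($file_headers[0], ' 404 ') !== false) {
    $exists = false;
} else {
    $exists = true;
}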
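If you want something a little stricter than a 404 check, the following is a rough sketch, not a definitive implementation. The helper names statusCode() and urlResponds() are invented for this example. It assumes PHP 7.1 or later (needed for the context argument of get_headers()), that the server accepts HEAD requests (some do not), and that any 2xx or 3xx status counts as "exists".

```php
<?php
// Hypothetical helper: extract the numeric code from a status line such as
// "HTTP/1.1 404 Not Found". Returns null if the line does not look like one.
function statusCode($statusLine) {
    if (preg_match('#^HTTP/\S+\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper: true if the server answers with a 2xx or 3xx status.
function urlResponds($url, $timeout = 5) {
    // Send HEAD so the server does not transmit a body.
    $context = stream_context_create(array(
        'http' => array('method' => 'HEAD', 'timeout' => $timeout),
    ));
    $headers = @get_headers($url, false, $context);
    if ($headers === false || !isset($headers[0])) {
        return false; // DNS failure, refused connection, timeout, ...
    }
    // $headers[0] is the status line of the first response.
    $code = statusCode($headers[0]);
    return $code !== null && $code >= 200 && $code < 400;
}
```

In the question's loop you would then replace the @file_get_contents($u) test with urlResponds($u).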
Good luck!
Upvotes: 1