Reputation: 3875
How can I detect the favicon (shortcut icon) for any site via PHP?
I can't write a regexp for it because the markup differs from site to site.
Upvotes: 5
Views: 4568
Reputation: 11
$address = 'http://www.youtube.com/';
$domain = parse_url($address, PHP_URL_HOST);
Or, if the address comes from a database:
$domain = parse_url($row['address_column'], PHP_URL_HOST);
Then display it with:
echo '<img src="http://www.google.com/s2/favicons?domain=' . $domain . '" />';
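A minimal sketch that wraps the steps above in one function, assuming the Google s2 endpoint shown above keeps accepting a `domain` query parameter. The name `googleFaviconTag` is made up for illustration. The domain is URL-encoded and the attribute is HTML-escaped so that values read from a database cannot break the markup.

```php
<?php
// Hypothetical helper (the name is illustrative, not from the answer above).
// Builds an <img> tag pointing at Google's s2 favicon service for the
// host part of $address. Returns an empty string if no host can be parsed.
function googleFaviconTag($address) {
    $domain = parse_url($address, PHP_URL_HOST);
    if ( ! is_string($domain) || $domain === '') {
        return '';
    }
    $src = 'http://www.google.com/s2/favicons?domain=' . urlencode($domain);
    return '<img src="' . htmlspecialchars($src, ENT_QUOTES) . '" alt="" />';
}

echo googleFaviconTag('http://www.youtube.com/');
```

Returning an empty string for unparseable input keeps the caller simple: it can echo the result unconditionally.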
Upvotes: 1
Reputation: 490233
You can request http://domain.com/favicon.ico with PHP and see if you get a 404.
If you get a 404 there, you can parse the website's DOM, looking for a different location as referenced in the head element by the link element with rel="icon".
// Helper function to see if a URL returns `200 OK`.
function resourceExists($url) {
    $headers = @get_headers($url);
    if ( ! $headers) {
        return FALSE;
    }
    return (strpos($headers[0], '200') !== FALSE);
}

function domainHasFavicon($domain) {
    // In case they pass 'http://example.com/'.
    $base = rtrim($domain, '/');
    $request = $base . '/favicon.ico';
    // Check if the favicon.ico is where it usually is.
    if (resourceExists($request)) {
        return TRUE;
    }
    // If not, we'll fetch the page, parse the DOM and find it.
    $dom = new DOMDocument;
    // Real-world HTML is often malformed, so suppress parser warnings.
    if ( ! @$dom->loadHTMLFile($domain)) {
        return FALSE;
    }
    // Get all `link` elements that are children of `head`.
    $head = $dom->getElementsByTagName('head')->item(0);
    if ( ! $head) {
        return FALSE;
    }
    $linkElements = $head->getElementsByTagName('link');
    foreach ($linkElements as $element) {
        if ( ! $element->hasAttribute('rel')) {
            continue;
        }
        // Split the rel on whitespace because it can be `shortcut icon`.
        $rel = preg_split('/\s+/', strtolower(trim($element->getAttribute('rel'))));
        if (in_array('icon', $rel)) {
            $href = $element->getAttribute('href');
            // This may be a relative URL.
            // Simplification: resolve it against the root of the site being checked.
            if ( ! preg_match('#^https?://#i', $href)) {
                $href = $base . '/' . ltrim($href, '/');
            }
            return resourceExists($href);
        }
    }
    return FALSE;
}
If you want the URL of the favicon returned instead, it is trivial to modify the above function.
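As a rough sketch of that modification, the lookup can be split into two pure helpers that produce a URL rather than a boolean. The function names `extractIconHref` and `resolveAgainstBase` are hypothetical, and the relative-URL handling is deliberately simplified: it covers absolute and protocol-relative URLs and resolves everything else against the site root. Fetching the page and checking that the resource actually exists is left to the caller.

```php
<?php
// Hypothetical helper: return the href of the first <link> whose rel
// contains the token `icon` (e.g. `icon` or `shortcut icon`), or NULL.
function extractIconHref($html) {
    if ($html === '') {
        return NULL;
    }
    $dom = new DOMDocument;
    // Real-world HTML is often malformed; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $loaded = $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if ( ! $loaded) {
        return NULL;
    }
    foreach ($dom->getElementsByTagName('link') as $element) {
        $rel = preg_split('/\s+/', strtolower(trim($element->getAttribute('rel'))));
        if (in_array('icon', $rel, true) && $element->getAttribute('href') !== '') {
            return $element->getAttribute('href');
        }
    }
    return NULL;
}

// Hypothetical helper: turn an href into an absolute URL.
// Simplification: relative paths are resolved against the site root.
function resolveAgainstBase($href, $base) {
    if (preg_match('#^https?://#i', $href)) {
        return $href;                       // already absolute
    }
    if (strpos($href, '//') === 0) {
        return 'http:' . $href;             // protocol-relative
    }
    return rtrim($base, '/') . '/' . ltrim($href, '/');
}

$html = '<html><head><link rel="shortcut icon" href="/img/fav.png"></head><body></body></html>';
$href = extractIconHref($html);
echo $href === NULL ? 'none' : resolveAgainstBase($href, 'http://example.com/');
```

Keeping the parsing separate from the network calls makes the logic easy to check against a literal HTML string before pointing it at live sites.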
Upvotes: 2
Reputation: 7839
You could use this address, putting the site's domain in the domain parameter, instead of writing a regexp:
http://www.google.com/s2/favicons?domain=www.example.com
This sidesteps the problem you were having with regexps and the different results per domain.
Upvotes: 15