Jim

Reputation: 1440

Fetch og:image with file_get_contents and preg_match

I'm using file_get_contents to get the og:image from any URL.

$fooURL = file_get_contents($URLVF['url']);


And then I match property="og:image" to pull the image URL out of the page. The code below works with most websites:

preg_match('/property="og:image" content="(.*?)"/', $fooURL, $fooImage);


But sites like www.howcast.com write the og:image tag differently, with the attributes in the opposite order and single quotes:

<meta content='http://attachments-mothership-production.s3.amazonaws.com/images/main-avatar.jpeg' property='og:image'>


So to get the image link from the markup above, I need the preg_match to look like this:

preg_match("/content='(.*?)' property='og:image'/", $fooURL, $fooImage);


But of course, if I use that pattern, the only site that works is howcast, and every other site returns nothing.

Any idea how I can make the code work regardless of how the meta tag is written? Or is there an alternative way to get the image link reliably?

Upvotes: 1

Views: 2448

Answers (1)

Casimir et Hippolyte

Reputation: 89567

An example with DOMDocument and XPath, as @str suggests:

$html = <<<LOD
<html><head>
<meta content='http://attachments-mothership-production.s3.amazonaws.com/images/main-avatar.jpeg' property='og:image'>
</head><body></body></html>
LOD;

// Buffer libxml parse warnings (real-world HTML is rarely valid) instead of silencing them with @
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
// or $doc->loadHTMLFile($URLVF['url']);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// The parser normalizes the markup, so attribute order and quote style no longer matter
$metaContentAttributeNodes = $xpath->query("/html/head/meta[@property='og:image']/@content");
foreach ($metaContentAttributeNodes as $metaContentAttributeNode) {
    echo $metaContentAttributeNode->nodeValue . "<br/>";
}
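
If you want to reuse this across many pages, the same idea can be wrapped in a small function. This is only a sketch: extractOgImage is a hypothetical name, the example.com URLs are placeholders, and the query uses //meta rather than /html/head/meta on the assumption that some pages put the tag outside head. It returns the first match, or null when the page has no og:image.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the first
// og:image URL found in an HTML string, or null if none is present.
function extractOgImage(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects empty input
    }

    // Buffer parse warnings for the duration of the call, then restore the old setting
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query("//meta[@property='og:image']/@content");
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $nodes->item(0)->nodeValue;
}

// Both spellings of the tag from the question give the same kind of result
$singleQuoted = "<html><head><meta content='http://example.com/a.jpg' property='og:image'></head><body></body></html>";
$doubleQuoted = '<html><head><meta property="og:image" content="http://example.com/b.jpg"></head><body></body></html>';

echo extractOgImage($singleQuoted), PHP_EOL; // http://example.com/a.jpg
echo extractOgImage($doubleQuoted), PHP_EOL; // http://example.com/b.jpg
```

With the question's variables, that would be something like extractOgImage($fooURL) after the file_get_contents call, with a check that file_get_contents did not return false first.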

Upvotes: 2
