foxybagga

Reputation: 4214

Extract Image Sources from text in PHP - preg_match_all required

I have a small problem: my preg_match_all call is not working properly.

What I want to do is extract the src attribute of every image in the WordPress post_content, which is a string and not a complete HTML document/DOM (so I cannot use a document parser function).

I am currently using the code below, which is unfortunately untidy and works for only one image src, whereas I want all image sources from that string:

preg_match_all('/src="([^"]*)"/', $search->post_content, $matches);

if (isset($matches)) {
    foreach ($matches as $match) {
        if (strpos($match[0], "src") !== false) {
            $res = explode("\"", $match[0]);
            echo $res[1];
        }
    }
}

Can someone please help?

Upvotes: 2

Views: 969

Answers (2)

Gumbo

Reputation: 655139

Using regular expressions to parse HTML is very error-prone. In your case, IMG is not the only element that can have a SRC attribute, and the pattern can even match text that is not an HTML attribute at all. The attribute value might also not be enclosed in double quotes.
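To make those pitfalls concrete, here is a small sketch that runs the pattern from the question against a made-up string. The `$content` value is purely hypothetical and only there for illustration. It also shows why the loop in the question prints just one value: by default preg_match_all groups its results by capture group, so `$matches[0]` holds all full matches and `$matches[1]` holds all captured values.

```php
<?php
// Hypothetical post content: a script tag, a single-quoted img, and a double-quoted img.
$content = '<script src="app.js"></script>'
         . "<img src='single.png'>"
         . '<img alt="x" src="double.png">';

// The pattern from the question: any src="..." regardless of the element it belongs to.
preg_match_all('/src="([^"]*)"/', $content, $matches);

// $matches[0] is the list of full matches, $matches[1] the list of captured values.
print_r($matches[1]);
// Array
// (
//     [0] => app.js
//     [1] => double.png
// )
```

The script source is picked up, and the single-quoted image is missed. Iterating over `$matches[1]` rather than `$matches` would at least list every captured value; passing PREG_SET_ORDER as the fourth argument groups the results per match instead.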

It is better to use an HTML DOM parser such as PHP’s DOMDocument and its methods:

$doc = new DOMDocument();
$doc->loadHTML($search->post_content);
foreach ($doc->getElementsByTagName('img') as $img) {
    if ($img->hasAttribute('src')) {
        echo $img->getAttribute('src');
    }
}
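If you want the sources as an array rather than echoed, the same approach can be wrapped in a function. This is only a sketch under a few assumptions: `extract_image_sources` is a hypothetical name, the type declarations need PHP 7 or later, and the libxml calls are there because loadHTML tends to emit warnings when it is given a fragment or imperfect markup.

```php
<?php
// Hypothetical helper: returns the src of every <img> found in an HTML fragment.
function extract_image_sources(string $html): array
{
    // loadHTML() does not accept an empty string, so bail out early.
    if (trim($html) === '') {
        return [];
    }

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        if ($img->hasAttribute('src')) {
            $sources[] = $img->getAttribute('src');
        }
    }
    return $sources;
}

// A fragment, not a complete document; the script src is ignored.
$fragment = '<p>Intro</p><img src="a.png">'
          . "<p><img src='b.jpg' alt=\"b\"></p>"
          . '<script src="app.js"></script>';

print_r(extract_image_sources($fragment));
// Array
// (
//     [0] => a.png
//     [1] => b.jpg
// )
```

In the question's context this would be called as extract_image_sources($search->post_content).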

Upvotes: 8

AJJ

Reputation: 7703

You can use a DOM parser on HTML strings; a complete HTML document is not required. See, for example, http://simplehtmldom.sourceforge.net/

Upvotes: 0
