Weblurk

Reputation: 6822

Remove non-<img> tags from string

I'm trying to make this:

<span class="introduction">
   <img alt="image" src="/picture.jpg" />
</span>

transform into this:

<img alt="image" src="/picture.jpg" />

How would I do this with regex? That is, how do I extract ONLY the img tag from a given string of HTML?

Note: There can be a lot more HTML inside the introduction span, but only one img tag.

Upvotes: 1

Views: 18494

Answers (5)

Hovo

Reputation: 790

I came up with this solution:

/<img ([^>"']*("[^"]*"|'[^']*')?[^>"']*)*>/

Tested on:

<other html elements like span or whatever><img src="asd>qwe" attr1='asd>qwe' attr2='as"dqwe' attr3="as'dqwe" ></other html elements like span or whatever>
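For reference, here is a minimal sketch of how this pattern could be applied. PHP is an assumption here (the other answers use strip_tags and preg_match), and the helper name extract_img_quoted is made up for illustration.

```php
<?php
// Hypothetical helper (not from the answer): returns the first <img ...> tag
// found by the pattern above, or null if there is none.
function extract_img_quoted(string $html): ?string
{
    // Same pattern as above; the \' escapes are only needed because the
    // pattern is written inside a single-quoted PHP string.
    $pattern = '/<img ([^>"\']*("[^"]*"|\'[^\']*\')?[^>"\']*)*>/';

    // preg_match() returns 1 on a match; $m[0] then holds the whole match.
    return preg_match($pattern, $html, $m) === 1 ? $m[0] : null;
}

// A shortened version of the test string from this answer.
$html = <<<'HTML'
<span><img src="asd>qwe" attr1='asd>qwe' attr2='as"dqwe' attr3="as'dqwe" ></span>
HTML;

// Prints the full img tag, including the '>' characters inside the quoted values.
echo extract_img_quoted($html), PHP_EOL;
```

The point of the quoted alternatives in the pattern is that a '>' inside an attribute value does not end the match early.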

Upvotes: 1

472084

Reputation: 17894

You shouldn't really use regex on HTML. What about this?

$string = '<span class="introduction"><img alt="image" src="/picture.jpg" /></span>';

echo strip_tags($string, '<img>');

Otherwise, I would use an HTML/XML parser.
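Two things to watch for with strip_tags, sketched below using the multi-line input from the question: whitespace between the tags is kept, and so is any text inside the removed elements, because only the tags themselves are stripped. The variable names and the second sample string are made up for illustration.

```php
<?php
// Input from the question, with the line breaks and indentation kept.
$string = "<span class=\"introduction\">\n   <img alt=\"image\" src=\"/picture.jpg\" />\n</span>";

// strip_tags() drops every tag except those listed in the second argument.
// The whitespace between the tags is text, so it survives and needs trim().
$imgOnly = trim(strip_tags($string, '<img>'));
echo $imgOnly, PHP_EOL; // <img alt="image" src="/picture.jpg" />

// Text inside other elements also survives, since only the tags are removed.
$withText = '<span class="introduction"><b>Hello</b> <img alt="image" src="/picture.jpg" /></span>';
$textKept = strip_tags($withText, '<img>');
echo $textKept, PHP_EOL; // Hello <img alt="image" src="/picture.jpg" />
```

If the introduction span can contain other text, this is probably where a parser becomes the better choice.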

Upvotes: 9

Gordon

Reputation: 317197

Use DOM and this XPath:

//span[@class="introduction"]/img

to find all img elements that are direct children of any span element with a class attribute of introduction.
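A minimal sketch of what that could look like with PHP's DOMDocument and DOMXPath. The function name extract_intro_img is hypothetical, and the error handling is just one reasonable choice.

```php
<?php
// Hypothetical helper (not from the answer): returns the first img that is a
// direct child of span.introduction, serialized as HTML, or null if none.
function extract_intro_img(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() does not accept an empty string
    }

    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting them,
    // since fragments and real-world markup are rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//span[@class="introduction"]/img');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    // Passing a node to saveHTML() serializes only that node.
    return $doc->saveHTML($nodes->item(0));
}

$string = "<span class=\"introduction\">\n   <img alt=\"image\" src=\"/picture.jpg\" />\n</span>";
echo extract_intro_img($string), PHP_EOL;
```

Note that the tag is re-serialized by the parser, so the output may not be byte-for-byte identical to the input (for example, the trailing slash may be dropped). Also, @class="introduction" only matches when the class attribute is exactly that string.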

Upvotes: 1

nullmark

Reputation: 86

preg_match('#(<img.*?>)#', $string, $results);

This should work; the matched tag ends up in $results[1].
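A small sketch of how this might be wrapped so the return value of preg_match is checked first. The helper name first_img_tag is made up for illustration.

```php
<?php
// Hypothetical helper wrapping the pattern from this answer.
function first_img_tag(string $html): ?string
{
    // preg_match() returns 1 on a match, 0 on no match, and false on error,
    // so $results is only read when a match was actually found.
    if (preg_match('#(<img.*?>)#', $html, $results) === 1) {
        return $results[1];
    }
    return null;
}

$string = "<span class=\"introduction\">\n   <img alt=\"image\" src=\"/picture.jpg\" />\n</span>";
echo first_img_tag($string), PHP_EOL; // <img alt="image" src="/picture.jpg" />
```

One limitation to keep in mind: the lazy .*? stops at the first '>', so a tag like <img src="a>b"> would be cut off at <img src="a>.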

Upvotes: 5

Kent

Reputation: 195269

How about:

"<img[^>]*>"

Try it with grep:

kent$ echo '<span class="introduction">
   <img alt="image" src="/picture.jpg" />
</span>
' | grep -P "<img[^>]*>"
   <img alt="image" src="/picture.jpg" />

Upvotes: 6
