PX Developer

Reputation: 8145

PHP: Filter some html attributes using preg_replace. Get the first </span>, not the last

I'm trying to filter an HTML file to remove some attributes. Specifically, I want to remove all spans except the ones that set a color. In the spans that set a color, I want to remove every attribute except style='color...'.

For example, if I have:

<span lang=EN-US>This is a </span>
<span id="myspan" style='color:red;text-align:left;'>test</span>
<span lang=EN-US> to remove spans.</span>

I want it to be:

This is a
<span style='color:red'>test</span>
to remove spans.

To do this I'm using preg_replace. I created this regex:

preg_replace(
    '%(<span [^>]*color\:)([a-z]*)(;|\')([^>]*>)(.*)(<\/span>)%s',
    "<qwerty style='color:$2'>$5</qwerty>",
    $myText
);

After this, I remove all the remaining spans with strip_tags and then rename every <qwerty> back to <span>.

My problem is that the content between <span> and </span> (the (.*) group in my regex) captures all the text up to the end:

This is a 
<span style='color:red'>test
to remove spans.</span>

I want it to match only up to the first </span>, but right now it matches up to the last </span>. How can I do this?

Thanks!

Upvotes: 3

Views: 496

Answers (1)

Marek

Reputation: 7433

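Use the ungreedy (U) modifier, which makes the quantifiers in the pattern lazy by default, so (.*) stops at the first </span>: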

preg_replace('%....%sU', .....);
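
For illustration, here is a minimal sketch of the whole pipeline from the question with the U modifier added. The pattern and replacement are copied from the question; the function name keepColorSpans is hypothetical, and the sample input is the one from the question.

```php
<?php
// Hypothetical helper name; pattern and replacement are taken from the question.
function keepColorSpans(string $html): string
{
    // "s" lets . match newlines; "U" makes every quantifier lazy by default,
    // so (.*) stops at the first </span> after the coloured opening tag.
    $marked = preg_replace(
        '%(<span [^>]*color\:)([a-z]*)(;|\')([^>]*>)(.*)(<\/span>)%sU',
        "<qwerty style='color:$2'>$5</qwerty>",
        $html
    );

    // Drop every remaining tag except the temporary <qwerty> marker.
    $stripped = strip_tags($marked, '<qwerty>');

    // Rename the marker back to <span>.
    return str_replace(['<qwerty', '</qwerty>'], ['<span', '</span>'], $stripped);
}

$html = "<span lang=EN-US>This is a </span>\n"
      . "<span id=\"myspan\" style='color:red;text-align:left;'>test</span>\n"
      . "<span lang=EN-US> to remove spans.</span>";

echo keepColorSpans($html), "\n";
```

Note that U flips the greediness of every quantifier in the pattern, not just (.*). If only that one group should be lazy, writing it as (.*?) without the U modifier is a narrower alternative.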

Upvotes: 1
