Christian

Reputation: 436

My regex is not replacing correctly

I have this regex: /(?:(?<=(?:style=["])).*)(line-height.*?)[;"]/i

$regex = '/(?:(?<=(?:style=["])).*)(line-height.*?)[;"]/i';

preg_replace("/(?:(?<=(?:style=[\"'])).*)(line-height.*?)[;\"]/i", "HELLO", $input);

This is the input:

<li><span style="line-height: 20.14399986267089px">500.00dkk</span></li>
<li style="color:red; line-height: 21.14399986267089px"></li>

I want to replace only the occurrences of "line-height: SOMENUMBERpx" with HELLO (they also have to be preceded by the style attribute), but I cannot make it work correctly. Right now it replaces the line-height properties, but it also replaces color:red, which I do not want.

This is the output I want:

<li><span style=HELLO>500.00dkk</span></li> 
<li style="color:red; HELLO"></li>

Can anyone see what I am doing incorrectly?

Upvotes: 2

Views: 134

Answers (4)

hek2mgl

Reputation: 157947

I would use a DOM parser to extract the style attributes and modify the contents using preg_replace():

$input = <<<EOF
<li><span style="line-height: 20.14399986267089px">500.00dkk</span></li>
<li style="color:red; line-height: 21.14399986267089px"></li>
EOF;

# Create a document from input
$doc = new DOMDocument();
$doc->loadHTML($input);

# Create an XPath selector
$selector = new DOMXPath($doc);

# Modify values of the style attributes
foreach($selector->query('//@style') as $style) {
    $style->nodeValue = preg_replace(
        '/line-height:\s*[0-9]+(\.[0-9]+)?px\s*;?/',
        'HELLO;',
        $style->nodeValue
    );
}

# Output the modified document
echo $doc->saveHTML();

The advantage of using DOM and XPath is that you can reliably access the style attributes at any nesting level, even if the HTML content gets freaky. It is also easy to maintain if the HTML structure changes in the future, or if you want to specify more precisely which style attributes should change.

Take the following query, for example: it selects only the style attributes of <span> tags that have the class even and are descendants (at any nesting level) of a div with id="foo".

//div[@id="foo"]//span[contains(@class, "even")]/@style

You'll have a lot of fun if you try this with a regex! :)
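As a rough sketch of how that narrower query could be plugged into the loop above (the sample markup and the variable names $html, $xpath, $query and $changed are made up for illustration, they are not from the question):

```php
<?php
// Hypothetical markup: only the first span is inside div#foo AND has class "even".
$html = <<<'HTML'
<div id="foo">
  <span class="row even" style="color:red; line-height: 20.5px">a</span>
  <span class="row odd" style="line-height: 21.5px">b</span>
</div>
<span class="even" style="line-height: 22.5px">c</span>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Same query as in the text: style attributes of matching spans below div#foo.
$query = '//div[@id="foo"]//span[contains(@class, "even")]/@style';

$changed = [];
foreach ($xpath->query($query) as $style) {
    $style->nodeValue = preg_replace(
        '/line-height:\s*[0-9]+(\.[0-9]+)?px/',
        'HELLO',
        $style->nodeValue
    );
    $changed[] = $style->nodeValue;
}

print_r($changed);
```

Only one attribute should be touched here; the "odd" span and the span outside the div keep their line-height. Note that contains(@class, "even") is a plain substring test, so a class such as "uneven" would match as well.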


About the CSS part: I decided to use a regex for that, since the only thing I could imagine breaking the regex would be something like:

<span style="background:url('line-height:2px');">

Since line-height:2px is a valid UNIX filename, the above could be possible. But hey! :) If you really care about that you would need to use a CSS parser for that job.

Upvotes: 3

vks

Reputation: 67968

You can use \K here. \K resets the starting point of the reported match: any previously consumed characters are no longer included in the final match.

style=.*?\Kline-height.*?(?=[;"])

Try this.

This makes sure that only line-height... is replaced, and that it is preceded by style= as well.
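A minimal sketch of that pattern used with preg_replace() on the input from the question (the delimiters and the variable names $input and $result are my own choice):

```php
<?php
$input = <<<'EOF'
<li><span style="line-height: 20.14399986267089px">500.00dkk</span></li>
<li style="color:red; line-height: 21.14399986267089px"></li>
EOF;

// Everything matched before \K (style= and whatever precedes line-height)
// is dropped from the reported match, so only the declaration is replaced.
// The lookahead (?=[;"]) stops before the terminator without consuming it.
$result = preg_replace('/style=.*?\Kline-height.*?(?=[;"])/', 'HELLO', $input);

echo $result, "\n";
// <li><span style="HELLO">500.00dkk</span></li>
// <li style="color:red; HELLO"></li>
```

One caveat: .*? is not limited to the attribute value, so if a style attribute contains no line-height, the match could run on to a later line-height on the same line.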

Upvotes: 2

Christian

Reputation: 436

I figured out that I could use the references for each group to insert that group back into the replacement, so I did not lose the color:red part.

preg_replace ('/(?<=style=["])(.*)(line-height.*?)[;"]/', '$1HELLO', $input);

This gave me the desired result.
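One thing to watch for, as far as I can tell: the trailing [;"] is part of the match, so the terminating ; or " is replaced along with the declaration. A possible variant (my own sketch, not the code above; $consuming and $keeping are illustrative names) swaps it for a lookahead so the terminator is left in place:

```php
<?php
$input = <<<'EOF'
<li><span style="line-height: 20.14399986267089px">500.00dkk</span></li>
<li style="color:red; line-height: 21.14399986267089px"></li>
EOF;

// Pattern from the answer: [;"] is consumed, so the closing quote is replaced too.
$consuming = preg_replace('/(?<=style=["])(.*)(line-height.*?)[;"]/', '$1HELLO', $input);

// Variant: a lookahead checks for ; or " without consuming it,
// and a lazy (.*?) keeps group 1 as short as possible.
$keeping = preg_replace('/(?<=style=")(.*?)line-height.*?(?=[;"])/', '$1HELLO', $input);

echo $consuming, "\n\n", $keeping, "\n";
```

With the variant, $1 still carries color:red; back into the output, and the attribute keeps its closing quote.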

Upvotes: 0

apgp88

Reputation: 985

This is the regular expression you want

(line-height:\s+\d+\.\d+px)
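A quick check of what this pattern accepts (the sample strings and the looser $tolerant pattern are my own, added for comparison). It requires at least one whitespace character after the colon and a decimal point in the number, and it does not check for a surrounding style attribute:

```php
<?php
$strict   = '/(line-height:\s+\d+\.\d+px)/';
// Hypothetical looser variant: optional whitespace, optional fraction.
$tolerant = '/line-height:\s*\d+(?:\.\d+)?px/';

$samples = [
    'line-height: 20.14399986267089px', // value from the question
    'line-height:20.5px',               // no space after the colon
    'line-height: 20px',                // no decimal part
];

$results = [];
foreach ($samples as $css) {
    // preg_match() returns 1 on a match and 0 otherwise.
    $results[$css] = [preg_match($strict, $css), preg_match($tolerant, $css)];
    printf("%-34s strict=%d tolerant=%d\n", $css, ...$results[$css]);
}
```

For the input in the question the strict version is enough; the looser one only matters if the values can also be written without a space or without a fraction.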


Upvotes: 0
