Reputation: 13
I want to match a single letter, digit, or symbol wherever it occurs inside an inline style attribute.
Example:
<html>
<head>
</head>
<body>
<p style="color: #48ad64;font-weight:10px;">hi there</p>
<div style="background-color: #48ad64;">
<h3>perfect</h3>
</div>
</body>
</html>
I want to match any one of c, o, #, 4, ; or -.
If we take o, for example, it should match 6 occurrences.
I want to replace every occurrence within a style declaration using preg_replace().
How can I get this? I tried so many different expressions, but none of them did what I want.
Some of what I tried:
/(?:\G(?!^)|\bstyle=")(?:.{0,}?)(o)(?=[^>]*>)/
/(style=")(?:\w+)(o)(([^"]*)")/
I just need the regex to match every o inside the style attributes of my HTML. I expect this:
<html>
<head>
</head>
<body>
<p style="c'o'l'o'r: #48ad64;f'o'nt-weight:10px;">how blabla</p>
<div style="backgr'o'und-c'o'l'o'r: #48ad64;">
<h3>perfect normal o moral bla bal</h3>
</div>
</body>
</html>
I just want every occurrence of o inside the inline styles above to be replaced with 'o'.
Upvotes: 1
Views: 232
Reputation: 48011
A quick/dirty/simple solution is to use preg_replace_callback() with str_replace().
Pattern: /<[^<]+ style="\K.*?(?=">)/
Code:
$html='<html>
<head>
</head>
<body>
<p style="color: #48ad64;font-weight:10px;">hi there</p>
<div style="background-color: #48ad64;">
<h3>perfect</h3>
</div>
</body>
</html>';
$needle = "o";
// For case-insensitive matching, add the i flag to the pattern and swap str_replace() for str_ireplace().
echo preg_replace_callback(
    '/<[^<]+ style="\K.*?(?=">)/',
    function ($m) use ($needle) {
        return str_replace($needle, "<b>$needle</b>", $m[0]);
    },
    $html
);
Output:
<html>
<head>
</head>
<body>
<p style="c<b>o</b>l<b>o</b>r: #48ad64;f<b>o</b>nt-weight:10px;">hi there</p>
<div style="backgr<b>o</b>und-c<b>o</b>l<b>o</b>r: #48ad64;">
<h3>perfect</h3>
</div>
</body>
</html>
This is a pure regex replacement method/pattern:
$needle = "o";
// \Q...\E makes the needle value literal; \K restarts the full-string match so only the needle is replaced.
// [^"]*? assumes there is no escaped " inside the style value.
echo preg_replace('/(?:\G(?!^)|\bstyle=")[^"]*?\K\Q' . $needle . '\E/', "'$needle'", $html);
The [^"]*? component eliminates the need for a lookahead. However, if a font family name (or similar) were to use \" (escaped double quotes), then replacement accuracy would be negatively impacted.
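Since the question also asks about symbols such as # and ;, here is a minimal sketch of the same pure-regex idea with the needle escaped by preg_quote() instead of \Q...\E, so that a needle containing the / delimiter does not break the pattern. The function name quoteNeedleInStyles is hypothetical and not part of the code above, and the sample markup is made up for illustration.

```php
<?php
// Hypothetical wrapper around the pure-regex approach shown above.
function quoteNeedleInStyles(string $html, string $needle): string
{
    // preg_quote() escapes regex metacharacters and, via its second
    // argument, the "/" delimiter as well.
    $pattern = '/(?:\G(?!^)|\bstyle=")[^"]*?\K' . preg_quote($needle, '/') . '/';

    // Using a callback means a "$" or "\" in the needle is never read as a
    // backreference in the replacement string.
    return preg_replace_callback(
        $pattern,
        function ($m) {
            return "'" . $m[0] . "'";
        },
        $html
    );
}

$sample = '<p style="color: #48ad64;">a/b</p>';
echo quoteNeedleInStyles($sample, '#'), PHP_EOL;
// prints <p style="color: '#'48ad64;">a/b</p>
```

The same caveats as the original pattern still apply; this only changes how the needle is escaped.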
I wouldn't call either of these methods "robust" because certain substrings of text may trick the pattern into "over-matching" illegitimate style substrings.
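To make the over-matching risk concrete, here is a small sketch with a made-up input in which style="..." appears only as visible text, not as an attribute. The pure-regex pattern cannot tell the difference and rewrites it anyway.

```php
<?php
// Hypothetical input: the paragraph text mentions style="..." but the
// <p> tag itself has no style attribute.
$text = '<p>Write style="color: red" on the tag.</p>';
$needle = 'o';

// Same pure-regex replacement as above.
$result = preg_replace(
    '/(?:\G(?!^)|\bstyle=")[^"]*?\K\Q' . $needle . '\E/',
    "'$needle'",
    $text
);

echo $result, PHP_EOL;
// prints <p>Write style="c'o'l'o'r: red" on the tag.</p>
```

The text node was modified even though no real style attribute exists, which is the kind of false positive a parser-based approach avoids.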
To do this properly, I suggest that you use DOMDocument or some other HTML parser to ensure you are only modifying genuine style attributes.
DOMDocument code:
$dom = new DOMDocument;
// The flags stop libxml from adding implied <html>/<body> wrappers and a DOCTYPE.
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xp = new DOMXPath($dom);
foreach ($xp->query('//*[@style]') as $node) {
    // Plain string replacement on the attribute value; no regex needed.
    $node->setAttribute('style', str_replace($needle, "'$needle'", $node->getAttribute('style')));
}
echo $dom->saveHTML();
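If you want to reuse the parser-based approach, it could be wrapped in a function along these lines. This is only a sketch: the name quoteInStyles and the sample markup are hypothetical, and the libxml error handling is an addition that the code above does not have.

```php
<?php
// Hypothetical helper: wraps every $needle found in a style attribute in
// single quotes and returns the serialized HTML.
function quoteInStyles(string $html, string $needle): string
{
    $dom = new DOMDocument;

    // Collect libxml warnings internally instead of emitting them
    // (an assumption for this sketch; drop it if you want to see them).
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//*[@style]') as $node) {
        $node->setAttribute(
            'style',
            str_replace($needle, "'$needle'", $node->getAttribute('style'))
        );
    }

    return $dom->saveHTML();
}

$sample = '<div style="color: red;"><h3>normal o moral</h3></div>';
$out = quoteInStyles($sample, 'o');
echo $out;
```

Only the attribute value is touched here, so the o characters in the heading text should come through unchanged.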
Upvotes: 2