kaese

Reputation: 10599

How do I remove *some* inline CSS, but not all, using PHP?

A PHP regex / PHP DOM / PHP XPath question.

Given the following HTML with inline CSS:

<p style='text-indent: 22px; font-weight: bold; line-height: 1em; color: #FFF'>

How do I remove the 'line-height' and 'color' CSS properties, and leave text-indent and font-weight untouched, so the resultant HTML is:

<p style='text-indent: 22px; font-weight: bold;'>

The HTML file could be potentially hundreds of lines, with various nesting of tags and other attributes applied to any tag.

Note that the 'style' attribute may be applied to tags other than <p>.

I am aware there are approaches using both PHP DOM and regex - my current thinking was using something along these lines:

$elements = $xPath->query('//*[@style="color"]');
foreach ($elements as $element) {   
  //remove style='color'
}

Many thanks

EDIT

Here's my solution, which uses the PHP-CSS-Parser library:

https://github.com/sabberworm/PHP-CSS-Parser

The code:

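// Parse the HTML. The XML declaration prefix makes libxml read the input as UTF-8,
// and @ suppresses warnings about imperfect markup.
$dom = new DOMDocument;
@$dom->loadHTML('<?xml encoding="UTF-8">' . $html);
$xPath = new DOMXPath($dom);
// Only <p> and <span> elements are processed here.
$elements = $xPath->query('//p|//span');
foreach ($elements as $element) {
    // Wrap the inline style in a dummy "p" rule so the parser sees a complete rule set.
    $oParser = new CSSParser("p{" . $element->getAttribute('style') . "}");
    $oCss = $oParser->parse();
    foreach ($oCss->getAllRuleSets() as $oRuleSet) {
        // A trailing dash removes every property starting with that prefix,
        // so 'font-' also removes font-weight.
        $oRuleSet->removeRule('line-');
        $oRuleSet->removeRule('margin-');
        $oRuleSet->removeRule('font-');
    }
    $css = $oCss->__toString();
    // Strip the dummy wrapper again: the first three characters and the closing brace.
    $css = substr_replace($css, '', 0, 3);
    $css = substr_replace($css, '', -1, 1);
    $element->setAttribute('style', $css);
}
$src = $dom->saveHTML();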

Upvotes: 0

Views: 855

Answers (1)

outis

Reputation: 77400

Definitely use proper HTML and CSS parsers rather than regexes. For the XPath query, use the contains function to find the nodes to alter:

//*[contains(@style, 'color:')]

Then use a CSS parser to remove the properties you don't want.
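
As a rough sketch of how the two steps fit together, the snippet below selects every element that has a style attribute and rewrites it without the unwanted properties. The function name stripInlineProperties and the "wrapper" container are hypothetical, introduced only for this illustration. To stay self-contained it splits declarations on ';' and ':' instead of calling a real CSS parser, so it will mishandle values that contain semicolons (quoted strings, data URIs); treat that loop as a placeholder for a proper parser. It queries //*[@style] rather than contains(@style, 'color:') because the latter would also match background-color: and would miss styles written as "color :".

```php
<?php
// Hypothetical helper (not from the original post): removes the named
// properties from every inline style attribute in an HTML fragment.
function stripInlineProperties(string $html, array $unwanted): string
{
    $dom = new DOMDocument();
    // Wrap the fragment in a known container so it can be extracted afterwards.
    // The XML declaration prefix asks libxml to treat the input as UTF-8;
    // @ suppresses warnings about imperfect markup.
    @$dom->loadHTML('<?xml encoding="UTF-8"><div id="wrapper">' . $html . '</div>');
    $xPath = new DOMXPath($dom);

    foreach ($xPath->query('//*[@style]') as $element) {
        $kept = [];
        // Naive declaration split: an assumption standing in for a CSS parser.
        foreach (explode(';', $element->getAttribute('style')) as $declaration) {
            $parts = explode(':', $declaration, 2);
            if (count($parts) !== 2) {
                continue; // empty or malformed chunk
            }
            $name = strtolower(trim($parts[0]));
            if (!in_array($name, $unwanted, true)) {
                $kept[] = $name . ': ' . trim($parts[1]);
            }
        }
        if ($kept) {
            $element->setAttribute('style', implode('; ', $kept) . ';');
        } else {
            // Nothing left, so drop the attribute entirely.
            $element->removeAttribute('style');
        }
    }

    // Serialize only the children of the wrapper, not the whole document.
    $wrapper = $xPath->query('//div[@id="wrapper"]')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

// Usage with the markup from the question.
$html = "<p style='text-indent: 22px; font-weight: bold; line-height: 1em; color: #FFF'>Hello</p>";
echo stripInlineProperties($html, ['line-height', 'color']), "\n";
```

Matching on exact property names keeps font-weight intact while still removing line-height and color; if prefix matching is wanted instead, the in_array check is the place to change.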

Upvotes: 3
