Reputation: 3701
I want to check all the tags under the body and remove the style attribute from any tag that has one. I have tried:
$user_submitted_html = "This is Some Text";
$html = '<body>' . $user_submitted_html . '</body>';
$dom = new DOMDocument();
$dom->loadHTML($html);
$elements = $dom->getElementsByTagName('body');
foreach($elements as $element) {
foreach($element->childNodes as $child) {
if($child->hasAttribute('style')) {
$child->removeAttribute('style');
}
}
}
It works fine when $user_submitted_html contains some tags, but if it is only text, it gives the error:
Call to undefined method DOMText::hasAttribute()
Then I printed the nodeName inside the foreach loop:
echo "Node Name: " . $child->nodeName;
It prints:
Node Name: #text
What kind of node name is this? I have echoed other nodes and they give div, span, etc., which I am familiar with. I want to know which node types do not have the hasAttribute method, so I can add a condition before calling hasAttribute, like this:
if($child->nodeName=="#text") {
continue; // skip to next iteration
}
if($child->hasAttribute('style')) {
.
.
.
Or is there any other solution?
One more question: what if I remove the style attribute only from <div>, <span>, <p> and <a>? Will that be safe from XSS if the rest of the tags can still use the style attribute?
Upvotes: 0
Views: 1704
Reputation: 6449
You can use XPath to select only the elements that have a style attribute:
$xpath = new DOMXPath($dom);
$elements = $xpath->query('//*[@style]');
foreach($elements as $e) {
$e->removeAttribute('style');
}
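For completeness, here is a minimal end-to-end sketch of the XPath approach, assuming the same `<body>` wrapping used in the question. The sample input string and the `$clean` variable are made up for illustration. Because the query matches only elements, a DOMText node never reaches `removeAttribute()`, and elements at any depth are covered, not just direct children of the body.

```php
<?php
// Hypothetical sample input: plain text mixed with nested styled tags.
$user_submitted_html = 'Plain text <div style="color:red"><span style="font-size:9px">hi</span></div>';
$html = '<body>' . $user_submitted_html . '</body>';

$dom = new DOMDocument();
// "@" suppresses the warnings libxml may raise for imperfect user markup.
@$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
// '//*[@style]' selects every element, at any depth, that carries a style attribute.
foreach ($xpath->query('//*[@style]') as $e) {
    $e->removeAttribute('style');
}

// Serialize only the children of <body>, i.e. the cleaned user markup.
$body = $dom->getElementsByTagName('body')->item(0);
$clean = '';
foreach ($body->childNodes as $child) {
    $clean .= $dom->saveHTML($child);
}
echo $clean;
```

Note that this only strips style attributes; on its own it is not a complete XSS filter.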
Upvotes: 0
Reputation: 352
I think that instead of checking the nodeName, it would be better to check which class $child is an instance of. Only DOMElement provides hasAttribute(); other node types such as DOMText do not.
if ( $child instanceof DOMElement )
{
//do your stuff
}
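As a rough sketch of how that check could fit into the original loop, the helper below (the name `strip_style_attributes` is hypothetical) applies the `instanceof` test and also descends into nested nodes, since the loop in the question only visits direct children of the body. The sample input is made up for illustration.

```php
<?php
// Hypothetical helper: removes style attributes from $node and all its descendants.
function strip_style_attributes(DOMNode $node)
{
    // Only DOMElement has hasAttribute(); DOMText, DOMComment, etc. are skipped.
    if ($node instanceof DOMElement && $node->hasAttribute('style')) {
        $node->removeAttribute('style');
    }
    // Guard the recursion so nodes without children are never iterated.
    if ($node->hasChildNodes()) {
        foreach ($node->childNodes as $child) {
            strip_style_attributes($child);
        }
    }
}

// Hypothetical sample input: bare text followed by nested styled tags.
$user_submitted_html = 'This is Some Text <p style="color:red">and a <b style="font-size:9px">tag</b></p>';
$dom = new DOMDocument();
@$dom->loadHTML('<body>' . $user_submitted_html . '</body>');

$body = $dom->getElementsByTagName('body')->item(0);
strip_style_attributes($body);

// Serialize only the children of <body>.
$result = '';
foreach ($body->childNodes as $child) {
    $result .= $dom->saveHTML($child);
}
echo $result;
```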
Upvotes: 1