Reputation: 108
I need some help. I have looked into regex, but I do not yet fully understand how to use it. I need a snippet that removes every tag, along with its children, when the tag has one of the given classes or ids.
Example:
<?php
function remove_tag($find, $html)
{
    # Remove multiple #IDs and classes at once
    # When given a string (separating objects with a comma)
    if (is_string($find))
    {
        $objects = explode(',', str_replace(' ', '', $find));
    } else if (is_array($find)) {
        $objects = $find;
    }
    foreach ($objects as $object)
    {
        # If ID
        if (substr($object,0,1) == '#')
        {
            # regex to remove an id
            # Ex: '<ANYTAG [any number of attributes] id='/"[any number of ids] NEEDLE [any number of ids]'/" [any number of attributes]>[anything]</ENDTAG [anything]>'
        }
        # If class
        if (substr($object,0,1) == '.')
        {
            # remove a class
            # Ex: '<ANYTAG [any number of attributes] class='/"[any number of classes] NEEDLE [any number of classes]'/" [any number of attributes]>[anything]</ENDTAG [anything]>'
        }
        # somehow remove it from the $html variable?
    }
}
Sorry if this is a newbie question, thank you for your time! :)
-Pat
Upvotes: 0
Views: 826
Reputation: 2916
Instead of regex, you can use XPath to find all the elements in the document that you want to remove.
DOMDocument and DOMXPath seem like a good starting point to me.
Use the DOMXPath class to evaluate an XPath expression and obtain the nodes you need to remove, then call the DOMNode::removeChild() method on each node's parent to remove that node together with its children.
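As a rough illustration of the DOMXPath plus removeChild() approach, here is a minimal sketch. The function name remove_by_selector is hypothetical and not from the question. The sketch assumes selector names contain no double quotes and that the input is an HTML fragment that DOMDocument::loadHTML() can parse. Note that saveHTML() returns a full document, so the result will normally be wrapped in a doctype and html/body tags.

```php
<?php
// Hypothetical helper (name is an assumption): removes every element, with
// its children, whose id or class matches one of the given selectors,
// e.g. '#sidebar, .ad'. Assumes names contain no double quotes.
function remove_by_selector(string $html, $find): string
{
    // Accept either a comma-separated string or an array of selectors.
    $selectors = is_array($find) ? $find : explode(',', $find);

    $doc = new DOMDocument();
    // Collect libxml warnings about imperfect markup instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);

    foreach ($selectors as $selector) {
        $selector = trim($selector);
        if (strlen($selector) < 2) {
            continue; // nothing usable after the '#' or '.' prefix
        }
        $name = substr($selector, 1);

        if ($selector[0] === '#') {
            // Match the id attribute exactly.
            $query = sprintf('//*[@id="%s"]', $name);
        } elseif ($selector[0] === '.') {
            // Match one class token among possibly several classes.
            $query = sprintf(
                '//*[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
                $name
            );
        } else {
            continue; // not an id or class selector
        }

        // Copy the matches into an array before modifying the tree.
        foreach (iterator_to_array($xpath->query($query)) as $node) {
            if ($node->parentNode !== null) {
                $node->parentNode->removeChild($node);
            }
        }
    }

    return $doc->saveHTML();
}
```

For example, remove_by_selector($html, '#header, .ad') would drop the element with id "header" and every element carrying the class "ad", including everything nested inside them.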
Upvotes: 2