Patrick Murray

Reputation: 108

PHP: Remove all tags that contain the given class or id

I need some help. I have looked into regex, but I do not yet fully understand how to use it. I need a snippet that removes all tags, along with their children, when the tag has any of the given classes or ids.

Example:

<?php

function remove_tag($find, $html)
{
    # Remove multiple #IDs and classes at once

    # When given a string (separating objects with a comma)
    if (is_string($find))
    {
        $objects = explode(',', str_replace(' ', '', $find));
    } else if (is_array($find)) {
        $objects = $find;
    }

    foreach ($objects as $object)
    {
        # If ID
        if (substr($object,0,1) == '#')
        {
            # regex to remove an id
            # Ex: '<ANYTAG [any number of attributes] id='/"[any number of ids] NEEDLE [any number of ids]'/" [any number of attributes]>[anything]</ENDTAG [anything]>'

        }

        if (substr($object,0,1) == '.')
        {
            # remove a class
            # Ex: '<ANYTAG [any number of attributes] class='/"[any number of classes] NEEDLE [any number of classes]'/" [any number of attributes]>[anything]</ENDTAG [anything]>'
        }

        # somehow remove it from the $html variable?
    }
}

Sorry if this is a newbie question, thank you for your time! :)

-Pat

Upvotes: 0

Views: 826

Answers (1)

Lumbendil

Reputation: 2916

Instead of regex, you can use XPath to find all the elements in the document that you want to remove.

DOMDocument and DOMXPath are a good place to start.

Use the DOMXPath class to evaluate an XPath expression and obtain the nodes you need to remove, then call DOMNode::removeChild() on each node's parent to remove the node together with its children.
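
A minimal sketch of that approach, in case it helps. The function name remove_tags_by_selector and its argument order are made up for illustration, not taken from the question. It assumes $html is a fragment rather than a full document, that the input is ASCII or entity-encoded (loadHTML may mangle raw UTF-8 without an encoding hint), and that selector names contain only letters, digits, underscores and hyphens.

```php
<?php

/**
 * Hypothetical helper: remove every element (with its children) whose id or
 * class matches one of the given selectors ('#some-id' or '.some-class').
 * $selectors may be a comma-separated string or an array.
 */
function remove_tags_by_selector(string $html, $selectors): string
{
    if (is_string($selectors)) {
        $selectors = explode(',', $selectors);
    }

    $doc = new DOMDocument();
    // Collect parser warnings (e.g. unknown HTML5 tags) instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a <div> so there is a single known container.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $root  = $doc->getElementsByTagName('div')->item(0);
    $xpath = new DOMXPath($doc);

    foreach ($selectors as $selector) {
        $selector = trim($selector);
        // Assumption: only simple names are accepted; anything else is skipped.
        if (!preg_match('/^([#.])([A-Za-z_][\w-]*)$/', $selector, $m)) {
            continue;
        }
        if ($m[1] === '#') {
            $expr = ".//*[@id='{$m[2]}']";
        } else {
            // Match the class as a whole word within the class attribute.
            $expr = ".//*[contains(concat(' ', normalize-space(@class), ' '), ' {$m[2]} ')]";
        }
        // Copy the result list to an array before modifying the tree.
        foreach (iterator_to_array($xpath->query($expr, $root)) as $node) {
            if ($node->parentNode !== null) {
                $node->parentNode->removeChild($node);
            }
        }
    }

    // Serialize only the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Calling remove_tags_by_selector($html, '#sidebar, .ad') would then drop the element with id "sidebar" and every element carrying the class "ad". Note that the parser may normalize the remaining markup slightly (for example quoting attributes or closing unclosed tags).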

Upvotes: 2
