Victor

Reputation: 621

How to set new HTML tag in custom class that extends DOMElement (using DOMDocument in php)?

I need a fast HTML parser written in PHP. I first tried some existing parsers (such as Ganon and QueryPath), but they were too slow for my project. In the end I decided to use PHP's built-in DOMDocument, which was the fastest of them all. It only offers a bare set of methods, though, so I had to start building my own on top of it.

I'm writing a class that extends DOMElement. New methods such as 'addText' work fine, but I have a problem when I want to change the tag name.

To change the tag name, the node has to be replaced by a newly created node. Because the replacement is a different object, any further calls on the original variable no longer affect the node in the document.

UPDATE: For now, I've added a return $newNode; in the newTag method and I'm using it like this: $node = $node->newTag('h1'); but for consistency I would really like to use just: $node->newTag('h1');

Please see the code (simplified):

        <?php


        class my_element extends DOMElement {

            public function __construct() { parent::__construct();}

            public function newTag($newTagName) {

                $newNode = $this->ownerDocument->createElement($newTagName);
                $this->parentNode->replaceChild($newNode, $this);

                foreach ($this->attributes as $attribute) {
                    $newNode->setAttribute($attribute->name, $attribute->value);
                }
                foreach (iterator_to_array($this->childNodes) as $child) {
                    $newNode->appendChild($this->removeChild($child));
                }
                // at this point, $newNode should become $this... How???


            }

            //append plain text
            public function addText ($text = '') {
                $textNode = $this->ownerDocument->createTextNode($text);
                $this->appendChild($textNode);
            }

            //... some other methods
        }


        $html = '<div><p></p></div>';

        $dom = new DOMDocument;
        $dom->loadHTML($html);
        $xPath = new DOMXPath($dom);
        $dom->registerNodeClass("DOMElement", "my_element"); //extend DOMElement class

        $nodes = $xPath->query('//p'); //select all 'p' nodes
        $node = $nodes->item(0); // get the first


        // Start to change the selected node
        $node->addText('123');
        $node->newTag('h1');
        $node->addText('345'); // This is not working because the node has changed!

        echo $dom->saveHTML();

This code will output <div><h1>123</h1></div>. As you can see, the text 345 was not added after I changed the tag name.

What can be done in order to continue to work with the selected node? Is it possible to set the new node as the current node in the 'newTag' method?

Upvotes: 0

Views: 356

Answers (1)

Alf Eaton

Reputation: 5463

The ideal solution would be DOMDocument::renameNode(), but it isn't available in PHP yet.

Perhaps this would work instead, called as $node = $node->parentNode->renameChild($node, 'h1'):

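<?php

// Extends DOMElement (not DOMNode) so that it can be registered with
// $dom->registerNodeClass('DOMElement', 'MyDOMNode') and is then used
// for the parent element that renameChild() is called on.
class MyDOMNode extends DOMElement {
    public function renameChild($node, $name) {
        // Create the replacement element with the new tag name
        $newNode = $this->ownerDocument->createElement($name);

        // Copy all attributes over to the new element
        foreach ($node->attributes as $attribute) {
            $newNode->setAttribute($attribute->name, $attribute->value);
        }

        // Move (not copy) every child into the new element
        while ($node->firstChild) {
            $newNode->appendChild($node->firstChild);
        }

        // Swap the old element for the new one in the document
        $this->replaceChild($newNode, $node);

        return $newNode;
    }
}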
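
For completeness, here is a minimal end-to-end sketch of how that call might fit into the code from the question. The class name RenamableElement and the class="intro" attribute are made up for illustration, and I'm assuming the class is registered for DOMElement so that the parent element picks up renameChild(). addText() is copied from the question.

```php
<?php

// Hypothetical class name. renameChild() follows the approach above,
// addText() is the helper from the question.
class RenamableElement extends DOMElement
{
    public function renameChild(DOMElement $node, string $name): DOMElement
    {
        $newNode = $this->ownerDocument->createElement($name);

        // Copy attributes to the replacement element
        foreach ($node->attributes as $attribute) {
            $newNode->setAttribute($attribute->name, $attribute->value);
        }

        // Move the children across one by one
        while ($node->firstChild) {
            $newNode->appendChild($node->firstChild);
        }

        $this->replaceChild($newNode, $node);

        return $newNode;
    }

    public function addText(string $text = ''): void
    {
        $this->appendChild($this->ownerDocument->createTextNode($text));
    }
}

$dom = new DOMDocument();
// Register for DOMElement so every element (including the parent <div>)
// is exposed as a RenamableElement.
$dom->registerNodeClass('DOMElement', 'RenamableElement');
$dom->loadHTML('<div><p class="intro"></p></div>');

$xPath = new DOMXPath($dom);
$node = $xPath->query('//p')->item(0);

$node->addText('123');
// Reassign $node so later calls act on the element that is in the document
$node = $node->parentNode->renameChild($node, 'h1');
$node->addText('345');

echo $dom->saveHTML($node->parentNode), "\n";
```

Because $node is reassigned to the returned element, the second addText() call ends up inside the new h1 instead of the detached p. This still needs the assignment, since an object cannot replace its own $this from inside a method.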

Upvotes: 1
