chocolata

Reputation: 3338

Wrap all HTML tags between h3 tag sets with DOMDocument in PHP

I've got a follow-up to an earlier question of mine that Jack answered: Wrap segments of HTML with divs (and generate table of contents from HTML-tags) with PHP

I've been trying to add some functionality to the answer above, in order to get the following result.

This is my present HTML:

<h3>Subtitle</h3>
<p>This is a paragraph</p>
<p>This is another paragraph</p>
<h3>Another subtitle</h3>
<p>Yet another paragraph</p>

This is what I would like to achieve:

<h3 class="current">Subtitle</h3>
<div class="ac_pane" style="display:block;">
  <p>This is a paragraph</p>
  <p>This is another paragraph</p>
</div>
<h3>Another subtitle</h3>
<div class="ac_pane">
  <p>Yet another paragraph</p>
</div>

I've been trying to modify the code from the answer linked above, but can't figure it out:

foreach ($d->getElementsByTagName('h3') as $h3) {
    $ac_pane_nodes = array($h3);
    for ($next = $h3->nextSibling; $next && $next->nodeName != 'h3'; $next = $next->nextSibling) {
        $ac_pane_nodes[] = $next;
    }
    $ac_pane = $d->createElement('div');
    $ac_pane->setAttribute('class', 'ac_pane');
    // Here I'm trying to wrap all tags between h3-sets, but am failing!
    $h3->parentNode->appendChild($ac_pane, $h3);
    foreach ($ac_pane_nodes as $node) {
        $ac_pane->appendChild($node);
    }
}

Please note that the addition of class="current" to the first h3 set, and the addition of style="display:block;" to the first div.ac_pane is optional, but would be very much appreciated.

Upvotes: 2

Views: 1558

Answers (1)

matb33

Reputation: 2820

As requested, here is a working version. IMO XSLT is still the solution most appropriate to this type of problem (transforming some XML into other XML, really) but I have to admit grouping with regular code is much easier!

I ended up extending the DOM API slightly just to add a utility insertAfter method on DOMElement. It could have been done without it, but it's neater:

UPDATED TO WRAP DIV AROUND ALL TAGS AS REQUESTED IN COMMENTS

<?php

class DOMDocumentExtended extends DOMDocument {
    public function __construct($version = "1.0", $encoding = "UTF-8") {
        parent::__construct($version, $encoding);
        $this->registerNodeClass("DOMElement", "DOMElementExtended");
    }
}

class DOMElementExtended extends DOMElement {
    public function insertAfter($targetNode) {
        if ($targetNode->nextSibling) {
            $targetNode->parentNode->insertBefore($this, $targetNode->nextSibling);
        } else {
            $targetNode->parentNode->appendChild($this);
        }
    }

    public function wrapAround(DOMNodeList $nodeList) {
        while (($node = $nodeList->item(0)) !== NULL) {
            $this->appendChild($node);
        }
    }
}

$doc = new DOMDocumentExtended();
$doc->loadHTML(
    "<h3>Subtitle</h3>
    <p>This is a paragraph</p>
    <p>This is another paragraph</p>
    <h3>Another subtitle</h3>
    <p>Yet another paragraph</p>"
);

// Grab a nodelist of all h3 tags
$nodeList = $doc->getElementsByTagName("h3");

// Iterate over each of these h3 nodes
foreach ($nodeList as $index => $h3) {

    // Special handling for first h3
    if ($index === 0) {
        $h3->setAttribute("class", "current");
    }

    // Create a div node that we'll use as our wrapper
    $div = $doc->createElement("div");
    $div->setAttribute("class", "ac_pane");

    // Special handling for first div wrapper
    if ($index === 0) {
        $div->setAttribute("style", "display:block;");
    }

    // Move next siblings of h3 until we hit another h3
    while ($h3->nextSibling && $h3->nextSibling->localName !== "h3") {
        $div->appendChild($h3->nextSibling);
    }

    // Add the div node right after the h3
    $div->insertAfter($h3);
}

// UPDATE: wrap all child nodes of body in a div
$div = $doc->createElement("div");
$body = $doc->getElementsByTagName("body")->item(0);
$div->wrapAround($body->childNodes);
$body->appendChild($div);

echo $doc->saveHTML();
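
As mentioned, the insertAfter helper is a convenience rather than a requirement. Here is a rough sketch of the same grouping with a plain DOMDocument, relying on insertBefore() appending when its reference node is null. The input string is a compacted copy of the sample HTML, and $headings and $result are names made up for this illustration:

```php
<?php

$doc = new DOMDocument();
$doc->loadHTML(
    "<h3>Subtitle</h3><p>This is a paragraph</p>" .
    "<h3>Another subtitle</h3><p>Yet another paragraph</p>"
);

// Copy the live node list into a plain array before changing the tree
$headings = iterator_to_array($doc->getElementsByTagName("h3"));

foreach ($headings as $h3) {
    $div = $doc->createElement("div");
    $div->setAttribute("class", "ac_pane");

    // Move the following siblings into the div until the next h3
    while ($h3->nextSibling && $h3->nextSibling->nodeName !== "h3") {
        $div->appendChild($h3->nextSibling);
    }

    // After the loop, nextSibling is either the next h3 or null;
    // insertBefore() with a null reference node appends to the parent
    $h3->parentNode->insertBefore($div, $h3->nextSibling);
}

$result = $doc->saveHTML();
echo $result;
```

This leaves out the class="current" and display:block handling and the outer wrapper div, which would be added the same way as in the full version above.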

Note that loadHTML will add doctype, html and body nodes. They can be stripped out if needed.
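
One possible way to strip them, sketched under the assumption that only the markup inside body is wanted: serialize each child of body separately by passing the node to saveHTML() (the node argument needs PHP 5.3.6 or later). The function name bodyInnerHtml is hypothetical, not part of the DOM extension:

```php
<?php

// Hypothetical helper: returns the markup inside <body> only,
// without the doctype, html and body wrappers that loadHTML adds
function bodyInnerHtml(DOMDocument $doc)
{
    $body = $doc->getElementsByTagName("body")->item(0);
    if ($body === null) {
        return "";
    }

    $html = "";
    foreach ($body->childNodes as $child) {
        // saveHTML() with a node argument serializes just that subtree
        $html .= $doc->saveHTML($child);
    }
    return $html;
}

$doc = new DOMDocument();
$doc->loadHTML("<h3>Subtitle</h3><p>This is a paragraph</p>");

$html = bodyInnerHtml($doc);
echo $html;
```

The exact whitespace between the serialized elements depends on the underlying libxml version, so treat the output as markup rather than as a byte-exact string.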

Upvotes: 4
