Jens Törnell

Reputation: 24758

xPath insert before and after - With DOM and PHP

I need to wrap part of an HTML structure in a div with a class.

The wrapper div has the class "container". It should open right after the <h4> inside each <div><ul><li> (only the li elements that are direct children of that ul, not the nested ones) and close right before the closing tag of the same li.

My whole code looks like this:

<?php
$content = '
    <div class="sidebar-1">
        <ul>
            <li>
                <h4>Title</h4>
                <ul> 
                    <li><a href="http://www.test.com">Test</a></li> 
                    <li><a href="http://www.test.com">Test</a></li> 
                </ul> 
            </li> 
            <li>
                <p>Paragraf</p>
            </li> 
            <li>
                <h4>New title</h4>
                <ul> 
                    <li>Some text</li>
                    <li>Some text åäö</li>
                </ul> 
            </li> 
        </ul>
    </div>
';

$doc = new DOMDocument();
$doc->loadHTML($content);
$x = new DOMXPath($doc);

$start_text = '<div class="container">';
$end_text = '</div>';

foreach($x->query('//div/ul/li') as $anchor)
{
    $anchor->insertBefore(new DOMText($start_text),$anchor->firstChild);
}
echo $doc->saveXML($doc->getElementsByTagName('ul')->item(0));
?>

It works insofar as I can add the opening tag of the wrapper, but not the closing tag. I also get strange character encoding in the output. I want the output to use the same encoding as the input.

The result should be

    <div class="sidebar-1">
        <ul>
            <li>
                <h4>Title</h4>
                <div class="container">
                    <ul>
                        <li><a href="http://www.test.com">Test</a></li>
                        <li><a href="http://www.test.com">Test</a></li>
                    </ul>
                </div>
            </li>
            <li>
                <div class="container">
                    <p>Paragraf</p>
                </div>
            </li>
            <li>
                <h4>New title</h4>
                <div class="container">
                    <ul>
                        <li>Some text</li>
                        <li>Some text åäö</li>
                    </ul>
                </div>
            </li>
        </ul>
    </div>

Upvotes: 1

Views: 4837

Answers (1)

Wiseguy

Reputation: 20873

I couldn't find a more elegant way to reassign all children, so I guess this will do. I think it gets what you're after, though.

(NOTE: Code updated to reflect additional requirements in the comments.)

$doc = new DOMDocument();
$doc->loadHTML($content);
$x = new DOMXPath($doc);

foreach ($x->query('//div/ul/li') as $anchor)
{
    // Wrapper element that will receive the <li>'s children.
    $container = $doc->createElement('div');
    $container->setAttribute('class', 'container');

    // Walk the children, saving the next sibling before the current node is moved.
    $next = $anchor->firstChild;
    while ($next !== NULL) {
        $curr = $next;
        $next = $curr->nextSibling;

        // Move everything except an <h4> whose class list contains "title".
        if (($curr->nodeName != 'h4')
            || ($curr->attributes === NULL)
            || ($curr->attributes->getNamedItem('class') === NULL)
            || !preg_match('#(^| )title( |$)#', $curr->attributes->getNamedItem('class')->nodeValue)
        ) {
            $container->appendChild($anchor->removeChild($curr));
        }
    }

    // The wrapper goes last, after any <h4 class="title"> that was left in place.
    $anchor->appendChild($container);
}
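
The loop above leaves only an h4 carrying a "title" class outside the wrapper. For the markup exactly as posted in the question, where the h4 has no class attribute, a variant that skips any h4 by node name may be closer to the expected result. This is a minimal sketch, not part of the original answer: the function name wrapListItemContent and the shortened sample markup are made up for illustration.

```php
<?php
// Hypothetical helper: wraps every child of each matched <li>,
// except a direct <h4>, in a new <div class="container">.
function wrapListItemContent(DOMDocument $doc, string $query = '//div/ul/li'): void
{
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query($query) as $li) {
        $container = $doc->createElement('div');
        $container->setAttribute('class', 'container');

        // Copy the child list first: moving nodes changes the live childNodes list.
        foreach (iterator_to_array($li->childNodes) as $child) {
            if ($child->nodeName !== 'h4') {
                $container->appendChild($child); // appendChild moves the node
            }
        }

        // The wrapper ends up after the <h4>, if there is one.
        $li->appendChild($container);
    }
}

// Shortened version of the question's markup (an assumption for this sketch).
$doc = new DOMDocument();
$doc->loadHTML(
    '<div class="sidebar-1"><ul>'
    . '<li><h4>Title</h4><ul><li>Test</li></ul></li>'
    . '<li><p>Paragraf</p></li>'
    . '</ul></div>'
);

wrapListItemContent($doc);
echo $doc->saveHTML($doc->getElementsByTagName('div')->item(0));
```

Passing the node to saveHTML() serializes just the sidebar div instead of the html and body elements that loadHTML() adds around the fragment.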

As for character encoding, I've been messing with it for a while and it's a tricky issue. The characters display correctly when you load with loadXML() but not with loadHTML(). There's a workaround in the comments, but it ain't pretty. Hopefully some of the user comments will help you find a usable solution.
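
One commonly used workaround, which may or may not be the one those comments describe, is to tell the parser the input encoding up front. When the markup declares no charset, loadHTML() typically does not treat it as UTF-8, so prefixing the string with an XML encoding declaration is a common hint. A minimal sketch, assuming the input string is UTF-8 (the shortened markup is made up for illustration):

```php
<?php
// Assumption: this source file and $content are UTF-8 encoded.
$content = '<div class="sidebar-1"><ul><li>Some text åäö</li></ul></div>';

$doc = new DOMDocument();
// The prefix is only a hint to the parser; it is not part of the sidebar markup.
$doc->loadHTML('<?xml encoding="UTF-8">' . $content);

$li = $doc->getElementsByTagName('li')->item(0);

// DOM string properties are returned as UTF-8.
echo $li->textContent, "\n";

// Serialize only the node of interest so the injected declaration is left out.
echo $doc->saveHTML($doc->getElementsByTagName('div')->item(0)), "\n";
```

Whether saveHTML() writes the non-ASCII characters as raw bytes or as character entities can depend on the document's encoding settings, so check the serialized output in your own environment.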

Upvotes: 3
