Thomas L.G

Reputation: 81

preg_replace regex to remove stray end tag

I have a string containing various HTML tags, including some <img> elements. I am trying to wrap those <img> elements in a <figure> tag. This works so far with a preg_replace like this:

preg_replace( '/(<img.*?>)/s','<figure>$1</figure>',$content); 

However, if the <img> tag has a neighboring <figcaption> tag, the result is rather ugly and produces a stray end tag for the <figure> element:

<figure id="attachment_9615">
<img class="size-full" src="http://www.example.com/pic.png" alt="name" width="1699" height="354" />
<figcaption class="caption-text"></figure>Caption title here</figcaption>
</figure> 

I've tried a whole bunch of preg_replace regex variations to wrap both the <img> tag and the <figcaption> tag inside a <figure>, but can't seem to make it work.

My latest try:

preg_replace( '/(<img.*?>)(<figcaption .*>*.<\/figcaption>)?/s',
'<figure">$1$2</figure>',
$content); 

Upvotes: 2

Views: 719

Answers (1)

Jan

Reputation: 43169

As others pointed out, it is better to use a parser such as DOMDocument instead. The following code wraps a <figure> tag around each <img> whose next sibling is a <figcaption>:

<?php

$html = <<<EOF
<html>
    <img class="size-full" src="http://www.example.com/pic.png" alt="name" width="1699" height="354" />
    <figcaption class="caption-text">Caption title here</figcaption>

    <img class="size-full" src="http://www.example.com/pic.png" alt="name" width="1699" height="354" />

    <img class="size-full" src="http://www.example.com/pic.png" alt="name" width="1699" height="354" />
    <figcaption class="caption-text">Caption title here</figcaption>
</html>
EOF;

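$dom = new DOMDocument();
# libxml may warn about HTML5 tags such as <figcaption>; collect those warnings instead of printing them
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();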
$xpath = new DOMXPath($dom);

# get all images
$imgs = $xpath->query("//img");

foreach ($imgs as $img) {
    # next element sibling, if it is a figcaption; nextSibling alone can be null or a text node
    $caption = $xpath->query("following-sibling::*[1][self::figcaption]", $img)->item(0);
    if ($caption !== null) {

        # create a new figure tag and append the cloned elements
        $figure = $dom->createElement('figure');
        $figure->appendChild($img->cloneNode(true));
        $figure->appendChild($caption->cloneNode(true));

        # insert the newly generated elements right before $img
        $img->parentNode->insertBefore($figure, $img);

        # and remove both the figcaption and the image from the DOM
        $caption->parentNode->removeChild($caption);
        $img->parentNode->removeChild($img);

    }
}
$dom->formatOutput=true;
echo $dom->saveHTML();

See a demo on ideone.com.
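
The cloning and removing in the loop above is not strictly required, because appendChild() moves a node that is already part of the document. A minimal sketch of that variant follows. The function name wrapCaptionedImages and the sample markup are made up for this example, and the caption is looked up with a relative XPath query so that a missing sibling or a text node between the two tags does not trip up the check:

```php
<?php

// Hypothetical helper (not part of the code above): wraps each <img> that is
// directly followed by a <figcaption> element in a new <figure>.
// Returns the number of images that were wrapped.
function wrapCaptionedImages(DOMDocument $dom): int
{
    $xpath = new DOMXPath($dom);
    $wrapped = 0;

    // The XPath result is a snapshot, so the tree can be modified inside the loop.
    foreach ($xpath->query('//img') as $img) {
        // First following element sibling, but only if it is a <figcaption>.
        // Text nodes between the two tags are skipped by the element test (*).
        $caption = $xpath->query('following-sibling::*[1][self::figcaption]', $img)->item(0);
        if ($caption === null) {
            continue;
        }

        $figure = $dom->createElement('figure');
        $img->parentNode->insertBefore($figure, $img);

        // appendChild() moves existing nodes, so no cloning or removal is needed.
        $figure->appendChild($img);
        $figure->appendChild($caption);
        $wrapped++;
    }

    return $wrapped;
}

$dom = new DOMDocument();
libxml_use_internal_errors(true); // keep libxml warnings about HTML5 tags out of the output
$dom->loadHTML('<html><body><img src="a.png"><figcaption>A</figcaption><img src="b.png"></body></html>');
libxml_clear_errors();

$count = wrapCaptionedImages($dom);
echo $dom->saveHTML();
```

Here only the first image should end up inside a <figure>, since the second one has no caption after it.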

To have a <figure> tag around all your images, you might want to add an else branch:

} else {
    $figure = $dom->createElement('figure');
    $figure->appendChild($img->cloneNode(true));
    $img->parentNode->insertBefore($figure, $img);

    $img->parentNode->removeChild($img);
}
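
Putting both branches together, and leaving alone any image that already sits inside a <figure> (as in the markup from the question), could look roughly like this. The function name wrapImagesInFigures and the sample markup are assumptions for illustration. The sketch also assumes that $html is a fragment without its own <html> or <body> tags, and it returns only the contents of <body> so that the doctype and wrapper added by loadHTML() do not end up in the result:

```php
<?php

// Hypothetical function (name and return format are assumptions for this sketch).
function wrapImagesInFigures(string $html): string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // libxml may complain about HTML5 tags
    $dom->loadHTML('<html><body>' . $html . '</body></html>');
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);

    // Only images that are not already somewhere inside a <figure>.
    foreach ($xpath->query('//img[not(ancestor::figure)]') as $img) {
        // The next element sibling, if it is a <figcaption>; null otherwise.
        $caption = $xpath->query('following-sibling::*[1][self::figcaption]', $img)->item(0);

        $figure = $dom->createElement('figure');
        $img->parentNode->insertBefore($figure, $img);
        $figure->appendChild($img);          // moves the image into the figure
        if ($caption !== null) {
            $figure->appendChild($caption);  // moves the caption, if there is one
        }
    }

    // Serialize the children of <body> one by one to get an HTML fragment back.
    $body = $dom->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$content = '<p>Intro</p>'
    . '<figure id="attachment_9615"><img src="pic.png" alt="name">'
    . '<figcaption class="caption-text">Caption title here</figcaption></figure>'
    . '<img src="other.png" alt="other">';

echo wrapImagesInFigures($content);
```

The existing figure is skipped by the ancestor::figure test, so only the bare image at the end gets a new wrapper.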

Upvotes: 2
