Escape HTML Chars In the Pre Tag

Question

I've installed a syntax highlighter, but in order for it to work, the tags must be written as &lt; and &gt;. What I need to do is replace all <'s with &lt; and all >'s with &gt;, but only inside the PRE tag.

So, in short, I want to escape all HTML characters inside of the pre tag.

Thanks in advance.

Konrad Rudolph · Accepted Answer

tl;dr

You need to parse the input HTML. Use the DOMDocument class to represent your document, parse the input, find all <pre> tags (using getElementsByTagName) and escape their content.

Code

Unfortunately, the DOM model is very low-level and forces you to iterate over the child nodes of the <pre> tag yourself in order to escape them. This looks as follows:

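function escapeRecursively($node) {
    // Text nodes contribute their raw text; the caller escapes it later.
    if ($node instanceof DOMText)
        return $node->textContent;

    // Rebuild the element as literal markup (attributes are not preserved).
    $children = $node->childNodes;
    $content = "<$node->nodeName>";
    for ($i = 0; $i < $children->length; $i += 1) {
        $child = $children->item($i);
        $content .= escapeRecursively($child);
    }

    return "$content</$node->nodeName>";
}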


Now this function can be used to escape every <pre> node in the document:

function escapePreformattedCode($html) {
    $doc = new DOMDocument();
    $doc->loadHTML($html);

    $pres = $doc->getElementsByTagName('pre');
    for ($i = 0; $i < $pres->length; $i += 1) {
        $node = $pres->item($i);

        $children = $node->childNodes;
        $content = '';
        for ($j = 0; $j < $children->length; $j += 1) {
            $child = $children->item($j);
            $content .= escapeRecursively($child);
        }
        $node->nodeValue = htmlspecialchars($content);
    }

    return $doc->saveHTML();
}


Test

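$string = '<h1>Test</h1> <pre>Some <em>interesting</em> text</pre>';
echo escapePreformattedCode($string);


Yields:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body>
<h1>Test</h1> <pre>Some &lt;em&gt;interesting&lt;/em&gt; text</pre>
</body></html>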


Note that a DOM always represents a complete document. Hence when the DOM parser gets a document fragment it fills in the missing information. This makes the output potentially different from the input.
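
If only the fragment is wanted back, one possible workaround is to serialize just the children of the generated body element instead of the whole document. The following is a minimal sketch, not part of the original answer: the helper name saveBodyFragment is hypothetical, and it assumes a PHP version in which DOMDocument::saveHTML accepts a node argument.

```php
<?php
// Hypothetical helper (an assumption, not from the answer above): serialize
// only the children of <body>, dropping the doctype and the <html>/<body>
// wrappers that DOMDocument adds when it is given a fragment.
function saveBodyFragment(DOMDocument $doc): string {
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    $html = '';
    foreach ($body->childNodes as $child) {
        // saveHTML($node) serializes a single node including its subtree.
        $html .= $doc->saveHTML($child);
    }
    return $html;
}

$doc = new DOMDocument();
$doc->loadHTML('<h1>Test</h1> <pre>Some text</pre>');
$fragment = saveBodyFragment($doc);
echo $fragment;
```

Inside escapePreformattedCode, the final return $doc->saveHTML(); could be swapped for a call to such a helper. The parser may still normalize the markup in other ways, so the result is not guaranteed to match the input byte for byte.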
