Lunar

Reputation: 57

How to parse HTML with PHP and display all tags inside <code></code> as plain text, except the <br> tag

I have a problem I can't solve on my own, so I'm asking for your help. It concerns a blog posting form. When users publish an article, the post is converted with htmlentities and stored in the DB.

htmlentities(ucfirst($var), ENT_QUOTES, 'utf-8');

When the text is displayed, it is passed through html_entity_decode.

$var = html_entity_decode($var, ENT_QUOTES, 'UTF-8');
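
For illustration, the round trip looks roughly like this. The sample string is made up; the two calls are the ones shown above. It shows why the tags come back "live" once the stored text is decoded:

```php
<?php
// Hypothetical sample input; any post body would behave the same way.
$var = 'a <b>bold</b> "move"';

// What ends up in the DB: every special character becomes an entity.
$stored = htmlentities(ucfirst($var), ENT_QUOTES, 'UTF-8');
echo $stored, PHP_EOL; // A &lt;b&gt;bold&lt;/b&gt; &quot;move&quot;

// What gets displayed: decoding restores the markup, so the browser renders it.
$shown = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');
echo $shown, PHP_EOL; // A <b>bold</b> "move"
```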

Now I want to display any tags inside the HTML <code> tag as plain text, regardless of the programming language used (PHP, HTML, Java, JavaScript, ...).

For example I have the following code:

<?php

$html = <<<EOD
    <h1>Escape HTML or Other Programming tags</h1>
    <p>This must be rendered without any problem</p>
    <p>
    <code style="display:block;background:rgb(230,230,230);padding:2%">
        <h4>Show this title in HTML</h4>
        <p>This paragraph must be in <strong>HTML</strong> and this <a href="">Link</a> too !</p>
        <?php 
            echo "Display this PHP code!";
        ?>
        <script>
            alert("Don't pop-up please!");
        </script>
    </code>
    <a href="">Link rendered</a>
    </p> 
EOD;

This image shows the output I want (image: show tags within HTML code tag as plain text).

As you can see, all tags are displayed as plain text except the <br> tag.

My thoughts:
I guess I have to parse this HTML to find the code tags and then convert whatever is inside them to plain text, but I'm not sure how to do this correctly.

$dom = new DOMDocument; 
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
foreach ($xpath->query("//code") as $node) {
    $node = htmlspecialchars($node->nodeValue);
}
$texte = $dom->saveHTML();
echo $texte;

Can you please help me achieve this?

Upvotes: 0

Views: 3374

Answers (2)

SirPilan

Reputation: 4837

You can convert every node within code to plain text like this (I haven't found a way to preserve the formatting, though):

$dom = new DOMDocument; 
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

foreach ($xpath->query("//code") as $codeNode) {
    $codeContent = '';

    while ($codeNode->hasChildNodes()) {
        $codeChild = $codeNode->firstChild;
        $codeContent .= $dom->saveHTML($codeChild);
        $codeNode->removeChild($codeChild);
    }

    $codeNode->textContent = $codeContent;
}

$texte = $dom->saveHTML();

Working example.
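
The loop above also escapes <br>, which the question wanted to keep. Here is a minimal sketch of one way that might be handled, under these assumptions: the sample $html is made up, only <br> elements that are direct children of <code> are kept, and existing text nodes are left alone so they are not escaped twice. A <br> nested deeper (for example inside a <p>) would still end up as text.

```php
<?php
// Hypothetical sample input with a <br> directly inside <code>.
$html = '<p><code><b>bold</b><br><i>it</i></code></p>';

$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

foreach ($xpath->query('//code') as $codeNode) {
    // Copy the child list first; childNodes is live and changes while we replace.
    foreach (iterator_to_array($codeNode->childNodes) as $child) {
        // Keep plain text and direct <br> children exactly as they are.
        if ($child->nodeType === XML_TEXT_NODE || $child->nodeName === 'br') {
            continue;
        }
        // Swap the node for a text node holding its markup; saveHTML() escapes it.
        $asText = $dom->createTextNode($dom->saveHTML($child));
        $codeNode->replaceChild($asText, $child);
    }
}

// Serialize only the <code> element to inspect the result.
$result = $dom->saveHTML($xpath->query('//code')->item(0));
echo $result, PHP_EOL;
```

The inner check is the only real difference from the loop above: instead of collecting everything into one string, each child is replaced individually, so selected nodes can be skipped.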


Another possible solution is to replace the code element with a pre element, which keeps the formatting:

foreach ($xpath->query("//code") as $codeNode) {
    $codeContent = '';

    // Serialize every child of <code> back to its HTML source.
    foreach ($codeNode->childNodes as $codeChild) {
        $codeContent .= $dom->saveHTML($codeChild);
    }

    // Create the <pre> element that will replace the <code> element.
    $preformattedCodeNode = $dom->createElement('pre');
    $preformattedCodeNode->textContent = $codeContent;
    $preformattedCodeNode->setAttribute('style', $codeNode->getAttribute('style'));

    $codeNode->parentNode->replaceChild($preformattedCodeNode, $codeNode);
}

Working example.

Upvotes: 1

uingtea

Reputation: 6524

Do not use html_entity_decode() on content you want to display inside <code>. You also need to replace newlines (\n) with <br> and spaces with &nbsp;.

<?php

$code = <<<EOD
&lt;h4&gt;Show this title in HTML&lt;/h4&gt;
&lt;p&gt;This paragraph must be in &lt;strong&gt;HTML&lt;/strong&gt; and this &lt;a href=&quot;&quot;&gt;Link&lt;/a&gt; too !&lt;/p&gt;
&lt;?php 
    echo &quot;Display this PHP code!&quot;;
?&gt;
&lt;script&gt;
    alert(&quot;Don't pop-up please!&quot;);
&lt;/script&gt;
EOD;

$code = str_replace("\n","<br>", $code);
$code = str_replace(" ","&nbsp;", $code);

$html = <<<EOD
    <h1>Escape HTML or Other Programming tags</h1>
    <p>This must be rendered without any problem</p>
    <p>
    <code style="display:block;background:rgb(230,230,230);padding:2%">
    $code    
    </code>
    <a href="">Link rendered</a>
    </p> 
EOD;

echo $html;
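
The snippet above hard-codes the escaped block. As a rough sketch, the same idea could be applied to a whole stored post by decoding everything except what sits between the code tags. The function name renderStoredPost and the sample input are hypothetical, and the regex assumes code blocks are not nested and that the post was stored with htmlentities as in the question.

```php
<?php
// Hypothetical helper: decode a stored post everywhere except inside <code>.
function renderStoredPost(string $stored): string
{
    // The stored text is entity-encoded, so the tags appear as &lt;code ...&gt;.
    $pattern = '~(&lt;code\b.*?&gt;)(.*?)(&lt;/code&gt;)~is';

    // Result layout repeats in groups of four:
    // [outside, opening tag, code contents, closing tag, outside, ...]
    $parts = preg_split($pattern, $stored, -1, PREG_SPLIT_DELIM_CAPTURE);

    $out = '';
    foreach ($parts as $i => $part) {
        if ($i % 4 === 2) {
            // Code contents: stay escaped, but turn escaped <br> back into real ones.
            $out .= preg_replace('~&lt;br\s*/?&gt;~i', '<br>', $part);
        } else {
            // Everything else, including the code tags themselves: decode to live HTML.
            $out .= html_entity_decode($part, ENT_QUOTES, 'UTF-8');
        }
    }
    return $out;
}

// Hypothetical stored post, encoded the same way as in the question.
$stored = htmlentities('<p>Hi</p><code><b>x</b><br></code>', ENT_QUOTES, 'UTF-8');
$rendered = renderStoredPost($stored);
echo $rendered, PHP_EOL; // <p>Hi</p><code>&lt;b&gt;x&lt;/b&gt;<br></code>
```

The newline and space replacements from the snippet above could be added in the same branch that handles the code contents.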

Upvotes: 0
