kr1zmo

Reputation: 837

PHP: run htmlentities() on the contents of <code></code> only

I want to run htmlentities() only on the text between <code> and </code>, so that only that text is escaped.

I wrote a function that takes a string and finds the content between <code> and </code>:

function parse_code($string) {
    // test the string and find contents within <code></code>
    preg_match('@<code>(.*?)</code>@s', $string, $matches);

    // return the match if it succeeded
    if ($matches) {
        return $matches[1];
    } else {
        return false;
    }
}

However, I need help with a function that will actually run htmlentities() on the contents within <code> </code> and then implode() it all back together. For example, let's say we have the string below.

<div class="myclass" rel="stuff"> things in here </div>
<code> only run htmlentities() here so strip out things like < > " ' & </code>
<div> things in here </div>

Once again, the function needs to leave the rest of the string unchanged and run htmlentities() only on the contents of <code> </code>.

Upvotes: 2

Views: 225

Answers (1)

mario

Reputation: 145512

You can simplify this by using preg_replace_callback() with a custom callback function:

$html = preg_replace_callback(
     "@  (?<= <code>)  .*?  (?= </code>)  @six",
     "htmlentities_cb", $html
);

function htmlentities_cb($matches) {
    return htmlentities($matches[0], ENT_QUOTES, "UTF-8");
}

The syntax used to match the enclosing code tags is called lookbehind, (?<= ... ), and lookahead, (?= ... ), assertions. They simplify the callback and avoid the need for implode() later, because the text matched by the assertions themselves does not become part of $matches[0]. In the modifiers after the closing @ delimiter, s lets . match newlines, i makes the tag match case-insensitive, and x allows whitespace in the regex to make it more readable.
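
To see the effect on a string like the one in the question, here is a minimal sketch. The wrapper name encode_code_blocks() and the sample input are made up for illustration, and the pattern is the same one as above written without the x modifier and its extra whitespace. It assumes the <code> tags carry no attributes and are not nested.

```php
<?php
// Hypothetical wrapper (not part of the original answer): escapes only the
// text between <code> and </code>, leaving the rest of the markup untouched.
function encode_code_blocks(string $html): string
{
    $result = preg_replace_callback(
        // Lookbehind/lookahead keep the tags themselves out of $matches[0].
        '@(?<=<code>).*?(?=</code>)@si',
        function (array $matches): string {
            return htmlentities($matches[0], ENT_QUOTES, 'UTF-8');
        },
        $html
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

// Sample input, invented for this example.
$input = '<div class="myclass" rel="stuff"> things in here </div>' . "\n"
       . '<code> <b>bold</b> & "quoted" </code>' . "\n"
       . '<div> things in here </div>';

echo encode_code_blocks($input), "\n";
```

The two div lines should come back unchanged, while the middle line becomes <code> &lt;b&gt;bold&lt;/b&gt; &amp; &quot;quoted&quot; </code>. Because every match is replaced in place, multiple <code> blocks in one string are handled by the same call.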

Upvotes: 5
