Different results between preg_replace & preg_match_all

Question

I have a forum that supports hashtags. I'm using the following line to convert all hashtags into links. I'm using the (^|\(|\s|>) pattern to avoid picking up named anchors in URLs.

$str=preg_replace("/(^|\(|\s|>)(#(\w+))/","$1$2",$str);
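To illustrate what the prefix group does, here is a small sketch. The sample string and URL are made up for this example. A fragment inside a URL is skipped because its # is preceded by a letter rather than by the start of the string, an opening parenthesis, whitespace, or >.

```php
<?php
// Hypothetical sample input: one URL fragment and two real hashtags.
$sample = "See http://example.com/page#section (#first) and #second";

// Same pattern as above; group 3 holds the tag name without the '#'.
preg_match_all('/(^|\(|\s|>)(#(\w+))/', $sample, $m);

// Only "first" and "second" are captured; "#section" is ignored.
print_r($m[3]);
```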

I'm using this line to pick up hashtags so I can store them in a separate field when the user posts their message. It picks up all hashtags EXCEPT those at the start of a new line.

preg_match_all("/(^|\(|\s|>)(#(\w+))/",$Content,$Matches);

Using the m & s modifiers doesn't make any difference. What am I doing wrong in the second instance?

Edit: the input text could be plain text or HTML. Example of problem input:

#startoftextreplacesandmatches #afterwhitespacereplacesandmatches #insidehtmltagreplacesandmatches :)
#startofnewlinereplacesbutdoesnotmatch :(

DaveRandom · Accepted Answer

Your replace operation has a problem which you have evidently not yet come across - it will allow unescaped HTML special characters through. The reason I know this is because your regex allows hashtags to be prefixed with >, which is a special character.
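To make the problem concrete, here is a minimal sketch. The link markup (a /tag/ URL) and the sample input are assumptions for illustration only, since the exact replacement string in the question is not visible here.

```php
<?php
// Hypothetical user input containing markup alongside a hashtag.
$input = '<script>alert(1)</script> #php';

// Pattern from the question; the link format is an assumed placeholder.
$output = preg_replace('/(^|\(|\s|>)(#(\w+))/', '$1<a href="/tag/$3">$2</a>', $input);

// The <script> element passes through untouched and unescaped.
echo $output;
```

If the input comes from untrusted users, that markup ends up in the page exactly as it was typed.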

For that reason, I recommend you use this code to do the replacement, which will double up as the code for extracting the tags to be inserted into the database:

$hashtags = array();

$expr = '/(?:(?:(^|[(>\s])#(\w+))|(?P<notag>.+?))/';

$str = preg_replace_callback($expr, function($matches) use (&$hashtags) {
    // Compare against '' rather than using empty(), which would treat a lone "0" as empty
    if (isset($matches['notag']) && $matches['notag'] !== '') {
        // This takes care of HTML special characters outside hashtags
        return htmlspecialchars($matches['notag']);
    } else {
        // Handle hashtags: record the tag name, then return the escaped prefix and tag
        $hashtags[] = $matches[2];
        return htmlspecialchars($matches[1]).'#'.htmlspecialchars($matches[2]).'';
    }
}, $str);

After the above code has been run, $str will contain the modified string, properly escaped for direct output, and $hashtags will be populated with all the tags matched.
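As a rough usage sketch of this approach, the snippet below wraps a variant of the code in a helper and runs it on a made-up input. The function name extract_hashtags and the sample string are hypothetical, and the tags are returned without any link markup around them.

```php
<?php
// Hypothetical helper for this sketch; returns the escaped string and the tags found.
function extract_hashtags(string $str): array
{
    $hashtags = array();
    $expr = '/(?:(?:(^|[(>\s])#(\w+))|(?P<notag>.+?))/';

    $escaped = preg_replace_callback($expr, function ($matches) use (&$hashtags) {
        // Strict comparison so that a lone "0" is not mistaken for "no match".
        if (isset($matches['notag']) && $matches['notag'] !== '') {
            return htmlspecialchars($matches['notag']);
        }
        // Hashtag branch: group 1 is the prefix, group 2 the tag name.
        $hashtags[] = $matches[2];
        return htmlspecialchars($matches[1]) . '#' . htmlspecialchars($matches[2]);
    }, $str);

    return array($escaped, $hashtags);
}

// Made-up input mixing hashtags with HTML special characters.
list($escaped, $tags) = extract_hashtags("#one <b>bold</b> & (#two)");

print_r($tags);      // the collected tag names: one, two
echo $escaped, "\n"; // #one &lt;b&gt;bold&lt;/b&gt; &amp; (#two)
```

One caveat worth checking against your own data: without the u modifier the catch-all branch consumes one byte at a time, so UTF-8 input containing non-ASCII characters probably needs u added to the pattern.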

