kwichz

Reputation: 2453

Whitespace goes missing after a regex replacement

I have this regex in PHP:

$str="first word https://www.helloz.it last word";
$str=preg_replace(
    '#[^"](((http|https|ftp)://)[^\s\n]+)#',
    '<a class="lforum" href="$1">$1</a>',
    $str);
echo nl2br($str);

The output I expect is:

first word <a class="lforum" href="https://www.helloz.it">https://www.helloz.it</a> last word

But the actual output is:

first word<a class="lforum" href="https://www.helloz.it">https://www.helloz.it</a> last word

(Notice the missing whitespace between "first word" and <a class...)

Where did that whitespace go? :) Thanks

Upvotes: 0

Views: 222

Answers (2)

lonesomeday

Reputation: 237975

[^"] says "match exactly one character that isn't a double quote". The space before the URL satisfies this, so it becomes part of the match and is replaced along with the URL.
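To see this concretely, here is a small sketch that runs the pattern from the question through preg_match and inspects what was captured. The variable name $m is only illustrative.

```php
<?php
// Inspect what the original pattern actually matches.
$str = "first word https://www.helloz.it last word";
preg_match('#[^"](((http|https|ftp)://)[^\s\n]+)#', $str, $m);

var_dump($m[0]); // full match: the leading space plus the URL
var_dump($m[1]); // group 1: the URL only
```

preg_replace substitutes the whole of $m[0], while the replacement string only puts $1 back, so the space is dropped.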

Use a negative lookbehind instead:

'#(?<!")(((http|https|ftp)://)[^\s\n]+)#',

This says "match the string only if it isn't immediately preceded by a quotation mark". A lookbehind is zero-width, so the preceding character is checked but never included in the matched content.

See regular-expressions.info for information about lookbehinds.
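Put together with the code from the question, the lookbehind version might look like the sketch below. The second input, $quoted, is a made-up example added here to show that a URL sitting right after a quotation mark is left alone.

```php
<?php
$pattern = '#(?<!")(((http|https|ftp)://)[^\s\n]+)#';
$replacement = '<a class="lforum" href="$1">$1</a>';

// The space before the URL is no longer part of the match, so it survives.
$str = "first word https://www.helloz.it last word";
$out = preg_replace($pattern, $replacement, $str);
echo $out, "\n";

// Hypothetical input: a URL directly after a double quote is not touched.
$quoted = 'see "https://www.helloz.it" here';
$outQuoted = preg_replace($pattern, $replacement, $quoted);
echo $outQuoted, "\n";
```

A side effect of the lookbehind is that a URL at the very start of the string also matches, which the original [^"] could not do because it requires one character in front of the URL.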

Upvotes: 1

Johan Soderberg

Reputation: 2740

[^"] matches the whitespace, and because you replace the entire match, the whitespace is removed. Wrap [^"] in its own capturing group with () and put it back at the start of the replacement string.
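A minimal sketch of that approach, reusing the code from the question. Note that adding the extra group shifts the numbering: the preceding character becomes $1 and the URL becomes $2.

```php
<?php
$str = "first word https://www.helloz.it last word";
$out = preg_replace(
    '#([^"])(((http|https|ftp)://)[^\s\n]+)#',   // group 1: preceding char, group 2: URL
    '$1<a class="lforum" href="$2">$2</a>',      // re-emit the preceding char first
    $str
);
echo $out, "\n";
```

This version still needs one character in front of the URL, so a URL at the very start of the string will not be linked.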

Upvotes: 1
