Reputation: 334
I have this string:
b🤵♀️🤵♀️b
After removing the smilies and special chars:
$str = preg_replace('/[^ -\x{2122}]\s+|\s*[^ -\x{2122}]/u','',$str);
$str = trim($str);
...
strlen($str);
returns 8 instead of 2. Why, and how can I fix this?
Upvotes: 0
Views: 49
Reputation: 7703
The regular expression does not remove every special character. A debugging dump that escapes non-printable characters shows what is still present after the preg_replace:
"b\u{200d}\u{200d}b"
or as 8 bytes
"b\xe2\x80\x8d\xe2\x80\x8db"
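The debugger used above is not named, so here is a rough sketch of how a similar dump could be produced. The helper name `escapeNonAscii` is made up for illustration, the input string is rebuilt from explicit code points on the assumption that each emoji is 🤵 + U+200D + ♀ + U+FE0F, and `mb_ord` assumes the mbstring extension is available.

```php
<?php
// Rebuild the question's string from explicit code points so the invisible
// U+200D does not get lost in copy/paste (assumed composition of each emoji).
$emoji = "\u{1F935}\u{200D}\u{2640}\u{FE0F}"; // man in tuxedo + ZWJ + female sign + VS16
$str = 'b' . $emoji . $emoji . 'b';

// The two statements from the question.
$str = preg_replace('/[^ -\x{2122}]\s+|\s*[^ -\x{2122}]/u', '', $str);
$str = trim($str);

// Hypothetical helper: show every character outside printable ASCII as \u{...}.
function escapeNonAscii(string $s): string
{
    return preg_replace_callback('/[^\x20-\x7e]/u', function (array $m): string {
        return sprintf('\u{%x}', mb_ord($m[0], 'UTF-8'));
    }, $s);
}

echo escapeNonAscii($str), PHP_EOL; // b\u{200d}\u{200d}b
echo bin2hex($str), PHP_EOL;        // 62e2808de2808d62
echo strlen($str), PHP_EOL;         // 8
```

strlen counts bytes, and each U+200D takes three bytes in UTF-8, which gives 1 + 3 + 3 + 1 = 8.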
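The \u{200d} characters (ZERO WIDTH JOINER) are part of the original string: they sit inside each emoji sequence and join 🤵 to ♀️. Because U+200D lies within the allowed range from space to \x{2122}, the negated character class never matches it, so it survives the replacement. Removing it for the specific example here is not difficult: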
$str = preg_replace('/[^ -\x{2122}]\s+|\s*[^ -\x{2122}]|\x{200d}/u','',$str);
However, this is not a general solution: any other invisible or special character that falls inside the allowed range would still survive.
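One possible way to widen the fix, offered only as a sketch and not as the answer's method, is to also strip the whole Unicode "format" category (\p{Cf}) instead of listing \x{200d} alone. That category covers the zero width joiner as well as similar invisible characters such as U+200C, U+200B and U+00AD. The function name `stripSpecialChars` is hypothetical, and whether dropping every format character is acceptable depends on the text being processed.

```php
<?php
// Hypothetical helper, not part of the original answer.
function stripSpecialChars(string $s): string
{
    // Original pattern plus \p{Cf}, which matches Unicode format characters
    // (U+200D zero width joiner, U+200C, U+200B, U+00AD, ...).
    // preg_replace returns null on invalid UTF-8; fall back to an empty string.
    $s = preg_replace('/[^ -\x{2122}]\s+|\s*[^ -\x{2122}]|\p{Cf}/u', '', $s) ?? '';
    return trim($s);
}

// Same assumed emoji composition as in the question: 🤵 + ZWJ + ♀ + VS16.
$emoji = "\u{1F935}\u{200D}\u{2640}\u{FE0F}";
$clean = stripSpecialChars('b' . $emoji . $emoji . 'b');

echo $clean, PHP_EOL;         // bb
echo strlen($clean), PHP_EOL; // 2
```

Characters inside the allowed range that are neither whitespace nor in \p{Cf} are still kept, so this narrows the gap rather than closing it.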
Upvotes: 2