Coddo

Reputation: 334

PHP - strlen() doesn't work correctly after removing emojis from string

I have this string:

b🤵‍♀️🤵‍♀️b

After removing the smilies and special chars:

$str = preg_replace('/[^ -\x{2122}]\s+|\s*[^ -\x{2122}]/u','',$str);
$str = trim($str);

...

strlen($str);

gives me 8 instead of 2. Why, and how can I fix this?

Upvotes: 0

Views: 49

Answers (1)

jspit

Reputation: 7703

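The regular expression does not remove all special characters. The character class [^ -\x{2122}] only matches code points outside the range from space to U+2122, so anything inside that range is kept. A debugging dump shows which characters are still present after the preg_replace: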

"b\u{200d}\u{200d}b"

or as 8 bytes (strlen() counts bytes, not characters, and each \u{200d} takes 3 bytes in UTF-8):

"b\xe2\x80\x8d\xe2\x80\x8db"
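The leftover bytes can be checked without a special debugger. This is a minimal sketch that assumes the emoji in the question is the sequence U+1F935 U+200D U+2640 U+FE0F, written here with escapes so the code points are visible:

```php
<?php
// Assumption: each emoji in the question is the ZWJ sequence
// U+1F935, U+200D (zero width joiner), U+2640, U+FE0F.
$emoji = "\u{1F935}\u{200D}\u{2640}\u{FE0F}";
$str = 'b' . $emoji . $emoji . 'b';

// The regular expression from the question, unchanged.
$str = preg_replace('/[^ -\x{2122}]\s+|\s*[^ -\x{2122}]/u', '', $str);
$str = trim($str);

// strlen() counts bytes; bin2hex() shows which bytes are left.
echo strlen($str), "\n";   // 8
echo bin2hex($str), "\n";  // 62e2808de2808d62
```

The hex dump reads as 62 (b), e2 80 8d twice (U+200D in UTF-8), and 62 (b) again.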

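The \u{200d} characters (zero width joiner) come from the original string: each emoji is a sequence in which a \u{200d} joins the base emoji to the following sign. Because U+200D lies inside the kept range (below U+2122), the character class does not match it. Removing these characters for the specific example here is not difficult.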

$str = preg_replace('/[^ -\x{2122}]\s+|\s*[^ -\x{2122}]|\x{200d}/u','',$str);

However, this is not a general solution: any other invisible or special character inside the kept range would still survive.
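One possible way to widen the fix, offered only as a sketch: in addition to the original pattern, remove every character in the Unicode category Cf (format characters, which includes the zero width joiner and the direction marks). The function name strip_emoji_and_format_chars is hypothetical, and whether removing all of Cf is acceptable depends on the input you expect.

```php
<?php
// Hypothetical helper, not part of the original answer.
// Assumption: dropping all Unicode format characters (\p{Cf}) is acceptable.
function strip_emoji_and_format_chars(string $str): string
{
    // Original pattern plus \p{Cf} in place of the single \x{200d}.
    $clean = preg_replace('/[^ -\x{2122}]\s+|\s*[^ -\x{2122}]|\p{Cf}/u', '', $str);

    // preg_replace() returns null on error, for example on invalid UTF-8.
    if ($clean === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    return trim($clean);
}

$emoji = "\u{1F935}\u{200D}\u{2640}\u{FE0F}";
echo strlen(strip_emoji_and_format_chars('b' . $emoji . $emoji . 'b')), "\n"; // 2
```

This still keeps everything else between space and U+2122, so visible symbols in that range are left alone. If only plain text is wanted, an allow-list of the characters you do accept may be easier to reason about.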

Upvotes: 2
