AlexioVay

Reputation: 4527

Count character length of emoji?

I want to validate the name when a new user signs up on my page. One of the checks is that the name is no longer than 100 characters.

But a single emoji like 👩‍❤️‍💋‍👩 (which is actually 4 emoji joined together? see screenshot) counts as much more than 1 character, so I have trouble validating the name. I want to allow emoji in names, because these days it's quite common to have a heart, star or something similar there, but I don't want to allow names longer than 100 characters.

So my question is: how can I count an emoji like this as a single character?

PS: I'm looking for a PHP solution, but I would also accept JavaScript, even though it's not my preference.

Edit: My example emoji seems to be this string: \ud83d\udc69\u200d\u2764\ufe0f\u200d\ud83d\udc8b\u200d\ud83d\udc69

(Screenshot of this question, showing how the emoji is rendered.)

Upvotes: 19

Views: 12130

Answers (2)

Evan Rusackas

Reputation: 614

As a potential JavaScript solution (if you don't mind adding a library), Lodash has tackled this problem in its toArray module.

For example,

_.toArray('12👪').length; // --> 3

Or, if you want to knock a few arbitrary characters off a string, you can manipulate and rejoin the array, like:

_.toArray("👪trimToEightGlyphs").splice(0,8).join(''); // --> '👪trimToE'
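If adding a library is not an option, a similar result can probably be had with the built-in Intl.Segmenter, which is available in current browsers and Node.js 16+. This is only a minimal sketch: toGraphemes, countGraphemes and truncateGraphemes are hypothetical helper names, not part of Lodash or any standard API.

```javascript
// Sketch only: the helper names below are made up for illustration.
// Intl.Segmenter splits a string into user-perceived characters (grapheme clusters).
const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });

// Return the grapheme clusters of a string as an array of strings.
function toGraphemes(str) {
  return Array.from(segmenter.segment(str), (part) => part.segment);
}

// Count user-perceived characters.
function countGraphemes(str) {
  return toGraphemes(str).length;
}

// Keep at most `max` user-perceived characters, never cutting an emoji in half.
function truncateGraphemes(str, max) {
  return toGraphemes(str).slice(0, max).join('');
}

console.log(countGraphemes('12👪'));                      // 3
console.log(truncateGraphemes('👪trimToEightGlyphs', 8)); // '👪trimToE'
```

The exact cluster boundaries depend on the Unicode data shipped with the runtime, so very new emoji sequences may be split differently on older engines.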

Upvotes: 16

Federkun

Reputation: 36989

Unicode assigns abstract characters to code points, but it is the font that renders them on screen. A font is a collection of graphical shapes, called glyphs, each of which is the visual representation of a code point or a sequence of code points. A sequence of one or more code points that is displayed as a single graphical unit is called a grapheme.

If you need the length in grapheme units (and NOT in code points, which is what mb_strlen counts), you can use grapheme_strlen from the intl extension:

$emoji = "\u{1F469}\u{200D}\u{2764}\u{FE0F}\u{200D}\u{1F48B}\u{200D}\u{1F469}";
echo $emoji , " : " , strlen($emoji) , "\n"; // 27, counts bytes
echo $emoji , " : " , mb_strlen($emoji) , "\n"; // 8, counts code points
echo $emoji , " : " , grapheme_strlen($emoji) , "\n"; // 1, counts grapheme units

https://3v4l.org/KSSl4
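Since the question also accepts JavaScript, here is a rough counterpart of the three counts above, followed by a name-length check built on the grapheme count. It is a sketch under assumptions: graphemeLength and isNameLengthValid are hypothetical helper names, and the limit of 100 is taken from the question. Note that JavaScript's .length counts UTF-16 code units rather than bytes, so the first number differs from PHP's strlen.

```javascript
// The example emoji from the question, written as UTF-16 escapes.
const emoji = '\ud83d\udc69\u200d\u2764\ufe0f\u200d\ud83d\udc8b\u200d\ud83d\udc69';

// Built-in segmenter that yields user-perceived characters (grapheme clusters).
const graphemeSegmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });

// Hypothetical helper: number of grapheme clusters in a string.
function graphemeLength(str) {
  let count = 0;
  for (const _part of graphemeSegmenter.segment(str)) {
    count += 1;
  }
  return count;
}

// Hypothetical helper: accept names of at most `maxGraphemes` visible characters.
function isNameLengthValid(name, maxGraphemes = 100) {
  return graphemeLength(name) <= maxGraphemes;
}

console.log(emoji.length);          // 11, counts UTF-16 code units
console.log([...emoji].length);     // 8, counts code points
console.log(graphemeLength(emoji)); // 1, counts grapheme units

console.log(isNameLengthValid('Alex ' + emoji)); // true (6 graphemes)
console.log(isNameLengthValid('a'.repeat(101))); // false
```

Because one grapheme can span many bytes, it may also be worth keeping a separate byte or code-point cap if the name is stored in a fixed-width database column.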

Upvotes: 10
