DroidOS

Reputation: 8890

Recognising Emojis in typed text

In my hybrid Android/Cordova app I want to allow users to associate an Emoji with a descriptive "handle"/name. I have found that with the default HTML input box in Android 8+ - remember this is a hybrid app, so the UI is in fact a WebView derived directly from Chrome - it is possible to simply switch to the Emoji keyboard and choose an Emoji. My understanding is that these Emojis come from Google's Noto font project. The format I want the user to be able to use for entering the emoji + handle is

🏁 handle

where the handle is required to be alphanumeric. Testing the alphanumeric part and the preceding space with a regex is not a problem. However, I also want to institute a check (not obligatory) that the leading character is an Emoji. Once again, this can be done by reading the first two UTF-16 code units with userhandle.charCodeAt(0) and userhandle.charCodeAt(1).
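
A minimal sketch of the check I have in mind (the variable names and the exact pattern are illustrative only):

var userhandle = "🏁 pitstop42";                      // illustrative input
var formatOk = /^.+ [A-Za-z0-9]+$/.test(userhandle);  // space + alphanumeric handle
var unit0 = userhandle.charCodeAt(0);                 // first UTF-16 code unit
var unit1 = userhandle.charCodeAt(1);                 // second UTF-16 code unit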

To check the validity of the numbers thus returned I need to know what constitutes a valid Noto font Emoji code. This article seems to suggest that all valid Emojis should have 0xF09F or 0xE29C as the leading two bytes - I am going to ignore the three-byte Emojis listed in that resource as being invalid, for simplicity.
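
For reference, charCodeAt returns UTF-16 code units rather than raw UTF-8 bytes, so for the 🏁 example (U+1F3C1) the leading values would be the surrogate pair rather than 0xF09F:

"🏁".charCodeAt(0).toString(16)   // "d83c" - high surrogate
"🏁".charCodeAt(1).toString(16)   // "dfc1" - low surrogate
"🏁".codePointAt(0).toString(16)  // "1f3c1" - the full code point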

However, before I implement this I would like to know: is there an established way of validating Emoji Unicode that I am unaware of here?

Upvotes: 2

Views: 512

Answers (1)

BaldProgrammer

Reputation: 122

This article gives a lot of detail about emojis in JavaScript and provides a regular expression you can use. I think this works for all emojis, but you will want to test it.

Here I test the regular expression against the 🍔 emoji:

/(?:[\u2700-\u27bf]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff])[\ufe0e\ufe0f]?(?:[\u0300-\u036f\ufe20-\ufe23\u20d0-\u20f0]|\ud83c[\udffb-\udfff])?(?:\u200d(?:[^\ud800-\udfff]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff])[\ufe0e\ufe0f]?(?:[\u0300-\u036f\ufe20-\ufe23\u20d0-\u20f0]|\ud83c[\udffb-\udfff])?)*/.test(String.fromCodePoint("🍔".codePointAt(0)))  //returns true

Using the same regular expression but testing the "A" character:

/(?:[\u2700-\u27bf]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff])[\ufe0e\ufe0f]?(?:[\u0300-\u036f\ufe20-\ufe23\u20d0-\u20f0]|\ud83c[\udffb-\udfff])?(?:\u200d(?:[^\ud800-\udfff]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff])[\ufe0e\ufe0f]?(?:[\u0300-\u036f\ufe20-\ufe23\u20d0-\u20f0]|\ud83c[\udffb-\udfff])?)*/.test(String.fromCodePoint("A".codePointAt(0)))  //returns false

If you just want to get the codepoint, you can use:

"🍔".codePointAt(0)  //returns 127828

Upvotes: 4
