hippietrail

Reputation: 17013

Get the last character of a string in modern JavaScript, allowing for astral characters such as emoji that use surrogate pairs (two code units)

Unicode characters (code points) outside the Basic Multilingual Plane (BMP) are represented in JavaScript strings by two chars (code units), called a surrogate pair.

'ab' is two code units and two code points. (So two chars and two characters.)

'a💩' is three code units and two code points. (So three chars and two characters.)
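
To make the distinction concrete, here is a minimal sketch. The helper names countCodeUnits and countCodePoints are made up for illustration; the only assumption is that the string iterator (used by spread) steps through code points, which is standard since ES6.

```javascript
// Hypothetical helpers, named for illustration only.

// .length counts UTF-16 code units.
function countCodeUnits(str) {
  return str.length;
}

// Spreading uses the string iterator, which yields whole code points,
// so a surrogate pair counts once.
function countCodePoints(str) {
  return [...str].length;
}

for (const s of ['ab', 'a💩']) {
  console.log(s, countCodeUnits(s), countCodePoints(s));
}
```

This is also why plain indexing such as str[str.length - 1] is not enough: for 'a💩' it returns only the second half of the surrogate pair.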

My code does not need to work with old versions of JavaScript; ES6 or whatever is most modern is fine.

How can I access the last character, irrespective of whether it's an astral character or not?

Splitting the string into "all but last character" and "final character" is also fine.

Upvotes: 1

Views: 310

Answers (2)

hippietrail

Reputation: 17013

I knew from answers to other SO questions that both Array.from() and regular expressions with the /u flag handle non-BMP Unicode characters correctly, but I didn't think either was likely to be the best answer.

Maybe I was wrong, so here are two solutions:

Array.from()

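// Array.from() iterates by code point, so the last element is the last character.
let chars = Array.from('a💩');
let c = chars[chars.length - 1];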
console.log(c);

u flag

// u: "." consumes a whole code point; s: "." also matches a trailing line terminator.
let c = 'a💩'.match(/.$/su)[0];
console.log(c);

This second approach can be extended to answer the second part of my question too:

// The s flag lets "." match line terminators, so multi-line strings split correctly too.
let [, l, r] = 'abcd💩'.match(/(.*)(.)/su);
console.log(l);
console.log(r);

(No anchor needed as the .* will be greedy.)
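
One caveat with the regex approach is that match() returns null for an empty string, so destructuring the result directly would throw. A minimal sketch of a wrapper follows; the name splitLast and the choice to return two empty strings for empty input are my own assumptions, not anything standard.

```javascript
// Hypothetical helper: split a string into [allButLast, last] by code point.
function splitLast(str) {
  // u: "." consumes a whole code point (a surrogate pair is never split).
  // s: "." also matches line terminators, so a trailing "\n" counts as a character.
  const m = str.match(/^(.*)(.)$/su);
  // An empty string has no last character; returning ['', ''] is an arbitrary choice.
  return m ? [m[1], m[2]] : ['', ''];
}

const [head, last] = splitLast('abcd💩');
console.log(head);
console.log(last);
```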

Upvotes: 1

Andreas

Reputation: 21911

Spreading a string splits it into its code points, so the last array element is the last character:

[...'a💩'].pop()
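
Spreading builds an array of every code point, which is wasteful for long strings. As an alternative sketch (not part of the answer above; lastCodePoint and splitLastCodePoint are made-up names), you can inspect only the final one or two code units and check for a surrogate pair:

```javascript
// Hypothetical helper: return the last code point of str as a string.
function lastCodePoint(str) {
  const n = str.length;
  if (n === 0) return '';
  const last = str.charCodeAt(n - 1);
  // A low (trailing) surrogate lies in 0xDC00..0xDFFF.
  if (n >= 2 && last >= 0xDC00 && last <= 0xDFFF) {
    const prev = str.charCodeAt(n - 2);
    // A high (leading) surrogate lies in 0xD800..0xDBFF.
    if (prev >= 0xD800 && prev <= 0xDBFF) return str.slice(n - 2);
  }
  // BMP character, or an unpaired surrogate, which is returned as-is.
  return str.slice(n - 1);
}

// Hypothetical helper: split into [allButLast, last].
function splitLastCodePoint(str) {
  const last = lastCodePoint(str);
  return [str.slice(0, str.length - last.length), last];
}

console.log(lastCodePoint('a💩'));
console.log(splitLastCodePoint('abcd💩'));
```

Like the spread version, this works on code points, not user-perceived characters, so an emoji built from several code points (for example with a skin-tone modifier) would still be split.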

Upvotes: 2
