Reputation: 17013
Unicode characters (code points) outside the Basic Multilingual Plane (BMP) are represented in JavaScript strings by two chars (code units), called a surrogate pair.
'ab' is two code units and two code points. (So two chars and two characters.)
'a💩' is three code units and two code points. (So three chars and two characters.)
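To make the distinction concrete, here is a minimal sketch. The helper names countUnits and countCodePoints are made up for illustration; they only wrap the built-in .length property and string iteration.

```javascript
// Hypothetical helpers, named here only for illustration.

// .length counts UTF-16 code units (chars).
function countUnits(s) {
  return s.length;
}

// String iteration (used by spread) walks code points, so a surrogate
// pair comes out as a single array element.
function countCodePoints(s) {
  return [...s].length;
}

console.log(countUnits('ab'), countCodePoints('ab'));   // 2 2
console.log(countUnits('a💩'), countCodePoints('a💩')); // 3 2
```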
My code does not need to work with old versions of JavaScript. ES6 or whatever is most modern is fine.
How can I access the last character, irrespective of whether it's an astral character or not?
Splitting the string into "all but last character" and "final character" is also fine.
Upvotes: 1
Views: 310
Reputation: 17013
I knew from answers to other SO questions that both Array.from() and regular expressions with the /u flag correctly handle non-BMP Unicode characters, but I didn't think either was likely to be the best answer.
Maybe I was wrong, so here are two solutions:
Array.from()
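// pop() returns the last code point whatever the string's length;
// a fixed index such as [1] only works for this two-character example.
let c = Array.from('a💩').pop();
console.log(c);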
u flag
let c = 'a💩'.match(/.$/u)[0];
console.log(c);
This second approach can be extended to answer the second part of my question too:
let [,l,r] = 'abcd💩'.match(/(.*)(.)/u);
console.log(l);
console.log(r);
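(No anchor needed as the .* will be greedy. Note that . does not match line terminators unless the s flag is also set, so a string containing newlines needs /su.)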
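The regex approach can be wrapped in a small function. This is only a sketch: splitLast is a hypothetical name, and adding the s flag, explicit anchors, and an empty-string fallback are my own choices rather than part of the one-liner above.

```javascript
// Hypothetical helper: split a string into [allButLast, last] by code point.
function splitLast(str) {
  // u: "." matches a whole code point, including astral ones.
  // s: "." also matches line terminators (ES2018).
  const m = str.match(/^(.*)(.)$/su);
  // match() returns null for the empty string, so fall back to two empties.
  return m ? [m[1], m[2]] : ['', ''];
}

console.log(splitLast('abcd💩')); // [ 'abcd', '💩' ]
console.log(splitLast('a\n💩'));  // [ 'a\n', '💩' ]
console.log(splitLast(''));       // [ '', '' ]
```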
Upvotes: 1
Reputation: 21911
Spreading a string splits it into its code points, so pop() on the resulting array returns the last one:
[...'a💩'].pop()
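The same idea also covers the "all but last character" part of the question. Below is a rough sketch with two hypothetical helpers (the names are mine): splitLastSpread builds on the spread, and lastChar is an alternative that avoids building an array for long strings by checking whether the string ends in a surrogate pair via codePointAt().

```javascript
// Hypothetical helper: [allButLast, last] using spread.
function splitLastSpread(str) {
  const chars = [...str];         // one array element per code point
  const last = chars.pop() ?? ''; // pop() is undefined for an empty string
  return [chars.join(''), last];
}

// Hypothetical helper: last code point without allocating an array.
function lastChar(str) {
  if (str.length >= 2) {
    // codePointAt() returns a value above 0xFFFF only when the unit at this
    // index is a high surrogate followed by a low surrogate.
    const cp = str.codePointAt(str.length - 2);
    if (cp > 0xFFFF) return str.slice(-2);
  }
  return str.slice(-1);           // '' for an empty string
}

console.log(splitLastSpread('abcd💩')); // [ 'abcd', '💩' ]
console.log(lastChar('a💩'));           // 💩
console.log(lastChar('💩a'));           // a
```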
Upvotes: 2