AzizStark

Reputation: 1504

How to get the correct element from a unicode string?

I want to get specific letters from a Unicode string by index. However, it doesn't work as expected.

Example:

var handwriting = `𝖆𝖇𝖈𝖉𝖊𝖋𝖌𝖍𝖎𝖏𝖐𝖑𝖒𝖓𝖔𝖕𝖖𝖗𝖘𝖙𝖚𝖛𝖜𝖝𝖞𝖟𝕬𝕭𝕮𝕯𝕰𝕱𝕲𝕳𝕴𝕵𝕶𝕷𝕸𝕹𝕺𝕻𝕼𝕽𝕾𝕿𝖀𝖁𝖂𝖃𝖄𝖅1234567890`
var normal = `abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890` 

console.log(normal[3]) // gives 'd' but
console.log(handwriting[3]) // gives '�' instead of '𝖉'

Also, length doesn't work as expected: normal.length gives the correct value, 62, but handwriting.length gives 114.

Indexing doesn't work as expected. How can I access the individual characters of a Unicode string?

I tried this in Python and it works perfectly, but in JavaScript it does not.

I need the exact characters from the Unicode string: for index 3, the expected output is 'd' for normal and '𝖉' for handwriting.

Upvotes: 0

Views: 273

Answers (2)

adiga

Reputation: 35222

In JavaScript, a string is a sequence of 16-bit code units. Since these characters lie above the Basic Multilingual Plane, each one is represented by a pair of code units, known as a surrogate pair.

Reference

The code point of 𝖆 is U+1D586, and 0x1D586 is greater than 0xFFFF (2^16 - 1), the largest value a single 16-bit code unit can hold. So 𝖆 is represented by a pair of code units, a surrogate pair:

console.log("𝖆".length) // 2, one per UTF-16 code unit
console.log("𝖆" === "\uD835\uDD86") // true
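To make the surrogate pair less mysterious, here is a minimal sketch of how the two code units can be derived from the code point. The helper name toSurrogatePair is made up for illustration and is not a built-in; it assumes the input is above U+FFFF.

```javascript
// Hypothetical helper (not a built-in): split a code point above U+FFFF
// into its UTF-16 high and low surrogate code units.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;      // 20-bit value to distribute
  const high = 0xD800 + (offset >> 10);    // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);   // bottom 10 bits
  return [high, low];
}

const [high, low] = toSurrogatePair(0x1D586);
console.log(high.toString(16), low.toString(16));    // d835 dd86
console.log(String.fromCharCode(high, low) === "𝖆"); // true
console.log("𝖆".codePointAt(0).toString(16));        // 1d586
```

In practice you rarely need to do this by hand: String.fromCodePoint() and codePointAt() handle the conversion in both directions.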

One way is to create an array of characters using the spread syntax or Array.from(), both of which iterate the string by code point rather than by code unit, and then read the index you need:

var handwriting = `𝖆𝖇𝖈𝖉𝖊𝖋𝖌𝖍𝖎𝖏𝖐𝖑𝖒𝖓𝖔𝖕𝖖𝖗𝖘𝖙𝖚𝖛𝖜𝖝𝖞𝖟𝕬𝕭𝕮𝕯𝕰𝕱𝕲𝕳𝕴𝕵𝕶𝕷𝕸𝕹𝕺𝕻𝕼𝕽𝕾𝕿𝖀𝖁𝖂𝖃𝖄𝖅1234567890`

console.log([...handwriting][3]) // 𝖉
console.log(Array.from(handwriting)[3]) // 𝖉
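If you only need one character and would rather not build the whole array, a for...of loop over the string also walks it by code point. The following is only a sketch; codePointCharAt is a hypothetical helper name, and it counts code points, not user-perceived characters such as emoji sequences or letters with combining marks.

```javascript
// Hypothetical helper: return the character at a code point index,
// or undefined if the index is out of range.
function codePointCharAt(str, index) {
  let i = 0;
  for (const ch of str) {        // string iteration yields whole code points
    if (i === index) return ch;
    i++;
  }
  return undefined;
}

const sample = "𝖆𝖇𝖈𝖉𝖊";
console.log(codePointCharAt(sample, 3));  // 𝖉
console.log(codePointCharAt("abcde", 3)); // d
```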

Upvotes: 3

Loïck M

Reputation: 358

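String.length counts UTF-16 code units, not characters. A character such as 'é' fits in one code unit ('\u00E9'), but each of your letters needs two (for example, '𝖉' is '\uD835\uDD89'), so it is normal for the length to be larger than you expect. To get the real number of characters in a Unicode string, convert it to an array: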

let charArray = [...handwriting]
console.log(charArray.length) // 62

Each item of the array is one character of your string, so charArray[3] returns '𝖉'.
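Assuming the end goal is to translate plain text into the styled alphabet (the question does not say so explicitly), the two arrays can be used as a lookup table. This is a rough sketch with shortened alphabets; toFancy, plain and fancy are names invented for the example.

```javascript
// Shortened alphabets for illustration; spread gives one element per code point.
const fancy = [..."𝖆𝖇𝖈𝖉"];
const plain = [..."abcd"];

console.log(fancy.length); // 4

// Hypothetical helper: replace each known plain letter with the styled
// letter at the same index, leaving any other character untouched.
function toFancy(text) {
  return [...text].map(ch => {
    const i = plain.indexOf(ch);
    return i === -1 ? ch : fancy[i];
  }).join("");
}

console.log(toFancy("bad")); // 𝖇𝖆𝖉
```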

Upvotes: 2
