Howard

Reputation: 4604

How to count text length with special characters?

For example, given const words = 'a̋b̋';, words.length is 4, but I expect 2 as the "real" length.

Alternatively, is there a safe way to iterate over all the characters in the string above?

Upvotes: 5

Views: 2281

Answers (1)

T.J. Crowder

Reputation: 1075567

There's nothing built into JavaScript that will help you differentiate those combining marks from other characters. You could build something, of course, using the reference information from http://unicode.org. :-)
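As a rough idea of what "building something" might look like, here is a minimal sketch that assumes the engine supports Unicode property escapes in regular expressions and that the only thing to handle is combining marks (general category M). The helper name splitCombining is hypothetical, and the accent in the question is assumed to be U+030B. It does not implement the full Unicode segmentation rules, so emoji sequences and similar cases are out of scope.

```javascript
// Hypothetical helper: groups each base character with the combining
// marks that follow it. Stray leading marks are kept as their own group.
function splitCombining(text) {
  // \P{M} = one code point that is not a mark, \p{M}* = its trailing marks.
  return text.match(/\P{M}\p{M}*|\p{M}+/gu) ?? [];
}

// Assumption: the question's string is "a" and "b", each followed by U+030B.
const sample = 'a\u030Bb\u030B';
const parts = splitCombining(sample);

console.log(sample.length); // 4 UTF-16 code units
console.log(parts.length);  // 2 base-plus-marks groups

// Iterating over the groups instead of over code units:
for (const part of parts) {
  console.log(part);
}
```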

...but at least one person seems to have already done so for you: https://github.com/orling/grapheme-splitter

Enter the grapheme-splitter.js library. It can be used to properly split JavaScript strings into what a human user would call separate letters (or "extended grapheme clusters" in Unicode terminology), no matter what their internal representation is. It is an implementation of the Unicode UAX-29 standard.

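// Assumes GraphemeSplitter is already loaded (script tag or module import).
const words = 'a̋b̋';
const splitter = new GraphemeSplitter();
// Split into user-perceived characters rather than UTF-16 code units.
const graphemes = splitter.splitGraphemes(words);
console.log(graphemes);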

That results in two entries in graphemes, "a̋" and "b̋", so graphemes.length is the expected 2. (I can't include a live example, because live links to GitHub raw pages are disallowed.)
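Newer JavaScript runtimes also ship Intl.Segmenter, which can segment by grapheme without an extra library. The following is only a sketch under the assumption that your target environment provides it (check before relying on it). The helper name toGraphemes is hypothetical, and the accent is again assumed to be U+030B.

```javascript
// Hypothetical helper built on Intl.Segmenter (assumed to be available).
function toGraphemes(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  // Each item yielded by segment() carries the cluster in its `segment` property.
  return Array.from(segmenter.segment(text), (item) => item.segment);
}

const input = 'a\u030Bb\u030B';
const clusters = toGraphemes(input);

console.log(clusters.length); // 2

// Safe iteration over user-perceived characters:
for (const cluster of clusters) {
  console.log(cluster);
}
```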

Upvotes: 3
