Anilkumar Kanneboina

Reputation: 51

How to count Telugu (Indic script) characters

I have some JavaScript that counts the total number of characters in a text box. It works fine with English, but when I type Telugu script it shows the wrong count. For example:

Anil = 4
అనిల్ = 4

But అనిల్ is only three letters in Telugu script. How can I count Indic script characters correctly?

Upvotes: 5

Views: 1278

Answers (1)

georg

Reputation: 215049

I don't know anything about Telugu, so the following might be completely wrong. Let me know.

"అనిల్".split("") returns ["అ", "న", "ి", "ల", "్"] for me. The characters at indices 2 and 4 (counting from zero) appear to be combining marks rather than letters. We only want to count actual letters, so let's remove everything that isn't called a Telugu Letter in http://www.unicode.org/Public/UNIDATA/UnicodeData.txt and count the rest:

// Strip everything outside the Telugu letter ranges, then count what is left
var str = "అనిల్";
var len = str.replace(/[^\u0C05-\u0C39\u0C58-\u0C61]/g, '').length;

This gives 3, as expected.
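
If your environment supports Unicode property escapes in regular expressions (ES2018), the same idea can probably be written without hard-coding the Telugu ranges. This is only a sketch: countLetters is a name I made up, and it assumes that "letter" means any code point in Unicode general category L, so vowel signs and the virama are skipped because they are marks, not letters.

```javascript
// Hypothetical helper: counts code points whose Unicode general category
// is Letter (\p{L}). Combining marks such as Telugu vowel signs and the
// virama are category M, so they are not counted.
function countLetters(text) {
  // The "u" flag enables \p{...}; "g" collects every match.
  const matches = text.match(/\p{L}/gu);
  return matches === null ? 0 : matches.length;
}

console.log(countLetters("అనిల్")); // Telugu sample from the question
console.log(countLetters("Anil"));  // Latin sample from the question
```

Like the range-based version, this counts each consonant separately, so a conjunct such as క్ష (two consonants joined by a virama) would be counted as two letters. Whether that matches what you consider one character is something you would have to decide.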

Upvotes: 5
