Farshid Ashouri

Reputation: 17701

Comparing words in JS gives a very strange result

I have a Persian word that I copied from a text, and I typed the same word on my keyboard:

a = 'ﺧﻮاب'
"ﺧﻮاب"

b='خواب'
"خواب"

// let's compare
a==b
false

Can someone explain why? (You can test it yourself!)

Upvotes: 3

Views: 164

Answers (5)

一二三

Reputation: 21249

The first two characters of each sequence are different:

  • a: U+FEA7 U+FEEE ...
  • b: U+062E U+0648 ...

They look the same because a uses "presentation form" versions of the characters in b. Each presentation form encodes one specific positional shape of a letter (isolated, initial, medial or final). In this case they are ARABIC LETTER KHAH INITIAL FORM and ARABIC LETTER WAW FINAL FORM. The ordinary characters in b (ARABIC LETTER KHAH and ARABIC LETTER WAW) get the same visual appearance once they are shaped by a font renderer.

These presentation form characters in a exist in Unicode only for backwards compatibility (Unicode now leaves the choice of joining form to the text renderer rather than encoding it in the character). They are compatibility equivalent, not canonically equivalent, to those in b: Normalisation Form C leaves them unchanged, while Normalisation Form KC (or KD) normalises the characters in a into the characters in b.
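As a rough check of the normalisation behaviour, the sketch below compares the two strings before and after normalising with the standard String.prototype.normalize method. The strings are written with \u escapes built from the code points listed in the answers here, on the assumption that those are exactly what the question's a and b contain.

```javascript
// Code points taken from the answers on this page (an assumption about the
// exact contents of the question's strings).
const a = '\uFEA7\uFEEE\u0627\u0628'; // copied text: presentation forms
const b = '\u062E\u0648\u0627\u0628'; // typed text: ordinary Arabic letters

console.log(a === b);                   // false: different code points
console.log(a.normalize('NFC') === b);  // false: NFC keeps presentation forms
console.log(a.normalize('NFKC') === b); // true: NFKC applies compatibility mappings
```

If text may arrive from copy and paste, normalising both sides with the same form before comparing is one way to make the comparison behave as expected.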

Upvotes: 1

Djamel

Reputation: 341

The first two characters are different. You can see the difference by running a.split('') and b.split('') in your browser's console.
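Instead of comparing the split arrays by eye, a small helper can report the positions at which the strings differ. This is only a sketch: differingIndices is a hypothetical helper name, and the strings are rebuilt from the code points quoted in the other answers.

```javascript
// Strings rebuilt from the code points quoted on this page (assumption).
const a = '\uFEA7\uFEEE\u0627\u0628';
const b = '\u062E\u0648\u0627\u0628';

// Hypothetical helper: collect every index where the UTF-16 code units differ.
function differingIndices(x, y) {
  const indices = [];
  const length = Math.max(x.length, y.length);
  for (let i = 0; i < length; i++) {
    if (x[i] !== y[i]) {
      indices.push(i);
    }
  }
  return indices;
}

console.log(differingIndices(a, b)); // [ 0, 1 ]
```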


Upvotes: 2

choz

Reputation: 17868

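Their first two letters are different characters.

var a = 'ﺧﻮاب';
var b = 'خواب';

// Print the UTF-16 code unit of every character in a
for ( var i = 0; i < a.length; i++ ){
    console.log(a.charCodeAt(i));
}
// Print the UTF-16 code unit of every character in b
for ( var j = 0; j < b.length; j++ ){
    console.log(b.charCodeAt(j));
}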

a is [65191, 65262, 1575, 1576]

b is [1582, 1608, 1575, 1576]

If b holds exactly the same characters as a, the comparison succeeds:

var a = 'ﺧﻮاب';
var b = a; // Or you can copy and paste `a` value here.
a == b; // This will return `true`

Upvotes: 3

Jordan Soltman

Reputation: 3883

You can also look at it with a hex editor and see that they have different hex codes. You'll notice that the first two characters are different between the strings.

The first string is: FEA7 FEEE 0627 0628

And the second: 062E 0648 0627 0628

Free hex editor for mac.

Free hex editor for pc.
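If no hex editor is at hand, the same hexadecimal code points can be printed from the console. A minimal sketch, assuming the strings consist of the code points listed above; toHexCodePoints is a hypothetical helper name.

```javascript
// Strings rebuilt from the code points listed above (assumption).
const a = '\uFEA7\uFEEE\u0627\u0628';
const b = '\u062E\u0648\u0627\u0628';

// Hypothetical helper: format each code point as four-digit uppercase hex.
function toHexCodePoints(s) {
  return Array.from(s, function (ch) {
    return ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0');
  }).join(' ');
}

console.log(toHexCodePoints(a)); // FEA7 FEEE 0627 0628
console.log(toHexCodePoints(b)); // 062E 0648 0627 0628
```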

Upvotes: 1

Saad

Reputation: 53849

The easiest way to find differences like this is to paste both strings into a text editor, where you can see that the characters of the two strings are displayed differently.

Upvotes: 1
