Me hdi

Reputation: 1912

Compare Persian characters in PHP

I have two Persian words that look the same, but they do not match. Why? How can I make them match in PHP? (Of course, this is just an example.)

DEMO: https://3v4l.org/u5sUa

    $wordd1='فريدونكنار';
    $wordm2='فریدونکنار';
    if($wordd1 == $wordm2){
        echo 'ok'; // I want this result
    }else{
        echo 'no';
    }

Upvotes: 2

Views: 933

Answers (5)

Majid Askari

Reputation: 590

Use a function to replace the similar-looking characters, then compare the two strings.

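function replaceSimilarChars($string)
{
    // Map the Arabic look-alike letters to their Persian forms.
    $string = str_replace('ي', 'ی', $string); // Arabic yeh -> Persian yeh
    $string = str_replace('ك', 'ک', $string); // Arabic kaf -> Persian kaf
    // any other replacement
    return $string;
}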
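
To tie this back to the question, here is a minimal sketch of the whole comparison. The helper name normalizePersian is made up for illustration, and it only covers the two letter pairs from this thread; other look-alikes would need their own entries.

```php
<?php
// Hypothetical helper: maps the Arabic look-alikes to their Persian forms.
// Only the two pairs discussed in this thread are covered.
function normalizePersian(string $text): string
{
    return str_replace(['ي', 'ك'], ['ی', 'ک'], $text);
}

$wordd1 = 'فريدونكنار'; // typed with Arabic yeh and kaf
$wordm2 = 'فریدونکنار'; // typed with Persian yeh and kaf

// Normalize both sides before comparing, since either one may hold the Arabic forms.
echo normalizePersian($wordd1) === normalizePersian($wordm2) ? 'ok' : 'no';
```

With the two words from the question this should print ok, because both sides end up with the same Persian letters before the comparison runs.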

Upvotes: 0

Moradnejad

Reputation: 3653

Two Persian letters each have two different character values, where the second value comes from the Arabic alphabet.

The first is ی and ي.

The second is ک and ك.

You have to replace all occurrences of the second one with the first one.

Example code: $str = str_replace('ي', 'ی', $str);
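
Because the two forms are hard to tell apart in an editor, it can help to spell the mapping out with Unicode escapes so the direction (Arabic to Persian) is unambiguous. This is only a sketch: the constant and function names are invented here, and the \u{...} escape syntax assumes PHP 7.0 or later.

```php
<?php
// Assumed mapping for illustration: Arabic code points on the left,
// their Persian counterparts on the right.
const ARABIC_TO_PERSIAN = [
    "\u{064A}" => "\u{06CC}", // ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u{0643}" => "\u{06A9}", // ARABIC LETTER KAF -> ARABIC LETTER KEHEH
];

// Hypothetical helper: replaces every mapped Arabic letter in one pass.
function toPersianLetters(string $str): string
{
    return strtr($str, ARABIC_TO_PERSIAN);
}
```

strtr() with an array handles all pairs in a single pass and never re-replaces text it has already substituted, so adding more pairs later does not require chaining more calls.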

Upvotes: 1

Hikmat Sijapati

Reputation: 6994

$wordd1 = 'فريدونكنار';
$wordm2 = 'فریدونکنار';
$result = strcmp($wordd1, $wordm2);
if ($result === 0) {
    echo 'ok';
} else {
    echo 'no';
}

The strcmp() function compares two strings. It is binary-safe and case-sensitive. If it returns 0, the two strings are equal.

Upvotes: 1

user149341

Reputation:

Those strings appear similar, but they are not equal!

The first string contains the characters:

U+641  'ف'  ARABIC LETTER FEH
U+631  'ر'  ARABIC LETTER REH
U+64A  'ي'  ARABIC LETTER YEH       <- 1
U+62F  'د'  ARABIC LETTER DAL
U+648  'و'  ARABIC LETTER WAW
U+646  'ن'  ARABIC LETTER NOON
U+643  'ك'  ARABIC LETTER KAF       <- 2
U+646  'ن'  ARABIC LETTER NOON
U+627  'ا'  ARABIC LETTER ALEF
U+631  'ر'  ARABIC LETTER REH

The second string contains the characters:

U+641  'ف'  ARABIC LETTER FEH
U+631  'ر'  ARABIC LETTER REH
U+6CC  'ی'  ARABIC LETTER FARSI YEH <- 1
U+62F  'د'  ARABIC LETTER DAL
U+648  'و'  ARABIC LETTER WAW
U+646  'ن'  ARABIC LETTER NOON
U+6A9  'ک'  ARABIC LETTER KEHEH     <- 2
U+646  'ن'  ARABIC LETTER NOON
U+627  'ا'  ARABIC LETTER ALEF
U+631  'ر'  ARABIC LETTER REH

The characters in the third and seventh positions (marked as <- 1 and <- 2) are not the same.
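
A listing like the one above can be reproduced in PHP. The sketch below uses two made-up helpers (utf8CodePoint and differingCodePoints) and assumes both inputs are valid UTF-8; if the mbstring extension is available, mb_str_split() and mb_ord() could replace the manual decoding.

```php
<?php
// Hypothetical helper: decode a single UTF-8 character into its code point.
function utf8CodePoint(string $char): int
{
    $bytes = array_values(unpack('C*', $char));
    $count = count($bytes);
    if ($count === 1) {
        return $bytes[0]; // single-byte (ASCII) character
    }
    // Lead byte: drop the length-marker bits, then append 6 bits per continuation byte.
    $codePoint = $bytes[0] & (0xFF >> ($count + 1));
    for ($i = 1; $i < $count; $i++) {
        $codePoint = ($codePoint << 6) | ($bytes[$i] & 0x3F);
    }
    return $codePoint;
}

// Hypothetical helper: report the 1-based character positions where two strings differ.
function differingCodePoints(string $a, string $b): array
{
    // The /u flag makes the empty pattern split between characters, not bytes.
    $charsA = preg_split('//u', $a, -1, PREG_SPLIT_NO_EMPTY);
    $charsB = preg_split('//u', $b, -1, PREG_SPLIT_NO_EMPTY);
    $diffs = [];
    foreach ($charsA as $i => $char) {
        if (isset($charsB[$i]) && $char !== $charsB[$i]) {
            $diffs[$i + 1] = sprintf(
                'U+%04X vs U+%04X',
                utf8CodePoint($char),
                utf8CodePoint($charsB[$i])
            );
        }
    }
    return $diffs;
}

print_r(differingCodePoints('فريدونكنار', 'فریدونکنار'));
```

For the two words from the question this should report positions 3 and 7, matching the marked rows in the listings.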

Upvotes: 3

Sammitch

Reputation: 32232

I don't know how your language works, but it seems you've got look-alike characters in your string.

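// Print each byte of both strings in hex and binary, and flag where they differ.
function illustrate_bytes($str1, $str2) {
    // Stop at the shorter string so $str2[$i] is never read out of range.
    $length = min(strlen($str1), strlen($str2));
    for( $i=0; $i<$length; $i++ ) {
        printf("%02x %08d : %02x %08d : %s\n",
            ord($str1[$i]), decbin(ord($str1[$i])),
            ord($str2[$i]), decbin(ord($str2[$i])),
            $str1[$i] === $str2[$i] ? 'same' : 'diff');
    }
}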

illustrate_bytes('فريدونكنار', 'فریدونکنار');

Output:

d9 11011001 : d9 11011001 : same
81 10000001 : 81 10000001 : same
d8 11011000 : d8 11011000 : same
b1 10110001 : b1 10110001 : same
d9 11011001 : db 11011011 : diff
8a 10001010 : 8c 10001100 : diff
d8 11011000 : d8 11011000 : same
af 10101111 : af 10101111 : same
d9 11011001 : d9 11011001 : same
88 10001000 : 88 10001000 : same
d9 11011001 : d9 11011001 : same
86 10000110 : 86 10000110 : same
d9 11011001 : da 11011010 : diff
83 10000011 : a9 10101001 : diff
d9 11011001 : d9 11011001 : same
86 10000110 : 86 10000110 : same
d8 11011000 : d8 11011000 : same
a7 10100111 : a7 10100111 : same
d8 11011000 : d8 11011000 : same
b1 10110001 : b1 10110001 : same

So the look-alikes are:

  • \xd9\x8a:"ي" and \xdb\x8c:"ی"
  • \xd9\x83:"ك" and \xda\xa9:"ک"

Upvotes: 1
