a_fan

Reputation: 387

PHP: Arabic characters as array keys

I want to implement a simple Arabic to English transliteration. I have defined a mapping array like the following:

$mapping = array('ﺏ' => 'b', 'ﺕ' => 't', ...)

I expect the following code to convert an Arabic string to its corresponding transliteration:

$str = "رضي الدين";
$strlen = mb_strlen( $str, "UTF-8" );
for( $i = 0; $i < $strlen; $i++ ) {
    $char = mb_substr( $str, $i, 1, "UTF-8" );
    echo bin2hex($char); // 'd8b1' for ر
    // echo $mapping["$char"];
}

But $char does not match the keys. How can this be solved?

The source code is loaded in UTF-8.

EDIT

When I run bin2hex() on each key of $mapping, I get values different from those I get for the corresponding $char. For example, for ﺭ I get efbaad for the key and d8b1 for $char. They obviously don't match, so the characters are not converted.

foreach ($mapping as $k => $v) {
    echo $k . ' ' . bin2hex($k) . '<br>'; // 'efbaad' for ﺭ
}
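
The mismatch can be reproduced in isolation. The following is a minimal sketch, assuming PHP 7 or later for the "\u{...}" escape syntax; the variable names are illustrative only. It shows that the letter as typed in running text (U+0631) and its isolated presentation form (U+FEAD) are two distinct code points with different UTF-8 bytes, even though they are drawn the same way:

```php
<?php
// Two different code points that are rendered as the same letter "reh".
$base = "\u{0631}"; // ARABIC LETTER REH, as found in ordinary typed text
$form = "\u{FEAD}"; // ARABIC LETTER REH ISOLATED FORM (a presentation form)

echo bin2hex($base), "\n"; // d8b1
echo bin2hex($form), "\n"; // efbaad

// PHP compares strings byte by byte, so the two never match as array keys.
var_dump($base === $form); // bool(false)
```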

Only 'ي' gets the same value in both cases and is converted.

I do not know what the problem is!

EDIT2

This chart actually shows that both of these codes refer to ر.

Upvotes: 0

Views: 2309

Answers (2)

Larry.Z

Reputation: 3724

The problem is that you didn't specify the encoding for both mb_strlen() and mb_substr(); the following works okay:

$str = "رضي الدين";
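$mapping = array('ﺏ' => 'b', 'ﺕ' => 't', 'ر' => 'r');
$strlen = mb_strlen( $str, "UTF-8" );
for( $i = 0; $i < $strlen; $i++ ) { // "<" rather than "<=", which would read one position past the end
    $char = mb_substr( $str, $i, 1 , "UTF-8");
    echo isset($mapping[$char]) ? $mapping[$char] : $char; // unmapped characters pass through unchanged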
}
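
Note that the key for ر in this example is the plain letter, whereas the bin2hex() output in the question suggests the original keys were pasted as presentation forms. If that is the case, one possible fix is to normalize the keys once before the lookup. The sketch below assumes the intl extension is loaded (it provides the Normalizer class) and PHP 7 or later; normalizeKey() is a hypothetical helper name, and the three code points are assumed to be the isolated forms used as keys in the question:

```php
<?php
// Hypothetical helper: fold a presentation-form character to its base letter
// using Unicode compatibility normalization (NFKC).
function normalizeKey(string $text): string
{
    if (!class_exists('Normalizer')) {
        return $text; // intl not loaded: leave the text unchanged
    }
    $result = Normalizer::normalize($text, Normalizer::FORM_KC);
    return $result === false ? $text : $result;
}

// Keys written as isolated presentation forms (assumed from the question).
$mapping = ["\u{FE8F}" => 'b', "\u{FE95}" => 't', "\u{FEAD}" => 'r'];

// Rebuild the mapping with normalized keys.
$normalized = [];
foreach ($mapping as $key => $value) {
    $normalized[normalizeKey($key)] = $value;
}

// Print the UTF-8 bytes of each key to check what the lookup will see.
foreach ($normalized as $key => $value) {
    echo bin2hex($key), ' => ', $value, "\n";
}
```

With intl available, the keys should come out as the base letters (d8b1 for ر), so they can match characters taken from ordinary input. Retyping the keys with a regular Arabic keyboard layout would achieve the same result without the extension.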

Upvotes: 2

Alma Do

Reputation: 37365

I suggest using the preg engine, since it works well with UTF-8 natively (via the /u modifier). mb_* is not a bad choice, of course, but I think it is more complicated.

I've made a sample for your case:

$sData     = "رضي الدين";
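$rgReplace = [
  'ﺏ' => 'b', 
  'ﺕ' => 't', 
  'ن' => 'n', 
  'ي' => 'i', 
  'د' => 'f', 
  'ل' => 'l', 
  'ا' => 'a',
  'ر' => 'r', 
  'ض' => 'g',
  ' ' => ' '
];
// The /u modifier makes "." match one UTF-8 character instead of one byte.
$sResult   = preg_replace_callback('/./u', function($sChar) use ($rgReplace)
{
   // $sChar is the match array; index 0 holds the matched character.
   // Characters without a mapping are returned unchanged.
   return isset($rgReplace[$sChar[0]]) ? $rgReplace[$sChar[0]] : $sChar[0];
}, $sData);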
echo $sResult; //rgi alfin

As for your code, try passing the encoding explicitly (it is the last parameter of the mb_* functions: the second for mb_strlen(), the fourth for mb_substr()).
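
For a plain character-to-string table, strtr() with an array argument may be a simpler alternative to both approaches. This is a sketch, not part of the sample above: strtr() compares bytes, but because the keys are complete UTF-8 characters it should not produce partial matches on valid UTF-8 input. The mapping values below are illustrative, and the keys are assumed to be typed as ordinary letters rather than presentation forms:

```php
<?php
$sData = "رضي الدين";

// Illustrative transliteration table; the keys are base Arabic letters.
$rgReplace = [
    'ر' => 'r',
    'ض' => 'd',
    'ي' => 'i',
    'ا' => 'a',
    'ل' => 'l',
    'د' => 'd',
    'ن' => 'n',
];

// strtr() replaces every key it finds and leaves everything else
// (here, the space) untouched.
$sResult = strtr($sData, $rgReplace);
echo $sResult; // rdi aldin
```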

Upvotes: 2
