Reputation: 473
I am trying to edit a transliteration function to meet a special condition. I am using a Russian into Latin transliteration that I have adapted for Bulgarian, but there is the special condition that if the letter "я", normally transliterated as "ya", is found at the end of a word, and if preceded by the letter "и", it is transliterated as "a".
Example: without the special condition, "мистерия" is transliterated as "misteriya", whereas the correct transliteration (according to the Bulgarian Law on Transliteration) is "misteria".
What I've tried (it doesn't work):
function transliterator($string) {
    $converter = array(
        'а' => 'a', 'б' => 'b', 'в' => 'v',
        'г' => 'g', 'д' => 'd', 'е' => 'e',
        'ж' => 'zh', 'з' => 'z', 'и' => 'i',
        'й' => 'y', 'к' => 'k', 'л' => 'l',
        'м' => 'm', 'н' => 'n', 'о' => 'o',
        'п' => 'p', 'р' => 'r', 'с' => 's',
        'т' => 't', 'у' => 'u', 'ф' => 'f',
        'х' => 'h', 'ц' => 'ts', 'ч' => 'ch',
        'ш' => 'sh', 'щ' => 'sht', 'ь' => '',
        'ъ' => 'a', 'ю' => 'yu', 'я' => 'ya',
        'А' => 'A', 'Б' => 'B', 'В' => 'V',
        'Г' => 'G', 'Д' => 'D', 'Е' => 'E',
        'Ж' => 'Zh', 'З' => 'Z', 'И' => 'I',
        'Й' => 'Y', 'К' => 'K', 'Л' => 'L',
        'М' => 'M', 'Н' => 'N', 'О' => 'O',
        'П' => 'P', 'Р' => 'R', 'С' => 'S',
        'Т' => 'T', 'У' => 'U', 'Ф' => 'F',
        'Х' => 'H', 'Ц' => 'Ts', 'Ч' => 'Ch',
        'Ш' => 'Sh', 'Щ' => 'Sht', 'Ь' => '',
        'Ъ' => 'A', 'Ю' => 'Yu', 'Я' => 'Ya',
    );
    $words = explode(" ", $string);
    $trans_string = "";
    foreach($words as $word) {
        if((strlen($word > 2)) && (strpos($word, "я", -1)) && (strpos($word, "и", -2))) {
            $amend = array("я" => "a", "Я" => "A");
            $converter = array_replace($converter, $amend);
        }
        $trans_word = strtr($word, $converter);
        $trans_string .= $trans_word." ";
    }
    return $trans_string;
}
Some help please?
Upvotes: 0
Views: 187
Reputation: 350252
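The problem is that strpos and strtr are not multibyte-aware: they work on bytes, and in UTF-8 each Cyrillic letter takes two bytes, so the offsets -1 and -2 do not point at the last two letters. For strpos you could fix this by using mb_strpos or, even better, rewrite the condition as (mb_substr($word, -2) == 'ия').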
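To make the byte-versus-character point concrete, here is a small sketch. It assumes the mbstring extension is loaded, PHP 7 or later, and that the source file is saved as UTF-8; ends_with_iya is a hypothetical helper name used only for illustration, not something from the question.

```php
<?php
$word = 'мистерия';

// strlen() counts bytes; each Cyrillic letter is two bytes in UTF-8.
var_dump(strlen($word));               // int(16)
var_dump(mb_strlen($word, 'UTF-8'));   // int(8)

// substr() cuts bytes, so the "last two" bytes are just the single letter "я".
var_dump(substr($word, -2) === 'я');   // bool(true)

// mb_substr() cuts characters, so this really is the last two letters.
var_dump(mb_substr($word, -2, null, 'UTF-8') === 'ия');   // bool(true)

// Hypothetical helper: does the word end in "ия" (in any letter case)?
function ends_with_iya(string $word): bool {
    $tail = mb_substr($word, -2, null, 'UTF-8');
    return mb_strtolower($tail, 'UTF-8') === 'ия';
}
```

The explicit 'UTF-8' argument is optional on most setups, but it avoids depending on the configured default encoding.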
Still, you'll have problems with the rest of the function: once $converter has been amended for one word, it stays amended for every word after it. I would suggest using preg_replace_callback instead; with the /u modifier you'll have multibyte support:
function transliterator($string) {
    // 'ия\b' is tried first, so a word-final "ия" is matched as one unit;
    // otherwise '.' matches a single (multibyte) character.
    return preg_replace_callback('/ия\b|./uis', function ($c) {
        $converter = array(
            'а' => 'a', 'б' => 'b', 'в' => 'v',
            'г' => 'g', 'д' => 'd', 'е' => 'e',
            'ж' => 'zh', 'з' => 'z', 'и' => 'i',
            'й' => 'y', 'к' => 'k', 'л' => 'l',
            'м' => 'm', 'н' => 'n', 'о' => 'o',
            'п' => 'p', 'р' => 'r', 'с' => 's',
            'т' => 't', 'у' => 'u', 'ф' => 'f',
            'х' => 'h', 'ц' => 'ts', 'ч' => 'ch',
            'ш' => 'sh', 'щ' => 'sht', 'ь' => '',
            'ъ' => 'a', 'ю' => 'yu', 'я' => 'ya',
            'А' => 'A', 'Б' => 'B', 'В' => 'V',
            'Г' => 'G', 'Д' => 'D', 'Е' => 'E',
            'Ж' => 'Zh', 'З' => 'Z', 'И' => 'I',
            'Й' => 'Y', 'К' => 'K', 'Л' => 'L',
            'М' => 'M', 'Н' => 'N', 'О' => 'O',
            'П' => 'P', 'Р' => 'R', 'С' => 'S',
            'Т' => 'T', 'У' => 'U', 'Ф' => 'F',
            'Х' => 'H', 'Ц' => 'Ts', 'Ч' => 'Ch',
            'Ш' => 'Sh', 'Щ' => 'Sht', 'Ь' => '',
            'Ъ' => 'A', 'Ю' => 'Yu', 'Я' => 'Ya',
            'ия' => 'ia', 'ИЯ' => 'IA', // add this!
            'Ия' => 'Ia', 'иЯ' => 'iA'  // the i flag also matches mixed case
        );
        $c = reset($c); // we just need the first element of that array
        return isset($converter[$c]) ? $converter[$c] : $c;
    }, $string);
}
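A quick way to check the behaviour is a cut-down version of the same idea. This is only a sketch: demo_translit is a hypothetical name, the table is deliberately abbreviated to the letters in the sample words, and it assumes PHP 7 or later (for the ?? operator) and a UTF-8 source file.

```php
<?php
// Abbreviated table: only the letters needed for the sample words below.
function demo_translit(string $string): string {
    $converter = [
        'м' => 'm', 'и' => 'i', 'с' => 's', 'т' => 't', 'е' => 'e',
        'р' => 'r', 'п' => 'p', 'л' => 'l', 'я' => 'ya',
        'ия' => 'ia', // word-final special case
    ];
    // 'ия\b' is tried before '.', so a word-final "ия" is replaced as a unit.
    return preg_replace_callback('/ия\b|./us', function ($m) use ($converter) {
        return $converter[$m[0]] ?? $m[0]; // unknown characters pass through
    }, $string);
}

echo demo_translit('мистерия'), "\n"; // misteria
echo demo_translit('приятел'), "\n";  // priyatel
```

In the second word the "ия" sits in the middle, so \b does not match after it and the two letters are transliterated separately.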
Upvotes: 1