cheeseus

Reputation: 473

PHP array_replace on special condition (advanced transliteration)

I am trying to edit a transliteration function to handle a special case. I am using a Russian-to-Latin transliteration function that I have adapted for Bulgarian, but there is a special rule: if the letter "я", normally transliterated as "ya", is at the end of a word and is preceded by the letter "и", it is transliterated as "a".

Example: without the special condition, "мистерия" is transliterated as "misteriya", whereas the correct transliteration (according to the Bulgarian Law on Transliteration) is "misteria".

What I've tried:

  1. Split the source string into words.
  2. Check if the word is longer than 2 characters.
  3. Check if the last letter is "я".
  4. Check if the letter before it is "и".
  5. If all of the above conditions are met, replace the values for "я" and "Я" (lowercase and uppercase) in the conversion chart array with their new values.
  6. Concatenate all converted words into a string and output it.
  7. It doesn't work.

    function transliterator($string) {
    $converter = array(
    'а' => 'a',   'б' => 'b',   'в' => 'v',
    'г' => 'g',   'д' => 'd',   'е' => 'e',
    'ж' => 'zh',  'з' => 'z',   'и' => 'i',
    'й' => 'y',   'к' => 'k',   'л' => 'l',
    'м' => 'm',   'н' => 'n',   'о' => 'o',
    'п' => 'p',   'р' => 'r',   'с' => 's',
    'т' => 't',   'у' => 'u',   'ф' => 'f',
    'х' => 'h',   'ц' => 'ts',  'ч' => 'ch',
    'ш' => 'sh',  'щ' => 'sht', 'ь' => '',
    'ъ' => 'a',   'ю' => 'yu',  'я' => 'ya',
    
    'А' => 'A',   'Б' => 'B',   'В' => 'V',
    'Г' => 'G',   'Д' => 'D',   'Е' => 'E',
    'Ж' => 'Zh',  'З' => 'Z',   'И' => 'I',   
    'Й' => 'Y',   'К' => 'K',   'Л' => 'L',   
    'М' => 'M',   'Н' => 'N',   'О' => 'O',   
    'П' => 'P',   'Р' => 'R',   'С' => 'S',   
    'Т' => 'T',   'У' => 'U',   'Ф' => 'F',
    'Х' => 'H',   'Ц' => 'Ts',  'Ч' => 'Ch',
    'Ш' => 'Sh',  'Щ' => 'Sht', 'Ь' => '',
    'Ъ' => 'A',   'Ю' => 'Yu',  'Я' => 'Ya',
    );
    
    $words = explode(" ", $string);
    $trans_string = "";
    foreach($words as $word) {
    
    if((strlen($word > 2)) && (strpos($word, "я", -1)) && (strpos($word, "и", -2))) {
        $amend = array("я" => "a", "Я" => "A");
        $converter = array_replace($converter, $amend);
    }
    $trans_word = strtr($word, $converter);
    $trans_string .= $trans_word." ";
    
    }
    return $trans_string;
    }
    

Some help please?

Upvotes: 0

Views: 187

Answers (1)

trincot

Reputation: 350252

The problem is that strpos and strtr are byte-oriented and not multibyte-aware, while each Cyrillic letter takes two bytes in UTF-8. For strpos you could fix this by using mb_strpos or, even better, rewrite the condition as (mb_substr($word, -2) == 'ия').
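As a rough sketch of that check, assuming the mbstring extension is loaded and the source file is saved as UTF-8: the helper name ends_in_iya is made up for illustration and is not part of the original function.

```php
<?php
// Hypothetical helper (not from the original post): true when $word ends in "ия".
function ends_in_iya(string $word): bool
{
    // mb_substr counts characters, not bytes, so -2 means "the last two letters".
    $tail = mb_substr($word, -2, null, 'UTF-8');
    // Lowercase the tail so that "ИЯ" and "Ия" are recognised as well.
    return mb_strtolower($tail, 'UTF-8') === 'ия';
}

var_dump(ends_in_iya('мистерия')); // bool(true)
var_dump(ends_in_iya('ябълка'));   // bool(false)
var_dump(strlen('я'));             // int(2): strlen counts bytes, not letters
```

The last line shows why byte offsets such as -1 and -2 do not line up with letters in a UTF-8 string.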

Still, you'll have problems with the rest of the function. I would suggest using preg_replace_callback instead; with the /u modifier the pattern is matched per UTF-8 character rather than per byte:

function transliterator($string) {
    return preg_replace_callback('/ия\b|./uis', function ($c) {
        $converter = array(
        'а' => 'a',   'б' => 'b',   'в' => 'v',
        'г' => 'g',   'д' => 'd',   'е' => 'e',
        'ж' => 'zh',  'з' => 'z',   'и' => 'i',
        'й' => 'y',   'к' => 'k',   'л' => 'l',
        'м' => 'm',   'н' => 'n',   'о' => 'o',
        'п' => 'p',   'р' => 'r',   'с' => 's',
        'т' => 't',   'у' => 'u',   'ф' => 'f',
        'х' => 'h',   'ц' => 'ts',  'ч' => 'ch',
        'ш' => 'sh',  'щ' => 'sht', 'ь' => '',
        'ъ' => 'a',   'ю' => 'yu',  'я' => 'ya',

        'А' => 'A',   'Б' => 'B',   'В' => 'V',
        'Г' => 'G',   'Д' => 'D',   'Е' => 'E',
        'Ж' => 'Zh',  'З' => 'Z',   'И' => 'I',   
        'Й' => 'Y',   'К' => 'K',   'Л' => 'L',   
        'М' => 'M',   'Н' => 'N',   'О' => 'O',   
        'П' => 'P',   'Р' => 'R',   'С' => 'S',   
        'Т' => 'T',   'У' => 'U',   'Ф' => 'F',
        'Х' => 'H',   'Ц' => 'Ts',  'Ч' => 'Ch',
        'Ш' => 'Sh',  'Щ' => 'Sht', 'Ь' => '',
        'Ъ' => 'A',   'Ю' => 'Yu',  'Я' => 'Ya',

        'ия' => 'ia', 'ИЯ' => 'IA', // add this!
        'Ия' => 'Ia', 'иЯ' => 'iA'  // mixed case, which the /i flag also matches
        );  
        $c = reset($c); // we just need the first element of that array
        return isset($converter[$c]) ? $converter[$c] : $c;
    }, $string);
}
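To see the effect on a few words, here is a cut-down sketch of the same technique. The function name transliterate_demo and its abbreviated table (only the letters the sample words need) are assumptions for illustration; it also assumes PHP 7+ for the ?? operator and that \b is Unicode-aware under /u, which is how PHP's PCRE build normally behaves.

```php
<?php
// Illustrative only: a shortened table covering just the sample words below.
function transliterate_demo(string $text): string
{
    $map = [
        'м' => 'm', 'и' => 'i', 'с' => 's', 'т' => 't', 'е' => 'e',
        'р' => 'r', 'я' => 'ya', 'б' => 'b', 'ъ' => 'a', 'л' => 'l',
        'к' => 'k', 'а' => 'a',
        'ия' => 'ia', // word-final special case
    ];

    // The alternation tries "ия" at a word boundary first, then any single character.
    return preg_replace_callback('/ия\b|./us', function (array $m) use ($map) {
        // Characters missing from the table are passed through unchanged.
        return $map[$m[0]] ?? $m[0];
    }, $text);
}

echo transliterate_demo('мистерия'), "\n";   // misteria
echo transliterate_demo('мистерията'), "\n"; // misteriyata
echo transliterate_demo('ябълка'), "\n";     // yabalka
```

Only a word-final "ия" takes the special rule; in "мистерията" the "я" is followed by more letters, so it stays "ya".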

Upvotes: 1
