Ivan

Reputation: 876

Convert text to and from Serbian Cyrillic letters

How do I add the Serbian Cyrillic alphabet to my HTML so that my browser recognizes it?

I need to, for example, make “Povrce” into “Поврће”.

I just need code so that when I type “Поврће” or “Povrće”, the browser can display it.

Upvotes: 5

Views: 4657

Answers (4)

vvnikola1

Reputation: 1

I've added some extra handling for specific cases. I hope it helps.

 transcribeCyrillic(p_value: any) {
        if (!this.getSelectedScript()) return p_value;

        var m_to_fix = new Array(
            'a',
            'b',
            'v',
            'g',
            'd',
            'đ',
            'e',
            'ž',
            'z',
            'i',
            'j',
            'k',
            'l',
            'lj',
            'm',
            'n',
            'nj',
            'o',
            'p',
            'r',
            's',
            't',
            'ć',
            'u',
            'f',
            'h',
            'c',
            'č',
            'dž',
            'š',
            'A',
            'B',
            'V',
            'G',
            'D',
            'Đ',
            'E',
            'Ž',
            'Z',
            'I',
            'J',
            'K',
            'L',
            'LJ',
            'M',
            'N',
            'NJ',
            'O',
            'P',
            'R',
            'S',
            'T',
            'Ć',
            'U',
            'F',
            'H',
            'C',
            'Č',
            'DŽ',
            'Š',
            ' ',
            '.',
            ',',
            '*',
            ':',
            ';',
            '1',
            '2',
            '3',
            '4',
            '5',
            '6',
            '7',
            '8',
            '9',
            '0',
            '',
            '!',
            '?',
            '(',
            ')',
            '/',
            '+',
            '-'
          );

    var m_fixed = new Array(
      'а',
      'б',
      'в',
      'г',
      'д',
      'ђ',
      'е',
      'ж',
      'з',
      'и',
      'ј',
      'к',
      'л',
      'љ',
      'м',
      'н',
      'њ',
      'о',
      'п',
      'р',
      'с',
      'т',
      'ћ',
      'у',
      'ф',
      'х',
      'ц',
      'ч',
      'џ',
      'ш',
      'А',
      'Б',
      'В',
      'Г',
      'Д',
      'Ђ',
      'Е',
      'Ж',
      'З',
      'И',
      'Ј',
      'К',
      'Л',
      'Љ',
      'М',
      'Н',
      'Њ',
      'О',
      'П',
      'Р',
      'С',
      'Т',
      'Ћ',
      'У',
      'Ф',
      'Х',
      'Ц',
      'Ч',
      'Џ',
      'Ш',
      ' ',
      '.',
      ',',
      '*',
      ':',
      ';',
      '1',
      '2',
      '3',
      '4',
      '5',
      '6',
      '7',
      '8',
      '9',
      '0',
      '',
      '!',
      '?',
      '(',
      ')',
      '/',
      '+',
      '-'
    );

    let dObj: any = {};
    let m_output = '';
    m_to_fix.forEach((m, i) => {
      if (m.length == 2) dObj[m] = m_fixed[i]; //Assign Replacements for double letters in the object
    });

    let doubles = ['l', 'n', 'd', 'L', 'N', 'D']; //Array of 1st letter of all double letters

    for (var i = 0; i < p_value?.length; i += 1) {
      let char = p_value[i];

      if (doubles.includes(char)) {
        //Check if char exist in the doubles array
        if (p_value[i + 1] !== undefined) {
          //only if the next value exists (not undefined)!
          let char2 = p_value[i + 1]; //Get the immediate next character.

          if (char + char2 in dObj) {
            //Check if char+char2 exists in the object
            m_output += dObj[char + char2]; //If it is, add the replacement to the output
          } else {
            if (char2 + p_value[i + 2] in dObj) {
              //Check if char2+char2+1 exists in the object (example 'dnj' for poslednja)
              m_output += m_fixed[m_to_fix.indexOf(char)];
              m_output += dObj[char2 + p_value[i + 2]]; //If it is, add the replacement to the output
            } else {
              //edge case for 'њј','љј','џж','ЊЈ','ЉЈ','ЏЖ'
              const criticalChars = ['њј', 'љј', 'џж', 'ЊЈ', 'ЉЈ', 'ЏЖ'];
              criticalChars.some((element) => {
                if (m_output.includes(element)) {
                  m_output = m_output.replace(element, element.slice(0, -1));
                }
              });

              let ind = m_to_fix.indexOf(char); //Else add replacement of each char individually
              m_output += m_fixed[ind];

              ind = m_to_fix.indexOf(char2);
              m_output += m_fixed[ind];
            }
          }
          i += 1; //Manually increase index by 1, since we also checked the next char (char2), to avoid repetition
        } else {
          //if the next value is undefined then do as usual
          let ind = m_to_fix.indexOf(char);
          m_output += m_fixed[ind];
        }
      } else {
        let ind = m_to_fix.indexOf(char); //Else if char doesn't exist in doubles array, get the index of that char from m_to_fix array
        m_output += ind >= 0 ? m_fixed[ind] : char; //add the respective replacement from the m_fixed array, or keep the char as-is if it has no mapping
      }
    }

    return m_output;
 }

Upvotes: 0

drzaus

Reputation: 25024

A complete transliteration mapping taken from the Wikipedia list, including upper- and lowercase, just because no one else listed it out. The mapping below goes Cyrillic->Latin; flip it for the other direction.

const langmap = {
    "А": "A",
    "Б": "B",
    "В": "V",
    "Г": "G",
    "Д": "D",
    "Ђ": "Đ",
    "Е": "E",
    "Ж": "Ž",
    "З": "Z",
    "И": "I",
    "Ј": "J",
    "К": "K",
    "Л": "L",
    "Љ": "Lj",
    "М": "M",
    "Н": "N",
    "Њ": "Nj",
    "О": "O",
    "П": "P",
    "Р": "R",
    "С": "S",
    "Т": "T",
    "Ћ": "Ć",
    "У": "U",
    "Ф": "F",
    "Х": "H",
    "Ц": "C",
    "Ч": "Č",
    "Џ": "Dž",
    "Ш": "Š",
    "а": "a",
    "б": "b",
    "в": "v",
    "г": "g",
    "д": "d",
    "ђ": "đ",
    "е": "e",
    "ж": "ž",
    "з": "z",
    "и": "i",
    "ј": "j",
    "к": "k",
    "л": "l",
    "љ": "lj",
    "м": "m",
    "н": "n",
    "њ": "nj",
    "о": "o",
    "п": "p",
    "р": "r",
    "с": "s",
    "т": "t",
    "ћ": "ć",
    "у": "u",
    "ф": "f",
    "х": "h",
    "ц": "c",
    "ч": "č",
    "џ": "dž",
    "ш": "š",
}

function remapLang (str) {
    return str.replace(/[^\u0000-\u007E]/g, function(a){ 
        return langmap[a] || a; 
    });
}

Then eyeball testing:

var tests = [
  "First name: ГЕОРГИ, Last name: КОСТАДИНОВ.",
  // --> First name: GEORGI, Last name: KOSTADINOV.
  "First name: Димитър, Last name: Стоев."
  // --> First name: Dimitъr, Last name: Stoev
];
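// Wrap console.log so forEach's extra (index, array) arguments are not printed too.
tests.map(remapLang).forEach(function (s) { console.log(s); });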

I would note that the above tests are real-world examples, so the wiki seems to be missing an equivalent for the "deprecated" (?) 'hard sign' ъ that I guess people still use? YMMV...
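
For the reverse direction, here is a minimal sketch of flipping the mapping. It assumes only a short excerpt of the table (a full version would invert the whole langmap), and the names cyrToLat, latToCyr and toCyrillicText are made up for illustration. The one subtlety is that digraphs such as "nj" must be tried before the single letters, so the pattern lists the longest keys first:

```javascript
// Assumption: a short excerpt of a Cyrillic -> Latin table, not the full alphabet.
const cyrToLat = {
  "П": "P", "о": "o", "в": "v", "р": "r", "ћ": "ć", "е": "e",
  "њ": "nj", "љ": "lj", "н": "n", "л": "l", "ј": "j",
  "д": "d", "и": "i", "а": "a",
};

// Flip keys and values to get Latin -> Cyrillic.
const latToCyr = Object.fromEntries(
  Object.entries(cyrToLat).map(([cyr, lat]) => [lat, cyr])
);

// Build an alternation with the longest keys first, so "nj" is tried before "n".
const latPattern = new RegExp(
  Object.keys(latToCyr)
    .sort((a, b) => b.length - a.length)
    .join("|"),
  "g"
);

// Hypothetical helper: characters without a mapping are left untouched.
function toCyrillicText(str) {
  return str.replace(latPattern, (match) => latToCyr[match]);
}

console.log(toCyrillicText("Povrće")); // Поврће
console.log(toCyrillicText("dinja"));  // диња
```

Keep in mind this is purely mechanical: a word like "injekcija", where n and j are separate letters, would still come out with њ.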

Upvotes: 2

nelek

Reputation: 4312

I made this solution. It's a little simple, but maybe it can help you:

var pp='VOĆE POVRĆE DINJA';
var ss=["NJ","V","O","Ć","E","P","R","D","I","A"];
var cyr=["Њ","В","О","Ћ","Е","П","Р","Д","И","А"];
for(var i=0;i<ss.length;i++) {
    var tt=cyr[i];
    pp=pp.replace(new RegExp(ss[i], "g"),tt);
}

There is a jsfiddle example, too.

The order of the characters in ss and cyr is important, so put digraphs like lj and nj first.
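
To see why the order matters, here is a small sketch that runs the same word through two orderings. The helper replaceInOrder and both pair lists are hypothetical, and only the letters needed for "DINJA" are included:

```javascript
// Hypothetical helper: applies [latin, cyrillic] pairs strictly in the order given.
function replaceInOrder(text, pairs) {
  return pairs.reduce(
    (out, [lat, cyr]) => out.split(lat).join(cyr),
    text
  );
}

// Digraph first: "NJ" is consumed before "N" and "J" get a chance.
const digraphFirst = [
  ["NJ", "Њ"], ["N", "Н"], ["J", "Ј"], ["D", "Д"], ["I", "И"], ["A", "А"],
];

// Singles first: "N" and "J" are replaced separately, so "NJ" never matches.
const singlesFirst = [
  ["N", "Н"], ["J", "Ј"], ["NJ", "Њ"], ["D", "Д"], ["I", "И"], ["A", "А"],
];

console.log(replaceInOrder("DINJA", digraphFirst)); // ДИЊА
console.log(replaceInOrder("DINJA", singlesFirst)); // ДИНЈА
```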

Update: using a textbox, the phrase is converted once the field loses focus. Of course, you have to put all the characters in the arrays.

function chChar(ele) {
    var pp=ele.value;
    var ss=["NJ","V","O","Ć","E","P","R","D","I","A"];
    var cyr=["Њ","В","О","Ћ","Е","П","Р","Д","И","А"];
    for(var i=0;i<ss.length;i++) {
        var tt=cyr[i];
        pp=pp.replace(new RegExp(ss[i], "gi"),tt);
    }
    document.getElementById('cyr').innerHTML=pp;
}
<input type="text" onblur="chChar(this);" /><br>
<div id="cyr"></div>

Upvotes: 1

dakab

Reputation: 5875

What you mean is transliterating Latin to Serbian Cyrillic (or vice versa). That’s no problem, since transliteration is a reversible, character-by-character conversion (whereas transcription is phonetic). Just set up an “associative” object with the alphabet, and then map() it accordingly. Here’s a proof of concept:

var latinString = 'Povrce';
var latinToSerbian = { "P":"П", "o":"о", "v":"в", "r":"р", "c":"ћ", "e":"е" /* ... */ };
var serbianString = latinString.split('').map(function(character){
    return latinToSerbian[character] || character; // keep characters that have no mapping
}).join('');
console.log( latinString + ' = ' + serbianString ); // Povrce = Поврће

For HTML, of course, there are always entities to resort to. Taking a look at the Cyrillic Unicode block, you can easily translate characters to their decimal or hexadecimal code points:

element.innerHTML = '&#1055;&#1086;&#1074;&#1088;&#1115;&#1077;';
element.onclick = function(){ alert('\u041F\u043E\u0432\u0440\u045B\u0435'); };
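
If you'd rather not look the code points up by hand, something like the following sketch could generate the decimal entities for you. The helper name toHtmlEntities is made up, and it only encodes non-ASCII characters (it does not escape &, < or >):

```javascript
// Hypothetical helper: turns every non-ASCII character into a decimal HTML entity.
function toHtmlEntities(text) {
  return Array.from(text, (ch) => {
    const codePoint = ch.codePointAt(0);
    return codePoint > 0x7e ? "&#" + codePoint + ";" : ch;
  }).join("");
}

console.log(toHtmlEntities("Поврће"));
// &#1055;&#1086;&#1074;&#1088;&#1115;&#1077;
```

When the page is served as UTF-8 (for example with <meta charset="utf-8"> in the head), the Cyrillic text can also be written directly and the entities become optional.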

If you want on-the-fly transliteration while typing on a website, use charCodeAt(), an <input> element for the typed text, and something along the lines of:

var latinToCyrillic = { "80": 1055 /* entire alphabet */ };
var cyrillicToLatin = { "1115" : 263 /* entire alphabet */ };
var toCyrillic = function(character){
    return String.fromCharCode(latinToCyrillic[character.charCodeAt(0)]);
};
var toLatin = function(character){
    return String.fromCharCode(cyrillicToLatin[character.charCodeAt(0)]);
};
console.log(
    toCyrillic('P'), // === "П"
    toLatin('ћ')     // === "ć"
);

Upvotes: 2
