barnamah

Reputation: 59

Arabic text shows strange characters الÙباى انگليسى ØŒ

I have Arabic text in a plain-text .sql file. When I open the file in any viewer, it shows up like this:

حر٠اول الÙباى انگليسى ØŒ حر٠اضاÙÙ‡ مثبت

But when I use an HTML document with <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>, it shows properly like this:

حرف اول الفباى انگليسى ، حرف اضافه مثبت

How can I convert it to readable text?

Upvotes: 4

Views: 10836

Answers (2)

ghazi alyasin

Reputation: 471

I have written this function to help you:

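function convertToArabic($input) {
    // Maps each two-character mojibake pair (UTF-8 bytes misread as
    // Windows-1252) back to the Arabic character it came from.
    // Pairs whose second byte is invisible or undefined in Windows-1252
    // are written with \u{...} escapes (PHP 7+); this assumes the viewer
    // passed those bytes through as U+00AD, U+0081, and U+0090.
    $arabicMap = [
        'Ø¨' => 'ب',
        'Øª' => 'ت',
        'Ø«' => 'ث',
        'Ø¬' => 'ج',
        "Ø\u{00AD}" => 'ح',
        'Ø®' => 'خ',
        'Ø¯' => 'د',
        'Ø°' => 'ذ',
        'Ø±' => 'ر',
        'Ù€' => 'ـ',
        'Ø²' => 'ز',
        'Ø³' => 'س',
        'Ø´' => 'ش',
        'Øµ' => 'ص',
        'Ø¶' => 'ض',
        'Ø·' => 'ط',
        'Ø¸' => 'ظ',
        'Ø¹' => 'ع',
        'Øº' => 'غ',
        "Ù\u{0081}" => 'ف',
        'Ù‚' => 'ق',
        'Ùƒ' => 'ك',
        'Ù„' => 'ل',
        'Ù…' => 'م',
        'Ù†' => 'ن',
        'Ù‡' => 'ه',
        'Ùˆ' => 'و',
        'ÙŠ' => 'ي',
        'Ø§' => 'ا',
        'Ø¥' => 'إ',
        'Ø¦' => 'ئ',
        'Ø£' => 'أ',
        'Ø¢' => 'آ',
        'ÙŽ' => 'َ',
        'Ø¡' => 'ء',
        'Ù‰' => 'ى',
        "Ù\u{0090}" => 'ِ',
        'Ø©' => 'ة',
        'Ø¤' => 'ؤ',
    ];

    // strtr() tries the longest keys first and never rescans its own output.
    return strtr($input, $arabicMap);
}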
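
A lookup table has to list every character by hand. If the garbling really is UTF-8 bytes misread as Windows-1252 (an assumption, based on characters such as ‡ and ƒ in the sample), the original bytes can be recovered generically and decoded again. Here is a minimal sketch of that idea, in Python rather than PHP; the function names `to_original_bytes` and `repair_mojibake` are made up for illustration.

```python
def to_original_bytes(garbled: str) -> bytes:
    """Turn each mojibake character back into the single byte it came from."""
    out = bytearray()
    for ch in garbled:
        try:
            out += ch.encode("cp1252")
        except UnicodeEncodeError:
            # Bytes 0x81, 0x8D, 0x8F, 0x90 and 0x9D have no Windows-1252
            # character. Assumption: the viewer passed them through as
            # control characters with the same code point.
            if ord(ch) < 256:
                out.append(ord(ch))
            else:
                raise  # not mojibake of this kind, e.g. already Arabic
    return bytes(out)


def repair_mojibake(garbled: str) -> str:
    """Re-decode the recovered bytes as UTF-8."""
    return to_original_bytes(garbled).decode("utf-8")


print(repair_mojibake("Ø§ÙˆÙ„"))  # → اول
```

This only helps when the file has actually been re-saved in garbled form. If the bytes on disk are still valid UTF-8, nothing needs converting and the viewer just has to be told the encoding.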

Upvotes: 2

Remy Lebeau

Reputation: 597896

The Arabic text has been encoded to bytes using UTF-8.

You are explicitly telling the HTML document that the bytes are encoded in UTF-8, which is why any HTML viewer will be able to display the text correctly.

However, any other text viewer will not know the bytes are encoded in UTF-8 unless the text starts with a UTF-8 BOM and the viewer supports BOMs. Without that hint, as you are seeing, a viewer may interpret the bytes as Latin-1 or a similar single-byte encoding. In that case, you have to tell the viewer manually to interpret the bytes as UTF-8. How you do that depends on the particular viewer, and not all viewers offer the option.
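
To make the mechanism concrete, here is a small Python sketch (Python is my choice for illustration; the question does not name a language). It encodes the sample sentence as UTF-8, decodes the same bytes as Latin-1 to mimic a viewer guessing wrong, and then prefixes a BOM the way a BOM-aware viewer would expect. The variable names are illustrative only.

```python
import codecs

text = "حرف اول الفباى انگليسى ، حرف اضافه مثبت"

# The bytes on disk: each Arabic letter becomes two bytes in UTF-8.
raw = text.encode("utf-8")

# A viewer that assumes Latin-1 turns every byte into one character,
# so each Arabic letter shows up as two unrelated Latin characters.
wrong_view = raw.decode("latin-1")

# A viewer that is told the bytes are UTF-8 shows the text properly.
right_view = raw.decode("utf-8")

# Prefixing the UTF-8 BOM (bytes EF BB BF) gives BOM-aware viewers a hint.
with_bom = codecs.BOM_UTF8 + raw

# Python's "utf-8-sig" codec strips the BOM while decoding.
bom_view = with_bom.decode("utf-8-sig")
```

The bytes themselves never change between `wrong_view` and `right_view`; only the interpretation does, which is why the file needs no conversion as long as the viewer picks UTF-8.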

Upvotes: 4
