Reputation: 59
I have Arabic text in a plain-text .sql file. When I open it in any document viewer, it shows up like this:
Øر٠اول الÙباى انگليسى ØŒ Øر٠اضاÙÙ‡ مثبت
But when I put it in an HTML document with <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>, it displays properly, like this:
حرف اول الفباى انگليسى ، حرف اضافه مثبت
How can I convert it to readable text?
Upvotes: 4
Views: 10836
Reputation: 471
I have written this function to help you:
function convertToArabic($input) {
$arabicMap = [
'ب' => 'ب',
'ت' => 'ت',
'Ø«' => 'ث',
'ج' => 'ج',
'Ø' => 'ح',
'Ø®' => 'خ',
'د' => 'د',
'Ø°' => 'ذ',
'ر' => 'ر',
'Ù€' => 'ـ', // tatweel (UTF-8 bytes D9 80), not ر
'ز' => 'ز',
'س' => 'س',
'Ø´' => 'ش',
'ص' => 'ص',
'ض' => 'ض',
'Ø·' => 'ط',
'ظ' => 'ظ',
'ع' => 'ع',
'غ' => 'غ',
'Ù' => 'ف',
'Ù‚' => 'ق',
'Ùƒ' => 'ك',
'Ù„' => 'ل',
'Ù…' => 'م',
'Ù†' => 'ن',
'Ù‡' => 'ه',
'Ùˆ' => 'و',
'ÙŠ' => 'ي',
'ا' => 'ا',
'Ø¥' => 'إ',
'ئ' => 'ئ',
'Ø£' => 'أ',
'Ø¢' => 'آ',
'ÙŽ' => 'َ', // fatha (UTF-8 bytes D9 8E), not tatweel
'Ø¡' => 'ء',
'Ù‰' => 'ى',
'Ù' => 'ِ',
'Ø©' => 'ة',
'ؤ' => 'ؤ',
];
$output = strtr($input, $arabicMap);
return $output;
}
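A hand-written table like this only covers the letters listed in it. As a rough sketch of a more general approach, assuming the garbling came from UTF-8 bytes being read as Windows-1252 (characters such as Œ and ‡ in the sample suggest that), the table can be generated for every byte value instead. The name fixMojibake is hypothetical, and the handling of the five bytes that Windows-1252 leaves undefined is an assumption that may not match how your text was damaged:

```php
<?php
// Hypothetical helper: undo "UTF-8 bytes read as Windows-1252" mojibake.
function fixMojibake(string $garbled): string
{
    // Characters that Windows-1252 assigns to bytes 0x80..0x9F.
    // ASSUMPTION: the five undefined bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D)
    // survived as the C1 control character with the same value.
    $high = [
        0x80 => "\u{20AC}", 0x81 => "\u{0081}", 0x82 => "\u{201A}", 0x83 => "\u{0192}",
        0x84 => "\u{201E}", 0x85 => "\u{2026}", 0x86 => "\u{2020}", 0x87 => "\u{2021}",
        0x88 => "\u{02C6}", 0x89 => "\u{2030}", 0x8A => "\u{0160}", 0x8B => "\u{2039}",
        0x8C => "\u{0152}", 0x8D => "\u{008D}", 0x8E => "\u{017D}", 0x8F => "\u{008F}",
        0x90 => "\u{0090}", 0x91 => "\u{2018}", 0x92 => "\u{2019}", 0x93 => "\u{201C}",
        0x94 => "\u{201D}", 0x95 => "\u{2022}", 0x96 => "\u{2013}", 0x97 => "\u{2014}",
        0x98 => "\u{02DC}", 0x99 => "\u{2122}", 0x9A => "\u{0161}", 0x9B => "\u{203A}",
        0x9C => "\u{0153}", 0x9D => "\u{009D}", 0x9E => "\u{017E}", 0x9F => "\u{0178}",
    ];

    // Map each garbled character (as a UTF-8 string) back to its single byte.
    $map = [];
    foreach ($high as $byte => $char) {
        $map[$char] = chr($byte);
    }
    // Bytes 0xA0..0xFF correspond to U+00A0..U+00FF, which are
    // two-byte sequences in UTF-8.
    for ($byte = 0xA0; $byte <= 0xFF; $byte++) {
        $map[chr(0xC0 | ($byte >> 6)) . chr(0x80 | ($byte & 0x3F))] = chr($byte);
    }

    // ASCII and anything not in the map passes through unchanged.
    return strtr($garbled, $map);
}

// "Ù‚" is U+00D9 U+201A, which stands for the bytes D9 82, i.e. ق.
echo fixMojibake("\u{00D9}\u{201A}"), "\n";
```

If the damaged text has lost those undefined bytes entirely (for example the second byte of ف, which is 0x81), no conversion can restore them. In that case the reliable fix is to go back to the original file and read it as UTF-8.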
Upvotes: 2
Reputation: 597896
The Arabic text has been encoded to bytes using UTF-8.
You are explicitly telling the HTML document that the bytes are encoded in UTF-8, which is why any HTML viewer will be able to display the text correctly.
However, any other text viewer will not know that the bytes are UTF-8 unless the text starts with a UTF-8 BOM and the viewer supports BOMs. Otherwise, as you are seeing, the viewer may interpret the bytes as Latin-1, Windows-1252, or a similar single-byte encoding, which turns the two bytes of each Arabic letter into two unrelated characters. You would then have to tell the viewer manually to interpret the bytes as UTF-8. How you do that depends on the particular viewer, and not all viewers offer the option.
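As an illustration of the BOM option, here is a minimal sketch that prepends the three BOM bytes to a file's contents when they are missing. The function name withUtf8Bom and the file names are hypothetical, and it assumes the file really is UTF-8 already. Some tools that consume .sql files may not expect a BOM, so it is safer to write the result to a copy:

```php
<?php
// Hypothetical helper: make sure a UTF-8 byte string starts with a BOM.
function withUtf8Bom(string $bytes): string
{
    $bom = "\xEF\xBB\xBF"; // UTF-8 encoding of U+FEFF

    // Leave the data alone if the BOM is already there.
    return strncmp($bytes, $bom, 3) === 0 ? $bytes : $bom . $bytes;
}

// Usage (file names are placeholders):
// file_put_contents('dump-with-bom.sql', withUtf8Bom(file_get_contents('dump.sql')));
```

This does not change the Arabic bytes at all. It only gives BOM-aware viewers a hint about how to decode them.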
Upvotes: 4