Al3bed

Reputation: 142

How to Convert Arabic Characters to Unicode Using PHP

I want to know how I can convert a word into Unicode exactly like http://www.arabunic.free.fr/ does.

Does anyone know how to do that using PHP, considering that Arabic text may contain ligatures?

Thanks.

Edit

I'm not sure what that "unicode" is, but I need each Arabic character as its equivalent machine number, considering that Arabic characters have different contextual forms depending on their position. See here:

http://en.wikipedia.org/wiki/Arabic_alphabet#Table_of_basic_letters

The same character in different positions:

ب‎ | ـب‎ | ـبـ‎ | بـ‎

I think there must be a way to convert each Arabic character into its equivalent number, but how?

Edit

I still believe there's a way to convert each character to its form depending on its position.

Any ideas are appreciated.

Upvotes: 10

Views: 21613

Answers (7)

Bashar Haidar

Reputation: 11

I had a similar problem when I wanted to store an object that had values in Arabic: the Arabic text was stored as escaped Unicode. The solution was as follows.

$detailsLog = $product->only(['name', 'unit', 'quantity']);
$detailsLog = json_encode($detailsLog, JSON_UNESCAPED_UNICODE);
$log->details = $detailsLog;
$log->save();

When you pass JSON_UNESCAPED_UNICODE as the second parameter of json_encode, the Arabic words are returned without being escaped.
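To make the difference concrete, here is a minimal sketch that encodes the same value with and without the flag. The `$data` array is a made-up stand-in for the product fields above, not part of the original code.

```php
<?php
// Hypothetical sample data standing in for the product fields.
$data = ['name' => 'بهروز'];

// Default behaviour: non-ASCII characters become \uXXXX escapes.
$escaped = json_encode($data);

// With the flag: the Arabic text is kept as readable UTF-8.
$readable = json_encode($data, JSON_UNESCAPED_UNICODE);

echo $escaped, "\n";  // {"name":"\u0628\u0647\u0631\u0648\u0632"}
echo $readable, "\n"; // {"name":"بهروز"}
```

Both strings decode back to the same value with json_decode, so the flag only changes how the text looks in storage.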

Upvotes: 1

Ostoura

Reputation: 2592

I totally agree with FloatBird about using Arabic.php, which you will find at ar-php, as he said. The thing is, they changed the class name from Arabic to I18N_Arabic after version 4, so for the code to work with Arabic.php version 4.0 you need to change it to:

<?php
include('Arabic.php');
$Arabic = new I18N_Arabic('ArGlyphs');

$text = 'بسم الله الرحمن الرحيم';
$text = $Arabic->utf8Glyphs($text);
echo $text;
?>

Also note that you need to put the PHP file inside the I18N folder.

Anyway, it works fantastically. Thanks again, FloatBird.

Upvotes: 0

Jamil Hneini

Reputation: 553

I think you could try:

<meta charset="utf-8" />

If this does not work, use FloatBird's answer.

Upvotes: 0

FloatBird

Reputation: 216

All you need is a function called utf8Glyphs, which you can find in ArGlyphs.class.php. Download it from ar-php, and visit Ar-PHP for more information about the project and its classes.

This reverses the word and replaces its characters with their matching glyphs (contextual forms).

Example of usage:

    <?php
    include('Arabic.php');
    $Arabic = new Arabic('ArGlyphs');

    $text = 'بسم الله الرحمن الرحيم';
    $text = $Arabic->utf8Glyphs($text);
    echo $text;
    ?>

Upvotes: 15

everplays

Reputation: 153

I assume you want to convert بهروز to \u0628\u0647\u0631\u0648\u0632. Take a look at http://hsivonen.iki.fi/php-utf8/. After calling utf8ToUnicode('بهروز'), all you have to do is convert the integers in the returned array to hex, make sure they have 4 digits, and prefix them with \u. You can also get the same result using json_encode:

json_encode('بهروز') // returns "\u0628\u0647\u0631\u0648\u0632"
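The manual route described above (code points, then 4-digit hex, then a \u prefix) can be sketched without the external library. This is only an illustration: `toUnicodeEscapes` is a hypothetical helper name, it assumes the iconv extension is available, and it only handles characters in the Basic Multilingual Plane, which covers Arabic.

```php
<?php
// Hypothetical helper (not part of any library): UTF-8 string -> "\uXXXX" escapes.
// Assumes the iconv extension and characters inside the Basic Multilingual Plane.
function toUnicodeEscapes(string $text): string
{
    // UTF-32BE stores each code point as one big-endian 32-bit integer, without a BOM.
    $codePoints = unpack('N*', iconv('UTF-8', 'UTF-32BE', $text));

    $out = '';
    foreach ($codePoints as $cp) {
        // Pad to 4 hex digits and prefix with \u.
        $out .= sprintf('\\u%04x', $cp);
    }
    return $out;
}

echo toUnicodeEscapes('بهروز'), "\n"; // \u0628\u0647\u0631\u0648\u0632
```

Unlike json_encode, this returns the escapes without surrounding quotes.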

EDIT:

It seems you want the character codes of بب, where the first character differs from the second. All you have to do is apply the bidi algorithm to your text using fribidi_log2vis, then get the character codes in one of the ways I described above.

Here's an example:

$string = 'بب'; // \u0628\u0628
$bidiString = fribidi_log2vis($string, FRIBIDI_LTR, FRIBIDI_CHARSET_UTF8);
json_encode($bidiString); // \ufe90\ufe91

EDIT:

I just remembered that TCPDF has a bidi algorithm implemented in pure PHP, so if you cannot get the fribidi PHP extension to work, you can use TCPDF (utf8Bidi is protected by default, so you need to make it public):

require_once('utf8.inc'); // http://hsivonen.iki.fi/php-utf8/
require_once('tcpdf.php'); // http://www.tcpdf.org/
$t = new TCPDF();
$text = 'بب';
$t->utf8Bidi(utf8ToUnicode($text)); // will return an array like array(0 => 65168, 1 => 65169)
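To illustrate what the shaping step does with بب, here is a minimal sketch in plain PHP. It is not how fribidi or TCPDF implement it: `shapeCodePoints` is a hypothetical function, its table covers only four letters, it ignores diacritics and ligatures, and it assumes the iconv extension is available.

```php
<?php
// Illustrative sketch only: covers four letters, ignores diacritics and ligatures.
function shapeCodePoints(string $text): array
{
    // Base code point => presentation forms [isolated, final, initial, medial].
    // Alef joins only to the preceding letter, so it has just two forms.
    static $forms = [
        0x0627 => [0xFE8D, 0xFE8E],                 // alef
        0x0628 => [0xFE8F, 0xFE90, 0xFE91, 0xFE92], // beh
        0x0633 => [0xFEB1, 0xFEB2, 0xFEB3, 0xFEB4], // seen
        0x0645 => [0xFEE1, 0xFEE2, 0xFEE3, 0xFEE4], // meem
    ];

    // Decode UTF-8 into a 0-indexed list of code points in logical (typing) order.
    $cps = array_values(unpack('N*', iconv('UTF-8', 'UTF-32BE', $text)));
    $shaped = [];
    foreach ($cps as $i => $cp) {
        if (!isset($forms[$cp])) {
            $shaped[] = $cp; // unknown character: pass through unchanged
            continue;
        }
        $prev = $cps[$i - 1] ?? null;
        $next = $cps[$i + 1] ?? null;
        // A letter connects forward only if it has all four forms.
        $joinsPrev = $prev !== null && isset($forms[$prev]) && count($forms[$prev]) === 4;
        $joinsNext = $next !== null && isset($forms[$next]) && count($forms[$cp]) === 4;

        if ($joinsPrev && $joinsNext) {
            $shaped[] = $forms[$cp][3]; // medial
        } elseif ($joinsNext) {
            $shaped[] = $forms[$cp][2]; // initial
        } elseif ($joinsPrev) {
            $shaped[] = $forms[$cp][1]; // final
        } else {
            $shaped[] = $forms[$cp][0]; // isolated
        }
    }
    return $shaped;
}

// Logical order: initial beh (0xFE91), then final beh (0xFE90).
print_r(array_map('dechex', shapeCodePoints('بب')));
```

The result is in logical order. Reversing it gives the visual order that the fribidi and TCPDF examples above return for the same input (0xFE90, 0xFE91).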

Upvotes: 4

yPhil

Reputation: 8387

Just set the element containing the Arabic text to "rtl" (right to left), then input correctly spelled Arabic, and the text will flow with all the ligatures you are looking for.

div { direction:rtl; }

On a side note, don't forget to read "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)"

Think about it: the Arabic letter "ba" (ب) is a "ba" no matter where it appears in the sentence.

Upvotes: 3

Otto

Reputation: 4200

Try this:

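<?php
    $string = 'a';
    // UTF-32LE has a fixed byte order and no BOM, so each code point is one 32-bit value
    $expanded = iconv('UTF-8', 'UTF-32LE', $string);
    // 'V*' unpacks unsigned 32-bit little-endian integers on any platform
    $arr = unpack('V*', $expanded);
    print_r($arr); // code points, indexed from 1
?>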

Upvotes: 1
