Salman Khimani

Reputation: 515

UTF-8 to UTF-16BE

I save the record "فحص الرسالة العربية" from PHP, but it is always stored as:

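&#1601;&#1581;&#1589; &#1575;&#1604;&#1585;&#1587;&#1575;&#1604;&#1577; &#1575;&#1604;&#1593;&#1585;&#1576;&#1610;&#1577;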

When I retrieve it, I want to convert it to UTF-16BE hex, so I am using a function, which returns:

002600230031003600300031003b002600230031003500380031003b002600230031003500380039003b0020002600230031003500370035003b002600230031003600300034003b002600230031003500380035003b002600230031003500380037003b002600230031003500370035003b002600230031003600300034003b002600230031003500370037003b0020002600230031003500370035003b002600230031003600300034003b002600230031003500390033003b002600230031003500380035003b002600230031003500370036003b002600230031003600310030003b002600230031003500370037003b

This is the function I am using to convert the string retrieved from the database:

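function convertCharsn($string) {
    $in = '';
    // Transcode the UTF-8 input into raw UTF-16BE bytes.
    $out = iconv('UTF-8', 'UTF-16BE', $string);
    // Render every byte as two uppercase hex digits.
    for ($i = 0; $i < strlen($out); $i++) {
      $in .= sprintf("%02X", ord($out[$i]));
    }
    return $in;
}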

But when I type the same text into the converter at the URL below, it produces different output from my function: http://www.routesms.com/downloads/onlineunicode.asp

It returns:

0641062D063500200627064406310633062706440629002006270644063906310628064A0629

I want my string converted the same way as on the page above. My database collation is utf8_general_ci.

Upvotes: 1

Views: 1107

Answers (1)

ircmaxell

Reputation: 165201

Basically, you need to decode those characters out of HTML entities first. Just use html_entity_decode()

$rawChars = html_entity_decode($string, ENT_QUOTES | ENT_HTML401, 'UTF-8');

convertCharsn($rawChars);

Otherwise, you're just encoding the entities themselves. You can see this because & is 0026 in UTF-16BE and # is 0023, which is why the sequence 00260023 keeps repeating in the output you posted. Decode first, and you should be set.
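To make the difference concrete, here is a minimal sketch that runs both paths on the first three characters of the string. The helper name utf8ToUtf16BeHex() is hypothetical (a shorter equivalent of convertCharsn() from the question), and the $stored value assumes the database holds numeric HTML entities, as the 00260023 pattern suggests.

```php
<?php
// Hypothetical helper: same idea as convertCharsn() from the question.
function utf8ToUtf16BeHex(string $utf8): string
{
    // iconv() yields raw UTF-16BE bytes; bin2hex() renders them as hex digits.
    return strtoupper(bin2hex(iconv('UTF-8', 'UTF-16BE', $utf8)));
}

// Assumption: the stored value is a run of numeric HTML entities.
$stored = '&#1601;&#1581;&#1589;';

// Without decoding, the entity text itself gets transcoded,
// so the output starts with 0026 ("&") and 0023 ("#").
echo utf8ToUtf16BeHex($stored), "\n";

// Decoding first turns each entity back into the real character.
$decoded = html_entity_decode($stored, ENT_QUOTES | ENT_HTML401, 'UTF-8');
echo utf8ToUtf16BeHex($decoded), "\n"; // 0641062D0635
```

The second line of output matches the start of the string produced by the online converter, which suggests the entity encoding is the only difference between the two results.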

Upvotes: 2
