Vinaya Maheshwari

Reputation: 575

How to convert unicode string to character

I want to convert Unicode escape sequences to characters.

I tried the approach from https://stackoverflow.com/a/7278961/4073217, but it is not working for me.

For example:

$string = '%u0391%u03b8%u03b1%u03bd%u03b1%u03c3%u03af%u03bf%u03c5';
$string = preg_replace('/%u([0-9A-F]+)/', '&#x$1;', $string);
echo html_entity_decode($string, ENT_COMPAT, 'UTF-8');

The output should be Αθανασίου, but the code above returns Αb8b1bdb1c3afbfc5.

Am I doing anything wrong? How do I get the correct characters from these Unicode escapes in PHP?

Upvotes: 0

Views: 1256

Answers (2)

Anushil Nandan

Reputation: 294

<?php
header('Content-type: text/html; charset=utf-8');

$string = '%u0391%u03b8%u03b1%u03bd%u03b1%u03c3%u03af%u03bf%u03c5';
$string = preg_replace('/%u([0-9a-f]+)/', '&#x$1;', $string);

echo html_entity_decode($string, ENT_COMPAT, 'UTF-8');

$arr = [
'to_email' => '[email protected]',
'from_email' => '[email protected]',
'subject' => 'utf',
'message' => $string
];

mail_send($arr);

function mail_send($arr)
{
    if (!isset($arr['to_email'], $arr['from_email'], $arr['subject'], $arr['message'])) {
        throw new InvalidArgumentException('mail_send(): not all parameters provided.');
    }

    $to            = empty($arr['to_name']) ? $arr['to_email'] : '"' . mb_encode_mimeheader($arr['to_name']) . '" <' . $arr['to_email'] . '>';
    $from        = empty($arr['from_name']) ? $arr['from_email'] : '"' . mb_encode_mimeheader($arr['from_name']) . '" <' . $arr['from_email'] . '>';

    $headers    = array
    (
        'MIME-Version: 1.0',
        'Content-Type: text/html; charset="UTF-8";',
        'Content-Transfer-Encoding: 7bit',
        'Date: ' . date('r', $_SERVER['REQUEST_TIME']),
        'Message-ID: <' . $_SERVER['REQUEST_TIME'] . md5($_SERVER['REQUEST_TIME']) . '@' . $_SERVER['SERVER_NAME'] . '>',
        'From: ' . $from,
        'Reply-To: ' . $from,
        'Return-Path: ' . $from,
        'X-Mailer: PHP v' . phpversion(),
        'X-Originating-IP: ' . $_SERVER['SERVER_ADDR'],
    );

    mail($to, '=?UTF-8?B?' . base64_encode($arr['subject']) . '?=', $arr['message'], implode("\r\n", $headers));
}

This prints Αθανασίου in the browser and sends Αθανασίου as the body of the email.

Upvotes: 1

Anushil Nandan

Reputation: 294

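The regex in

$string = preg_replace('/%u([0-9A-F]+)/', '&#x$1;', $string);

uses the character class [0-9A-F], which matches only uppercase hex digits. Your string uses lowercase hex digits, so each match stops at the first lowercase letter (for example, only 03 is captured from %u03b8) and the remaining digits are left in the output as plain text. Try:

$string = preg_replace('/%u([0-9a-f]+)/', '&#x$1;', $string);

instead.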
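
To see the difference concretely, here is a minimal sketch that runs both character classes against the string from the question and prints what each one captures. The `i` modifier on the last pattern is an addition that is not part of the answer above; it is one way to accept uppercase and lowercase hex digits with a single pattern.

```php
<?php
$input = '%u0391%u03b8%u03b1%u03bd%u03b1%u03c3%u03af%u03bf%u03c5';

// Uppercase-only class: each capture stops at the first lowercase hex digit.
preg_match_all('/%u([0-9A-F]+)/', $input, $upper);

// Lowercase class, as suggested in the answer: each escape is captured in full.
preg_match_all('/%u([0-9a-f]+)/', $input, $lower);

// Assumption: the /i modifier makes the class match either case.
$entities = preg_replace('/%u([0-9a-f]+)/i', '&#x$1;', $input);
$decoded  = html_entity_decode($entities, ENT_COMPAT, 'UTF-8');

echo implode(',', $upper[1]), "\n"; // captures from the uppercase-only pattern
echo implode(',', $lower[1]), "\n"; // captures from the lowercase pattern
echo $decoded, "\n";                // decoded text
```

With the uppercase-only class, only the first escape (0391, which contains no letters) is captured whole. Every later capture is cut off at 03, which is why the original output starts with Α and then shows leftover hex digits.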

Also check that the browser output is UTF-8. If it is not, you can send this header:

header('Content-type: text/html; charset=utf-8');

before echoing the output.
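
If the decoding is needed in more than one place, it could be wrapped in a small function. The following is a sketch under stated assumptions, not a definitive implementation: the name decode_percent_u is hypothetical, and matching exactly four hex digits is an assumption based on the %uXXXX form shown in the question. It uses preg_replace_callback so that only the %u escapes are decoded and any HTML entities already present in the input are left untouched.

```php
<?php
/**
 * Hypothetical helper: decode %uXXXX escapes into UTF-8 characters.
 * Assumes each escape has exactly four hex digits, in either case.
 */
function decode_percent_u(string $input): string
{
    $result = preg_replace_callback(
        '/%u([0-9a-f]{4})/i',
        static function (array $m): string {
            // Decode a single numeric entity built from the captured digits.
            return html_entity_decode('&#x' . $m[1] . ';', ENT_QUOTES, 'UTF-8');
        },
        $input
    );

    // preg_replace_callback returns null on error; fall back to the input.
    return $result ?? $input;
}

echo decode_percent_u('%u0391%u03b8%u03b1%u03bd%u03b1%u03c3%u03af%u03bf%u03c5'), "\n";
```

When the result is echoed to a browser, the charset header mentioned above still needs to be sent before any output. Escapes that come in surrogate pairs (characters outside the Basic Multilingual Plane) are not handled by this sketch.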

Upvotes: 4
