khernik

Reputation: 2091

Converting Unicode escape sequences to UTF-8

I have the following string written in a file:

\u0410\u043d\u0442\u043e\u043d \u0411\u043e\u0440\u0438\u0441\u0435\u043d\u043a\u043e

I want to replace it with the readable UTF-8 characters it represents. How can I do this?

The file itself is UTF-8 encoded, and the escaped string is Cyrillic text.

I have tried utf8_encode(), json_decode(), and the mb_* functions, but nothing works.

EDIT:

This is what I have tried:

echo html_entity_decode(preg_replace("/U\+([0-9A-F]{4})/", "&#x\\1;", '\u0410\u043d\u0442\u043e\u043d \u0411\u043e\u0440\u0438\u0441\u0435\u043d\u043a\u043e'), ENT_NOQUOTES, 'UTF-8') . '<br>';
echo utf8_encode('\u0410\u043d\u0442\u043e\u043d \u0411\u043e\u0440\u0438\u0441\u0435\u043d\u043a\u043e') . '<br>';
echo json_decode('"' . '\u0410\u043d\u0442\u043e\u043d \u0411\u043e\u0440\u0438\u0441\u0435\u043d\u043a\u043e' . '"');
die();

And the output is:

\u0410\u043d\u0442\u043e\u043d \u0411\u043e\u0440\u0438\u0441\u0435\u043d\u043a\u043e
\u0410\u043d\u0442\u043e\u043d \u0411\u043e\u0440\u0438\u0441\u0435\u043d\u043a\u043e
ĐĐ˝ŃОн ĐĐžŃиŃонкО

Upvotes: 2

Views: 2361

Answers (2)

Benoit Esnard

Reputation: 2075

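Use the json_decode() function and declare the output charset as UTF-8. The json_decode() call in the question is the same as the one below, so the garbled output there most likely comes from the page being rendered in a different charset: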

<?php

$str = '\u0410\u043d\u0442\u043e\u043d \u0411\u043e\u0440\u0438\u0441\u0435\u043d\u043a\u043e';
$str = json_decode('"' . $str . '"');

header('Content-Type: text/plain; charset=utf-8');
echo $str;

Output: Антон Борисенко
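Wrapping the whole string in double quotes only works while the text is itself valid inside a JSON string; an embedded double quote or a raw line break makes json_decode() return null. A minimal sketch of one way around that, assuming PHP 7 or later, is to decode only the runs of \uXXXX escapes and leave the rest of the text alone. The name decodeUnicodeEscapes is hypothetical and not part of the answer above.

```php
<?php
// Hypothetical helper: decodes only runs of \uXXXX escapes and leaves
// all other text (quotes, line breaks, and so on) untouched.
function decodeUnicodeEscapes(string $s): string
{
    return preg_replace_callback(
        // One or more consecutive \uXXXX escapes, so surrogate pairs stay together.
        '/(?:\\\\u[0-9a-fA-F]{4})+/',
        function (array $m): string {
            $decoded = json_decode('"' . $m[0] . '"');
            // json_decode() returns null for invalid input such as a lone
            // surrogate; in that case the whole run is kept as written.
            return is_string($decoded) ? $decoded : $m[0];
        },
        $s
    );
}

header('Content-Type: text/plain; charset=utf-8');
echo decodeUnicodeEscapes('He said "\u0410\u043d\u0442\u043e\u043d"');
// prints: He said "Антон"
```

Passing the same input straight to json_decode('"' . $str . '"') would fail because of the embedded quotes.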

Upvotes: 1

splash58

Reputation: 26153

Google to the rescue :) In Google we trust

function decodeUnicode($s, $output = 'utf-8') 
{ 
    return preg_replace_callback('#\\\\u([a-fA-F0-9]{4})#', function ($m) use ($output) { 
        return iconv('ucs-2be', $output, pack('H*', $m[1])); 
    }, $s); 
} 

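// The default output charset is UTF-8, which matches the file in the question.
// Pass 'windows-1251' as the second argument only when the terminal or page uses that charset.
echo decodeUnicode('\u0410\u043d\u0442\u043e\u043d \u0411\u043e\u0440\u0438\u0441\u0435\u043d\u043a\u043e');

Result: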

Антон Борисенко
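The decodeUnicode() function above converts each escape on its own as UCS-2, which covers Cyrillic but not characters outside the Basic Multilingual Plane, since those are written as two escapes (a surrogate pair). A possible variant, sketched here under the assumption that the installed iconv supports UTF-16BE, converts each run of consecutive escapes in one call. The name decodeUnicodeRuns is hypothetical.

```php
<?php
// Hypothetical variant of decodeUnicode(): converts each run of \uXXXX
// escapes as UTF-16BE, so a surrogate pair becomes a single character.
function decodeUnicodeRuns(string $s, string $output = 'UTF-8'): string
{
    return preg_replace_callback(
        '#(?:\\\\u[a-fA-F0-9]{4})+#',
        function (array $m) use ($output): string {
            // Drop the "\u" prefixes, leaving the hex digits of the UTF-16 code units.
            $hex = str_replace('\u', '', $m[0]);
            $converted = @iconv('UTF-16BE', $output, pack('H*', $hex));
            // iconv() returns false when it cannot convert; keep the original text then.
            return $converted === false ? $m[0] : $converted;
        },
        $s
    );
}

echo decodeUnicodeRuns('\u0410\u043d\u0442\u043e\u043d \ud83d\ude00');
// prints: Антон 😀
```

The second parameter still selects the output charset, as in the original function, but characters that the target charset cannot represent make iconv() fail and are left as escapes.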

Upvotes: 2
