Reputation: 860
I have a string that contains an XML structure. One of the elements contains Chinese characters. In order to convert the XML to JSON, I use json_encode(). The output for the Chinese characters is garbled.
I tried checking the encoding with mb_detect_encoding and even tried the solution listed here.
I've googled around (a lot) and found numerous other resources but none of them seems to solve my problem. Any help is much appreciated.
Code:
<?php
$str = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rootjson>
<widget>
<debug>on</debug>
<text>
<data>點擊這裡</data>
<size>36</size>
<alignment>center</alignment>
</text>
</widget>
</rootjson>
XML;
$xml = simplexml_load_string($str);
if ($encoding = mb_detect_encoding($xml, 'UTF-8', true)) echo 'XML is utf8'; //It finds it to be utf8
$json = json_encode($xml, JSON_PRETTY_PRINT);
if ($encoding = mb_detect_encoding($json, 'UTF-8', true)) echo 'Json is utf8'; //It also finds it to be utf8
var_dump($json);
?>
Output:
{
"widget": {
"debug": "on",
"text": {
"data": "\u9ede\u64ca\u9019\u88e1",
"size": "36",
"alignment": "center"
}
}
}
I don't think I can trust mb_detect_encoding here, as it reports that both $xml and $json are UTF-8 encoded. The Chinese string 點擊這裡 is now showing as \u9ede\u64ca\u9019\u88e1.
Upvotes: 2
Views: 1612
Reputation: 5787
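The output is not garbled. By default json_encode() escapes non-ASCII characters as \uXXXX sequences, which is valid JSON and decodes back to the original string. If you want the characters written literally, what you need is JSON_UNESCAPED_UNICODE (available since PHP 5.4), combined with your existing flag as JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE. See the documentation at php.net/manual/en/function.json-encode.php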
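A minimal sketch of the difference, assuming a plain array in place of the question's SimpleXMLElement so the snippet runs without the XML extension (the variable names $data, $escaped and $literal are made up for illustration). The same flags should apply unchanged to json_encode($xml, ...):

```php
<?php
// Hypothetical stand-in for the <data> element from the question.
$data = ['data' => '點擊這裡'];

// Default behaviour: non-ASCII characters are escaped as \uXXXX.
$escaped = json_encode($data);

// With JSON_UNESCAPED_UNICODE the characters are emitted as literal UTF-8.
$literal = json_encode($data, JSON_UNESCAPED_UNICODE);

echo $escaped, PHP_EOL; // {"data":"\u9ede\u64ca\u9019\u88e1"}
echo $literal, PHP_EOL; // {"data":"點擊這裡"}

// Both forms are valid JSON and decode to the same value.
$same = json_decode($escaped, true) === json_decode($literal, true);
var_dump($same); // bool(true)

// For the question's code, the call would presumably become:
// $json = json_encode($xml, JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE);
```

Since both strings decode to identical data, the flag only matters when the JSON text itself has to be human-readable; any consumer that parses the JSON sees the same Chinese string either way.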
Upvotes: 2