Reputation: 756
I have a column in MySQL that contains a wide variety of characters from many different languages and scripts, and I saw elsewhere that utf8mb4_unicode_ci was a good all-purpose encoding for this type of data, so I have that set as the collation for that column.
When I use the data to create an AJAX response in XML, however, I'm getting an encoding error regardless of what I set the XML header to. I have tried setting the XML encoding thus:
<?xml version="1.0" encoding="utf8mb4_unicode_ci"?>
But I then get an Unsupported encoding error. I'll admit to being a bit mystified by how encoding works, and dealing with it in multiple languages makes it worse. Is there a way to specify this encoding for XML, a way to translate from utf8mb4_unicode_ci to something XML understands, or should I use a different collation setting in the MySQL database?
EDIT: Maybe I'm making a mistake unrelated to encoding? Here is what I have. Given this MySQL database:
CREATE TABLE composer (name VARCHAR(32) COLLATE utf8mb4_unicode_ci);
INSERT INTO composer (name) VALUES('Béla Bartók'),('Antonín Dvořák');
...and this PHP file:
<?php
$mysqli = new mysqli("localhost", $username, $password, $database);
header("Content-Type: text/xml");
print("<?xml version=\"1.0\" encoding=\"utf-8\"?><root>");
$res = $mysqli->query("SELECT * FROM composer");
$res->data_seek(0);
while ($row = $res->fetch_assoc()) {
print("<name>" . $row['name'] . "</name>");
}
print("</root>");
mysqli_close($mysqli);
Opening this result in my browser gives an encoding error, and peeking at the result in the web inspector window shows placeholder characters in place of the accented characters. Is this just because of the tool (Safari) I'm using to peek at the XML file? Should I just sally forth into parsing the XML response?
Upvotes: 1
Views: 555
Reputation: 97672
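utf8mb4_unicode_ci is a MySQL collation (a set of comparison and sorting rules), not an encoding. The character set behind it, utf8mb4, is what MySQL calls full UTF-8, covering characters of up to 4 bytes.
So in your XML, just specify <?xml version="1.0" encoding="utf-8"?>
Also call $mysqli->set_charset("utf8mb4"); right after connecting, so that the connection itself uses UTF-8. (MySQL's plain utf8 charset stores at most 3 bytes per character, so utf8mb4 is the safer match for a utf8mb4 column.)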
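To make the two suggestions concrete, here is a minimal sketch of how they might fit together, assuming the composer table from the question. The function names fetch_composer_names and build_composer_xml are hypothetical helpers introduced only for illustration, and the htmlspecialchars call is an extra precaution that the question did not ask about.

```php
<?php
// Hypothetical helper: read every composer name over a utf8mb4 connection.
function fetch_composer_names(mysqli $mysqli): array
{
    // Ask MySQL to send and receive text as 4-byte UTF-8 on this connection.
    $mysqli->set_charset("utf8mb4");

    $names = [];
    $res = $mysqli->query("SELECT name FROM composer");
    while ($row = $res->fetch_assoc()) {
        $names[] = $row['name'];
    }
    return $names;
}

// Hypothetical helper: wrap the names in an XML document declared as UTF-8.
function build_composer_xml(array $names): string
{
    $xml = '<?xml version="1.0" encoding="utf-8"?><root>';
    foreach ($names as $name) {
        // Escape &, < and > so the data cannot break the markup.
        // Accented characters pass through unchanged as UTF-8 bytes.
        $xml .= '<name>'
            . htmlspecialchars($name, ENT_XML1 | ENT_QUOTES, 'UTF-8')
            . '</name>';
    }
    return $xml . '</root>';
}
```

With these in place, the script from the question could send header("Content-Type: text/xml; charset=utf-8"); and then print(build_composer_xml(fetch_composer_names($mysqli)));. Keeping the XML building separate from the query also makes it possible to check the output without a database.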
Upvotes: 1