Deciphering MySQL Encoding

Question

I'm having an issue with encoding in MySQL, and I need some help in figuring out what's going on.

First, some parameters. The default encoding of the table is utf8. The character_set_client, character_set_connection, collation_connection, and character_set_server MySQL system variables, though, are all latin1.

I ssh into my MySQL server and I connect to the local server using the local command line client. I select record/column and the string that's returned, let's say the character comes back as A, which is correct. A is represented by hex in UTF-8 as "C5 9F."

However, the PHP app that hits the server interprets it as XY. In the MySQL commandline client, if I send the command "SET NAMES utf8", it will also now display it as XY.

If I do a select INTO OUTFILE and use hexedit to edit the file, I see two hex characters that map to X, then two hex characters that map to Y. ("c3 85" for X and "C5 B8" for Y). Basically, it's taking the two hex values and displaying them indeed as UTF8 characters.

First and foremost, it looks like the database is indeed storing things as UTF8, but the wrong kind of UTF8, correct? Are they going in as raw Unicode, but somehow, maybe because of the sytem variables, it is not being translated to UTF8?

Second, how/why is the MySQL command line client correctly interpreting XY as A?

Finally, to the successful interpretation of the MySQL command line, is there a chart that shows how C3 85 C5 B8 is getting converted to A, or XY is getting converted to A?

Thanks a bunch for any insight.

Deciphering MySQL Encoding

Answers (1)

Related Questions