Reputation: 8487
When insert emoji character in mysql interactive interface, I found some phenomena very confusing. Hope someone could clear it. Now see below:
mysql> show variables like 'character%';
+--------------------------+---------------------------------------+
| Variable_name | Value |
+--------------------------+---------------------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /opt/mysql/server-5.6/share/charsets/ |
+--------------------------+---------------------------------------+
CREATE TABLE `t` (
`data` varchar(100) CHARACTER SET utf8mb4 DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
mysql> insert into t select '\U+1F600';
ERROR 1366 (HY000): Incorrect string value: '\xF0\x9F\x98\x80' for column 'data' at row 1
mysql> set names utf8mb4;
mysql> insert into t select '\U+1F600';
Query OK, 1 row affected (0.00 sec)
mysql> select * from t;
+------+
| data |
+------+
| 😀 |
+------+
mysql> select data, hex(data) from t;
+------+-----------+
| data | hex(data) |
+------+-----------+
| 😀 | F09F9880 |
+------+-----------+
Why do I need execute set names utf8mb4 explicitly? From error message, it seems it resolved the data content to four byte(f0 9f 98 80) successully? Why still can't insert successfully?
Below is another puzzle for me.
mysql> show variables like 'character%';
+--------------------------+---------------------------------------+
| Variable_name | Value |
+--------------------------+---------------------------------------+
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /opt/mysql/server-5.6/share/charsets/ |
+--------------------------+---------------------------------------+
mysql> insert into t select '\U+1F600';
Query OK, 1 row affected (0.01 sec)
mysql> select data,hex(data) from t;
+------+--------------------+
| data | hex(data) |
+------+--------------------+
| 😀 | C3B0C5B8CB9CE282AC |
+------+--------------------+
I have to say I feel a little shock about this. In my opinion only utf8mb4 support emoji character, but now latin1 support emoji character too. Anybody can clear it for me. Thanks!
Upvotes: 0
Views: 401
Reputation: 1388
You can insert UTF8 data into a latin1 table, but MySQL won't treat the byte stream as a UTF8 character. So you won't be able to query against it for example. If your application understands the UTF8 byte stream then it will look like its working OK. But the table charset really needs to be utf8 (or utf8mb4) if MySQL is to understand those bytes as Unicode characters.
Upvotes: 0