Reputation: 557
One of my columns has special characters like "Ψ" which latin1_swedish_ci does not seem to support. To handle such cases I tried switching the collation to utf8mb4_bin for the one column. But when converting the data i get:
Query error:
#1366 - Incorrect string value: '\xE4' for column 'title' at row 18
Note: I'm using phpmyadmin to change it. Why am i getting this error? I thought utf8mb4_bin could support it?
Upvotes: 1
Views: 3293
Reputation: 142208
You are in for a rocky road. Hex E4
has nothing to do with Psi
. Where did it come from?
Do SELECT title, HEX(title) ...
on some title that has a non-Ascii character in it.
The UTF-8 encoding for psi is two hex characters CEA8
.
E4
, when interpreted as latin1
represents ä
. Does that make any sense?
CHARACTER SET utf8
(or utf8mb4
). But there is a right way to do that and a way that messes things up worse. Let's see that HEX before discussing which is right.VARIABLEs
that control what encoding the client is using. There are multiple settings that need to change when switching encodings.More references:
No charset maps E4 to Psi:
cp1250, cp1257, dec8, latin1, latin2, latin5, latin7 25 24
'Kaze no Tani no Nausicaä'
cp852 25 24 'Kaze no Tani no Nausicań'
cp850 25 24 'Kaze no Tani no Nausicaõ'
macce 25 24 'Kaze no Tani no Nausicaš'
hp8 25 24 'Kaze no Tani no Nausicað'
greek 25 24 'Kaze no Tani no Nausicaδ'
keybcs2 25 24 'Kaze no Tani no NausicaΣ'
cp1251 25 24 'Kaze no Tani no Nausicaд'
koi8r, koi8u 25 24 'Kaze no Tani no NausicaД'
cp866 25 24 'Kaze no Tani no Nausicaф'
armscii8 25 24 'Kaze no Tani no NausicaՊ'
hebrew 25 24 'Kaze no Tani no Nausicaה'
cp1256 25 24 'Kaze no Tani no Nausicaن'
tis620 26 24 'Kaze no Tani no Nausicaไ'
geostd8 26 24 'Kaze no Tani no Nausicaჰ'
macroman 26 24 'Kaze no Tani no Nausica‰'
So, I worry that two mistakes have been made. Do you have another example of mangled text?
Latin1 cannot handle Greek letters. Nor Cyrillic. Nor Chinese. Etc. So, ä
is "correct"? (I have been chasing how to get between E4 and Psi.)
So you should probably convert to utf8mb4.
ALTER TABLE t CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci;
will convert all the text columns in table t
.
To change just one column:
ALTER TABLE t MODIFY COLUMN c VARCHAR(...)
CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci;
It must contain all the other stuff you have not (eg, NULL or NOT NULL).
Upvotes: 4