Reputation: 489
I've got a database containing a list of french word with accents.
My problem is that when I try "select length("é")" it always returns 2. So if I need to get all words with 7 letters, it will never work.
I tried the following :
ALTER DATABASE my_database CHARACTER SET utf8 COLLATE utf8_bin
ALTER TABLE my_table CHARACTER SET utf8 COLLATE utf8_bin
But it did not change anything. I also tried with latin1/latin1_bin and utf16/utf16_bin, no change.
Thank you.
Upvotes: 1
Views: 3175
Reputation: 142208
"select length("é")" it always returns 2.
That implies that you probably already have the schema set correctly. Do not do any ALTERs unless there are other issues.
As further confirmation, SELECT HEX(col) ...
-- you should see C3A9
for é
; that is the hex for the utf8 encoding.
Upvotes: 0
Reputation: 520908
The reason your ALTER
statements are not working is that they only set rules for how newly created tables will encode their text. For your tables which already exist, the ALTER
statements won't change anything.
I found this great blog post which describes how to use iconv
to convert an existing MySQL database from latin1
to utf8
. Here is the command:
mysqldump --add-drop-table my_database | replace CHARSET=latin1
CHARSET=utf8 | iconv -f latin1 -t utf8 | mysql my_database
The other answers which mentioned the distinction between LENGTH()
and CHAR_LENGTH()
are correct and you should also pay attention to this.
Upvotes: 0
Reputation: 29
The LENGTH function returns the number of bytes.
You need to use the CHAR_LENGTH() function, it returns the number of characters regardless of the number of bytes.
Upvotes: 1
Reputation: 6564
Use CHAR_LENGTH()
instead of LENGTH()
. LENGTH
returns the length in bytes which are not always 1 in utf-8.
Upvotes: 1