Sara
Sara

Reputation: 489

Encoding in MySQL with french accents

I've got a database containing a list of french word with accents.

My problem is that when I try "select length("é")" it always returns 2. So if I need to get all words with 7 letters, it will never work.

I tried the following :

ALTER DATABASE my_database CHARACTER SET utf8 COLLATE utf8_bin
ALTER TABLE my_table CHARACTER SET utf8 COLLATE utf8_bin

But it did not change anything. I also tried with latin1/latin1_bin and utf16/utf16_bin, no change.

Thank you.

Upvotes: 1

Views: 3175

Answers (4)

Rick James
Rick James

Reputation: 142208

"select length("é")" it always returns 2.

That implies that you probably already have the schema set correctly. Do not do any ALTERs unless there are other issues.

As further confirmation, SELECT HEX(col) ... -- you should see C3A9 for é; that is the hex for the utf8 encoding.

Upvotes: 0

Tim Biegeleisen
Tim Biegeleisen

Reputation: 520908

The reason your ALTER statements are not working is that they only set rules for how newly created tables will encode their text. For your tables which already exist, the ALTER statements won't change anything.

I found this great blog post which describes how to use iconv to convert an existing MySQL database from latin1 to utf8. Here is the command:

mysqldump --add-drop-table my_database | replace CHARSET=latin1
    CHARSET=utf8 | iconv -f latin1 -t utf8 | mysql my_database

The other answers which mentioned the distinction between LENGTH() and CHAR_LENGTH() are correct and you should also pay attention to this.

Upvotes: 0

wokati
wokati

Reputation: 29

The LENGTH function returns the number of bytes.

You need to use the CHAR_LENGTH() function, it returns the number of characters regardless of the number of bytes.

Upvotes: 1

i486
i486

Reputation: 6564

Use CHAR_LENGTH() instead of LENGTH(). LENGTH returns the length in bytes which are not always 1 in utf-8.

Upvotes: 1

Related Questions