Reputation: 39
I am trying to set up a database to store string data in multiple languages, including Chinese characters among many others.
Steps I have taken so far:
I have created a schema which uses utf8mb4 character set and utf8mb4_unicode_ci collation.
I have created a table which includes CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; at the end of the CREATE statement.
I am attempting to LOAD DATA INFILE from a CSV file with CHARACTER SET utf8mb4 specified in the LOAD statement.
However, I am receiving the following error: Error Code: 1366. Incorrect string value: '\xCE\x09DIS' for column 'company_name' at row 43630.
Upvotes: 0
Views: 606
Reputation: 142278
Did it successfully parse 43629 rows? Then croak on that row? It may actually be garbage in the file.
Do you know what that company name should be? What does the rest of the line say?
Do you have another example? Remove that one line and run the LOAD again.
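One way to check for garbage before editing the file by hand is to scan it for lines that are not valid UTF-8. The following is only a minimal sketch: the function name find_invalid_utf8_lines is hypothetical, and it assumes one record per physical line, so the reported line number can differ from MySQL's row number if the LOAD skips a header line or a quoted field contains a newline.

```python
from typing import Iterator, Tuple


def find_invalid_utf8_lines(path: str) -> Iterator[Tuple[int, bytes]]:
    """Yield (1-based line number, raw bytes) for each line that is not valid UTF-8.

    Hypothetical helper for illustration; `path` is whatever CSV file
    is being passed to LOAD DATA INFILE.
    """
    # Read in binary mode so Python does not attempt to decode the data itself.
    with open(path, "rb") as handle:
        for line_number, raw_line in enumerate(handle, start=1):
            try:
                raw_line.decode("utf-8")
            except UnicodeDecodeError:
                # Strip only the line terminator so the bad bytes stay visible.
                yield line_number, raw_line.rstrip(b"\r\n")
```

Calling list(find_invalid_utf8_lines(...)) on the CSV would show whether the row from the error message is the only bad one or whether the whole file is in some other encoding.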
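CE can be interpreted by any 1-byte charset, but not necessarily in a meaningful way. In utf8mb4 it can only be the first byte of a 2-byte character and must be followed by a byte in the range 80-BF; 09 is not in that range, which is why the value is rejected.
09 is the "tab" character in virtually all charsets; is it reasonable to have a tab in a company name?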
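The byte-level reasoning can be checked outside MySQL. This sketch uses Python's built-in codecs as a stand-in for MySQL's character sets (an assumption; latin-1 and cp1252 are simply two common 1-byte charsets, not necessarily what the file really uses), and the helper try_decode is hypothetical.

```python
# The bytes quoted in the error message: CE, 09 (tab), then "DIS".
sample = b"\xce\x09DIS"


def try_decode(data: bytes, encoding: str) -> str:
    """Return the decoded text, or a short note if the bytes are invalid."""
    try:
        return repr(data.decode(encoding))
    except UnicodeDecodeError as error:
        return f"invalid ({error.reason})"


# UTF-8 rejects the sequence; the 1-byte charsets accept it but yield
# a character followed by a tab, which is unlikely in a company name.
results = {enc: try_decode(sample, enc) for enc in ("utf-8", "latin-1", "cp1252")}

# For contrast, CE followed by a byte in the range 80-BF is valid UTF-8.
valid_pair = b"\xce\x94".decode("utf-8")  # Greek capital delta

if __name__ == "__main__":
    for encoding, outcome in results.items():
        print(f"{encoding}: {outcome}")
    print(f"CE 94 as utf-8: {valid_pair!r}")
```

If the 1-byte interpretation also looks wrong, that points toward a corrupt row rather than a mislabeled encoding.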
Upvotes: 1