Mark Hansen
Mark Hansen

Reputation: 1062

MYSQL: West European Characters get Mangled using LOAD DATA LOCAL INFILE

When I run this at the MySQL command line, it works fine:

INSERT INTO MYTABLE VALUES(NULL,101942,'2015-05-08','sähkötupakalle');

The 'ä' and the 'ö' end up in the MySQL varchar column just fine.

However, when I put the same data in a file, and use

LOAD DATA LOCAL INFILE

then the 'ä' and the 'ö' get mangled, and I end up with data in the MySQL varchar column that looks like this:

sähkötupakalle

Any ideas for how I can get these characters to load correctly using "LOAD DATA LOCAL INFILE" ?? FYI, my table has CHARSET=utf8.

Upvotes: 1

Views: 122

Answers (1)

Rick James
Rick James

Reputation: 142208

Apparently the file you are loading is correctly encoded with utf8? But you did not include the CHARACTER SET utf8 clause?

Symptom of "Mojibake": When SELECTing the text, each non-english character is replaced by 2-3 characters that you could call jibberish or garbage.

How you got in the mess:

  • The client's bytes to be INSERTed into the table were encoded as utf8 (good), and
  • The charset for the connection was latin1 (eg, via SET NAMES latin1), and
  • The table column was declared CHARACTER SET latin1

How to fix the text and the table:

Do the 2-step ALTER:

ALTER TABLE Tbl MODIFY COLUMN col VARBINARY(...) ...;
ALTER TABLE Tbl MODIFY COLUMN col VARCHAR(...) ... CHARACTER SET utf8 ...;

where the lengths are big enough and the other "..." have whatever else (NOT NULL, etc) was already on the column.

That converts the column definition while leaving the bits alone.

How to fix the code (in general):

  • Change the client's declaration of charset to utf8 - via SET NAMES utf8 or equivalent.

Upvotes: 1

Related Questions