georgedum

Reputation: 501

LOAD DATA LOCAL INFILE - Invalid utf8mb4 character string

I'm trying to import this CSV of Russian troll tweets into a MySQL database.

I'm trying to use LOAD DATA LOCAL INFILE like this:

LOAD DATA LOCAL INFILE '/path/to/csv/data.csv'
INTO TABLE mytable
CHARACTER SET utf8mb4
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;

It works for a small sample of the data, but when I load the full CSV, I get this error:

Error 1300 (HY000): Invalid utf8mb4 character string: 'Those who studied history know this is not even considered histo'

The line throwing the error is this one:

4036537452,4MYSQUAD,Those who studied history know this is not even considered history b\с it was pretty recent. #BlackHistoryMonth [shortened link omitted here],United States,English,2/8/2016 23:18,2/8/2016 23:20,4836,2802,1053,,left,0,0,LeftTroll

If I use CHARACTER SET latin1, it imports just fine, but I lose the emojis from the tweets as well as the tweets in Russian.

The CSV contains tweets in Russian, German, and Swedish, plus emojis. Is there a way to get all of these into my database?

Thank you, and let me know if there is any more information I should include in this question.

Upvotes: 2

Views: 4497

Answers (1)

georgedum

Reputation: 501

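I ended up doing a massive find/replace on the CSV to replace every '\' with '\\'. The backslash is the default escape character for LOAD DATA (FIELDS ESCAPED BY defaults to '\\'), so the '\с' in that tweet was apparently being read as an escape sequence rather than as literal text.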
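
For anyone who wants to script that find/replace rather than do it in an editor, here is a minimal sketch in Python. It assumes the CSV is UTF-8 encoded; the function names and file paths are hypothetical and are not part of the original fix.

```python
def double_backslashes(text: str) -> str:
    """Replace every backslash with two, so LOAD DATA reads it as a literal."""
    return text.replace("\\", "\\\\")


def escape_csv_file(src: str, dst: str) -> None:
    """Write a copy of src to dst with every backslash doubled.

    Assumes both files are UTF-8. newline="" leaves the original
    line endings untouched on read and on write.
    """
    with open(src, encoding="utf-8", newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(double_backslashes(line))


if __name__ == "__main__":
    # Hypothetical paths; point these at the real CSV and a new output file.
    escape_csv_file("/path/to/csv/data.csv", "/path/to/csv/data_escaped.csv")
```

The same LOAD DATA LOCAL INFILE statement can then be run against the escaped copy. Writing to a new file instead of editing in place keeps the original data available if something goes wrong.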

Worked like a charm. Thanks, marekful and Freddythunder for putting me on the right track.

Upvotes: 1
