Reputation: 501
I'm trying to import this CSV of Russian troll tweets into a MySQL database.
I'm trying to use LOAD DATA LOCAL INFILE like this:
LOAD DATA LOCAL INFILE '/path/to/csv/data.csv'
INTO TABLE mytable
CHARACTER SET utf8mb4
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;
It works for a small sample of the data, but when I try to load the full CSV, I get this error:
Error 1300 (HY000): Invalid utf8mb4 character string: 'Those who studied history know this is not even considered histo'
The line throwing the error is this one:
4036537452,4MYSQUAD,Those who studied history know this is not even considered history b\с it was pretty recent. #BlackHistoryMonth [shortened link omitted here],United States,English,2/8/2016 23:18,2/8/2016 23:20,4836,2802,1053,,left,0,0,LeftTroll
If I use CHARACTER SET latin1, it imports just fine, but I lose the emojis from the tweets as well as the tweets in Russian.
The CSV contains tweets in Russian, German, and Swedish, as well as emojis. Is there a way to get all of these into my database?
Thank you, and let me know if there is any more information I should include in this question.
Upvotes: 2
Views: 4497
Reputation: 501
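I ended up doing a bulk find/replace on the CSV to replace every '\' with '\\'. LOAD DATA treats a backslash as an escape character by default, so the literal backslash in the tweet (the "b\с" in the line above) had to be escaped before the import.
Worked like a charm. Thanks, marekful and Freddythunder, for putting me on the right track.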
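For anyone who would rather script the find/replace than do it in an editor, here is a minimal sketch in Python. It assumes the CSV is UTF-8 encoded and that every backslash in it is meant literally. The function name double_backslashes and the output file name are my own placeholders, not part of the original setup.

```python
def double_backslashes(src: str, dst: str, encoding: str = "utf-8") -> int:
    """Copy src to dst, replacing every '\\' with '\\\\'.

    Returns the number of physical lines that contained a backslash.
    Assumes the file is UTF-8 and that all backslashes are literal data.
    """
    changed = 0
    # newline="" leaves the original line endings untouched, so
    # LINES TERMINATED BY '\n' keeps matching after the rewrite.
    with open(src, "r", encoding=encoding, newline="") as fin, \
         open(dst, "w", encoding=encoding, newline="") as fout:
        for line in fin:
            fixed = line.replace("\\", "\\\\")
            if fixed != line:
                changed += 1
            fout.write(fixed)
    return changed


if __name__ == "__main__":
    # Hypothetical output path; point LOAD DATA at this file afterwards.
    n = double_backslashes("/path/to/csv/data.csv",
                           "/path/to/csv/data_escaped.csv")
    print(f"{n} line(s) contained a backslash")
```

The count it returns is a quick way to see how many rows were affected. Another option that may avoid editing the file at all is adding ESCAPED BY '' to the FIELDS clause, which turns off backslash escaping in LOAD DATA. I did not try that here, so treat it as something to test rather than a confirmed fix.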
Upvotes: 1