Reputation: 384
I have to migrate a database from Oracle to MySql having billions of data. I found a strategy to create a schema and get the data in CSV from Oracle and load data to MySQL. I have created a CSV file with a delimiter of a quote(") and field terminated with a comma(,). Now the problem is that CSV file contains special character which is not going to be imported in MySql.
I am using the command:
LOAD DATA LOCAL infile 'C:/Users/NNCP4659/export.csv' INTO TABLE dbk_address_master
CHARACTER SET utf8 fields terminated BY "," enclosed by '"' lines terminated
BY "\r\n"(id, country_code,address,city_id,latitude,longitude,
@is_active,google_address,old_address,building_number,street_name,created_by)
set is_active=cast(@is_active as signed);
My data is like:
4113973,"CHE","167 Bernerstrasse Süd","57066","47.3943271","8.4865849",1,"Bernerstrasse Süd 167, 8048 Zürich,
Switzerland","167 Bernerstrasse Süd","Y","167","Bernerstrasse Süd","migration"
And error is:
ERROR 1300 (HY000): Invalid utf8 character string: '"167 Bernerstrasse S'
Upvotes: 0
Views: 845
Reputation: 142296
167 Bernerstrasse S
looks like the truncation of 167 Bernerstrasse Süd
at the first non-utf8 character.
You have specified that the incoming data is utf8 via
LOAD DATA ... CHARACTER SET utf8 ...
I conclude that the incoming file is not encoded correctly. It is probably latin1, in which case the hex would be FC
. Assuming this is the case, you should switch to
LOAD DATA ... CHARACTER SET latin1 ...
It does not matter if the CHARACTER SET
in the target column is not latin1; MySQL will transcode it in flight.
(Alternatively, you could change the incoming data to have utf8 (hex: C3BC
), but that may be more hassle.)
Reference: "truncated" in Trouble with UTF-8 characters; what I see is not what I stored
(As for how to check the hex, or do SHOW CREATE TABLE
, we need to know what OS you are using and what tools you have available.)
Upvotes: 1