user1980494

Reputation: 1

LOAD DATA LOCAL INFILE query

I have the following query in my program to upload data from a CSV file to MySQL:

<pre lang="SQL">LOAD DATA LOCAL INFILE 'C:\Users\mydesk\Desktop\dcc.csv' INTO TABLE DATAS.TABLE IN FIELDS TERMINATED BY ',' ENCLOSED BY '"' lines terminated by '\n'</pre>

But some rows trigger a "Duplicate entry for PRIMARY KEY" error. How can I catch that exception for each row and handle it with a different query?

For example, my CSV file has 10 rows, and some of them are the same as rows already in DATAS.TABLE. When such records come up for insertion into DATAS.TABLE, I need to execute an UPDATE query instead.

Upvotes: 0

Views: 836

Answers (3)

peterm

Reputation: 92835

I'd suggest:

  1. create a staging table without constraints
  2. load the data into that table first
  3. clean, sanitize, validate (including eliminating duplicates on the PK), and transform your data
  4. insert the clean data into your fact table(s)
  5. truncate the staging table

With that approach you can do complex multi-step validation, joining against reference tables and/or your fact table(s) to find inconsistencies or other problems.

Finding duplicates then becomes a matter of a simple SELECT on the staging table.
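A minimal sketch of that workflow, assuming hypothetical columns (id, col1, col2) on DATAS.TABLE with id as the PRIMARY KEY; the staging table name and column names are placeholders, and TABLE is backticked because it is a reserved word:

```sql
-- 1. Staging table with the same layout but no PRIMARY KEY
--    (dropping the PK needs an extra step if the column is AUTO_INCREMENT)
CREATE TABLE DATAS.staging LIKE DATAS.`TABLE`;
ALTER TABLE DATAS.staging DROP PRIMARY KEY;

-- 2. Load the raw CSV into the staging table
LOAD DATA LOCAL INFILE 'C:/Users/mydesk/Desktop/dcc.csv'
INTO TABLE DATAS.staging
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n';

-- 3. The "simple SELECT" that finds duplicate keys inside the load
SELECT id, COUNT(*) AS cnt
FROM DATAS.staging
GROUP BY id
HAVING COUNT(*) > 1;

-- 4. Move the rows across; collisions with existing rows become updates
INSERT INTO DATAS.`TABLE` (id, col1, col2)
SELECT id, col1, col2 FROM DATAS.staging
ON DUPLICATE KEY UPDATE col1 = VALUES(col1), col2 = VALUES(col2);

-- 5. Empty the staging table for the next batch
TRUNCATE TABLE DATAS.staging;
```

Step 4 here uses INSERT ... ON DUPLICATE KEY UPDATE so that rows already present in the target table are updated rather than rejected, which is what the question asks for.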

Upvotes: 1

Quicksilver

Reputation: 2710

From the MySQL docs:

The REPLACE and IGNORE keywords control handling of input rows that duplicate existing rows on unique key values:

If you specify REPLACE, input rows replace existing rows; in other words, rows that have the same value for a primary key or unique index as an existing row are replaced. See Section 13.2.7, "REPLACE Syntax".

If you specify IGNORE, input rows that duplicate an existing row on a unique key value are skipped. If you do not specify either option, the behavior depends on whether the LOCAL keyword is specified. Without LOCAL, an error occurs when a duplicate key value is found, and the rest of the text file is ignored. With LOCAL, the default behavior is the same as if IGNORE is specified; this is because the server has no way to stop transmission of the file in the middle of the operation.

So if you need to replace the row on duplicate key values, add REPLACE to your query.
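Applied to the query from the question, that looks like this (note that REPLACE performs a delete-and-reinsert rather than an in-place UPDATE, so any columns not supplied by the CSV fall back to their defaults):

```sql
LOAD DATA LOCAL INFILE 'C:/Users/mydesk/Desktop/dcc.csv'
REPLACE INTO TABLE DATAS.`TABLE`
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n';
```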

Upvotes: 1

Tony Shih

Reputation: 436

May I make a suggestion: load the data from the CSV into a temp table in a first pass.

Then use this temp table to process the data in two steps in the second pass: (1) update existing rows from the temp table, and (2) insert new rows from the temp table.
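A sketch of both passes, assuming the same hypothetical (id, col1, col2) schema as in the earlier sketch (tmp_dcc is a placeholder name):

```sql
-- First pass: load the CSV into a temporary copy of the target table
CREATE TEMPORARY TABLE tmp_dcc LIKE DATAS.`TABLE`;

LOAD DATA LOCAL INFILE 'C:/Users/mydesk/Desktop/dcc.csv'
INTO TABLE tmp_dcc
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n';

-- Second pass, step (1): update rows that already exist in the target
UPDATE DATAS.`TABLE` t
JOIN tmp_dcc s ON s.id = t.id
SET t.col1 = s.col1, t.col2 = s.col2;

-- Second pass, step (2): insert rows that are new
INSERT INTO DATAS.`TABLE` (id, col1, col2)
SELECT s.id, s.col1, s.col2
FROM tmp_dcc s
LEFT JOIN DATAS.`TABLE` t ON t.id = s.id
WHERE t.id IS NULL;

DROP TEMPORARY TABLE tmp_dcc;
```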

Upvotes: 1
