NaturalBornCamper
NaturalBornCamper

Reputation: 3866

MySQL bulk INSERT .. ON DUPLICATE KEY UPDATE while specifying individual columns

I need to bulk insert thousands of rows if a table from 2 different cronjobs with different datasets, I'm currently using INSERT .. ON DUPLICATE KEY UPDATE

Considering the first value is the unique key to check for duplicates,

First cronjob would for example run (from data source 1):

INSERT INTO `places` () VALUES
 ('aaa', 'bbb', NULL),
 ('ccc', NULL, 'ddd')
ON DUPLICATE KEY UPDATE;

Second cronjob would for example run (from data source 2):

INSERT INTO `places` () VALUES
 ('aaa', NULL, 'eee'),
 ('ccc', 'ddd', 'fff')
ON DUPLICATE KEY UPDATE

As you can see, the second cronjob would overwrite valid values from the first cronjob with null values, which I don't want.
If I remove the null values from the data, the query would break as there would be a data mismatch.

The only solutions I can think of is to execute LOTS of single insert/updates, which would be slower.

If I could do a bulk

INSERT INTO `places` VALUES
 (column1='aaa', column3='eee'),
 (column1='ccc', column2='ddd', column3='fff')
ON DUPLICATE KEY UPDATE

then I wouldn't have to worry about keys anyone else but I don't think it's possible in MySQL.. does anyone know of a better way to do this?

Upvotes: 2

Views: 969

Answers (1)

Bill Karwin
Bill Karwin

Reputation: 562328

I tested the following and it seems to work:

INSERT INTO `places` (col1, col2, col3) VALUES
 ('aaa', NULL, 'eee'),
 ('ccc', 'ddd', 'fff')
ON DUPLICATE KEY UPDATE
 col1=COALESCE(VALUES(col1), col1),
 col2=COALESCE(VALUES(col2), col2),
 col3=COALESCE(VALUES(col3), col3);

In an ON DUPLICATE KEY statement, the VALUES(<col>) means "whatever I tried to insert in each row." So in the first row above, VALUES(col2) is NULL, and col2 is the value already stored on the row for that primary key.

The COALESCE() function returns its first non-NULL argument. So COALESCE(NULL, col2) returns the existing value for col2.

You do need to write out all the clauses for all these columns if you want to have them "default to previous" in this way.

P.S.: Your use of INSERT INTO places () is incorrect. The empty list of columns () must correspond to an empty list of values. I.e. the two lists must have the same number of terms.

Upvotes: 2

Related Questions