BeanJunkie
BeanJunkie

Reputation: 13

Merge with primary key violation

I have a file based import thingy where the users can post files to be imported in the database. New records are inserted and records with an already existing Id are updated.

After posting a file with ID NAME 5 Silly

they can correct this by posting a new file with ID NAME 5 Sally

I have a bulk insert (C# windows service) of the file into a bulk table (Sql Server Azure v12). The files can contain millions of rows so I'd like to avoid iterating through rows. After the bulk insert i have a SP that does a merge update / insert and updates already existing rows and inserts new ones.

The problem I've come across is when the users post a new record and a correction of the same record in the same file. I get a PRIMARY KEY VIOLATION on the target table.

Is there a nice way to solve this?

Here's an example:

--drop table #bulk --drop table #target create table #bulk( id int, name varchar(10) )

insert into #bulk values (1,'John') insert into #bulk values (2,'Sally') insert into #bulk values (3,'Paul') insert into #bulk values (4,'Gretchen') insert into #bulk values (5,'Penny') insert into #bulk values (5,'Peggy')

create table #target( id int not null, name varchar(10), primary key (id))

merge #target as target using(select id, name from #bulk) as bulktable on target.id = bulktable.id when matched then update set target.name = bulktable.name when not matched then insert(id, name) values (bulktable.id, bulktable.name);

Upvotes: 1

Views: 1400

Answers (1)

t-clausen.dk
t-clausen.dk

Reputation: 44336

This will handle the latest value for name.

You need a new create script for #bulk

CREATE TABLE #bulk
(
  row_id int identity(1,1),
  id int,
  name varchar(10)
)

This is the script you can use with the new bulk table:

;WITH CTE as
(
  SELECT
    id, name, 
    row_number() over (partition by id order by row_id desc) rn
    FROM #bulk
), CTE2 as
(
  SELECT id, name
  FROM CTE
  WHERE rn = 1
)
MERGE #target as target
USING CTE2 as bulktable
on target.id = bulktable.id
WHEN matched and 
 not exists(SELECT target.name except SELECT bulktable.name)
 -- this will handle null values. Otherwise it could simply have been: 
 -- matched and target.name <> bulktable.name
THEN update
SET target.name = bulktable.name
WHEN not matched THEN
INSERT(id, name) VALUES (bulktable.id, bulktable.name);

Upvotes: 1

Related Questions