Reputation: 1192
I have a stored procedure that does an insert like this one:
INSERT INTO [TargetTable] ([Hash], [Year], [Metadata])
SELECT [Hash], [Year], [Metadata]
FROM [SourceTable]
This query will fail because the [TargetTable]
has data already and an unique index on the [Hash]
column. It cannot insert duplicates.
How I can insert only the non-duplicates? I guess I could select first only the non-duplicates and then do the inserts. But the select would be more complex.
Another approach I tried is to ignore the duplicates. For instance:
BEGIN TRY
INSERT INTO [TargetTable] ([Hash], [Event], [Year], [Metadata])
SELECT [Hash], [Event], [Year], [Metadata]
FROM [SourceTable]
END TRY
BEGIN CATCH
SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH
I was expecting something like, try to insert this, if you cannot insert it, then ignore it, and then continue inserting the rest of rows.
Upvotes: 1
Views: 1217
Reputation: 1271231
If no other inserts are happening at the same time, then this should work:
INSERT INTO [TargetTable] ([Hash], [Year], [Metadata])
SELECT st.[Hash], st.[Year], st.[Metadata]
FROM (SELECT st.*,
ROW_NUMBER() OVER (PARTITION BY hash ORDER BY (SELECT NULL)) as seqnum
FROM SourceTable st
) st
WHERE seqnum = 1 AND -- ensure duplicates are not in source table
NOT EXISTS (SELECT 1 FROM TargetTable tt WHERE tt.hash = st.hash);
Upvotes: 2