Reputation: 1940
How can I avoid duplicate using the provided below query
INSERT INTO dbo.Entities(EntityId, [Name], [Description], [Type], Source)
SELECT DISTINCT
CUST_CODE, NAME, FULLDESCRIPTION, 'Agency' AS Type, 'SunDbAgencies' AS Source
FROM dbo.VW_SUNDB_AGENCIES
I've already tried all the answers here :
Avoid duplicates in INSERT INTO SELECT query in SQL Server to no avail.
The duplicates are in the dbo.VW_SUNDB_AGENCIES TABLES
not the INSERT
table, So I gather I need a way to remove duplicate from the select before inserting
Here is one of the duplicates which is why a simple distinct doesn't work:
Upvotes: 0
Views: 984
Reputation: 1271003
The problem is that select distinct
is not sufficient. You still have duplicates in the underlying table, but with different names or descriptions.
I view this as a problem. But, you can work around it by selecting one arbitrary row per cust_code
, using row_number()
:
insert into dbo.Entities (EntityId, [Name], [Description], [Type], Source)
select CUST_CODE, NAME, FULLDESCRIPTION, 'Agency' AS [Type], 'SunDbAgencies' AS Source
from (select a.*,
row_number() over (partitoin by cust_code order by cust_code) as seqnum
from dbo.VW_SUNDB_AGENCIES a
) a
where seqnum = 1 and
not exists (select 1 from dbo.Entities E where A.CUST_CODE = E.EntityId);
Upvotes: 2