Reputation: 71
I have a question about SQL Server.
Table: emp
empid | name |sal
1 | abc |100
2 | def |200
3 | test |300
2 | har |500
3 | jai |600
4 | kali |240
This table has duplicate data based on above table I want delete duplicate data from the emp
table
And duplicate data should be loaded into empduplicate
table.
Here empid
is unique. If empid
shows up multiple times, then that record is considered a duplicate.
empduplicate
structure look like this:
Empid | name | sal
Finally after deleting duplicate data, I want see data in emp
table look like this:
empid | name | sal
1 | abc | 100
4 | kali | 240
For deleting the duplicates, I tried this code:
;with duplicate as
(
select
*,
row_number()over (partition by empid order by empid) as rn
from emp
)
delete from duplicate
where rn > 1
But I am not able to delete entire records.
Example: empid=2
has duplicate data
empid|name |sal
2 |def |200
2 |har |500
I need to delete entire empid=2
corresponding records. empid=2
has duplicate and need to delete it from the emp
table.
And empduplicate
table need to load duplicate data look like this:
empid | name |sal
2 |def |200
2 |har |500
3 |test |300
3 |jai |600
For inserting the duplicate data I tried this code:
insert into empduplicate
select
id, name, sal
from
emp
group by
id
having
count(*) > 1
That query throws an error:
Column 'duplicate.name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Please tell me how to write query to achieve my task in SQL Server
Upvotes: 5
Views: 347
Reputation: 1
BEGIN TRAN
SELECT * INTO empduplicate FROM
(
SELECT *
FROM emp
WHERE empid IN (
SELECT empid FROM emp
GROUP BY empid
HAVING COUNT(empid)>1
)
) as M
DELETE FROM emp WHERE empid IN (
SELECT empid FROM emp
GROUP BY empid
HAVING COUNT(empid)>1
)
COMMIT TRAN
Upvotes: 0
Reputation: 347
SELECT DISTINCT * INTO #tmp FROM emp
DELETE FROM emp
INSERT INTO emp
SELECT * FROM #tmp DROP table #tmp
SELECT * FROM emp ---------------------------- All Distinct ID
SELECT * INTO #tmp FROM emp
WHERE empid in(
SELECT empid FROM emp
group by empid having count(*) = 1
)
DELETE FROM emp
INSERT INTO emp
SELECT * FROM #tmp DROP table #tmp
SELECT * FROM emp ----------------------------All ID which is not duplicate
INSERT INTO empduplicate
SELECT * FROM emp where empid in(
SELECT empid FROM emp
group by empid having count(*) >1
)
SELECT * FROM empduplicate -------------------ALL Duplicate value.
Upvotes: 0
Reputation: 31879
You're almost there. Instead of using ROW_NUMBER
, use COUNT
:
WITH CteInsert AS(
SELECT *,
cnt = COUNT(empid) OVER(PARTITION BY empid)
FROM emp
)
INSERT INTO empduplicate(empid, name, sal)
SELECT
empid, name, sal
FROM CteInsert
WHERE cnt > 1;
WITH CteDelete AS(
SELECT *,
cnt = COUNT(empid) OVER(PARTITION BY empid)
FROM emp
)
DELETE FROM CteDelete WHERE cnt > 1;
You need to execute the INSERT
first before the DELETE
. Additionally, you might want to enclose this in a single transaction.
Upvotes: 2