Kumaran T
Kumaran T

Reputation: 677

How to eliminiate duplicates rows in table of sql server 2005 database?

i am using sql server 2005. i have a table contains duplicated rows. how can i eliminated those duplicate rows in that table? for e.g., the table may contain 3 similar rows in which i want to delete 2 rows and keep the original

Upvotes: 1

Views: 546

Answers (3)

Hunter
Hunter

Reputation: 2470

For understanding purpose, lets take a simple table Employee with below schema

EmployeeId - int
EmployeeName    varchar(50)
Age int

Lets populate with duplicate values. Please note primary key is not duplicated in this case

INSERT INTO Employee(EmployeeId,EmployeeName,Age) VALUES (1,'Mark',20)
INSERT INTO Employee(EmployeeId,EmployeeName,Age) VALUES (2,'Tom',22)
INSERT INTO Employee(EmployeeId,EmployeeName,Age) VALUES (3,'Sam',24)
INSERT INTO Employee(EmployeeId,EmployeeName,Age) VALUES (4,'Mark',20)
INSERT INTO Employee(EmployeeId,EmployeeName,Age) VALUES (5,'Tom',22)
INSERT INTO Employee(EmployeeId,EmployeeName,Age) VALUES (6,'Tom',22)
GO 

we can make use of CTE in finding the duplicate rows. Gather duplicate rows by using Group by/Count statement. Once Duplicate rows are Identified, we select those rows from the main table using join condition. Now rank those Rows and delete all the rows apart from the rows with rank 1. I find this a lot more elegant.

WITH TotalDuplicates(EmployeeName,Age,Total) AS 
(
    SELECT EmployeeName,Age,COUNT(employeeId)  AS Total  FROM Employee 
    GROUP BY EmployeeName,Age
    HAVING COUNT(employeeId) > 1
)
,DistinctRows(EmployeeId,EmployeeName,Age) AS 
(
    SELECT E.EmployeeId,E.EmployeeName,E.Age FROM Employee AS E
    INNER JOIN TotalDuplicates AS D 
    ON (E.EmployeeName = D.EmployeeName AND E.Age = D.Age)
)
,OrderedDuplicateTables(EmployeeId,EmployeeName,Age,Ranking) AS 
(
    SELECT 
        EmployeeId,
        EmployeeName,
        Age,
        RANK() OVER (PARTITION BY EmployeeName, Age ORDER BY EmployeeId DESC) 
    FROM DistinctRows 
)

DELETE FROM Employee
WHERE EmployeeId IN (SELECT EmployeeId FROM OrderedDuplicateTables WHERE Ranking > 1)

Upvotes: 0

Pankaj Agarwal
Pankaj Agarwal

Reputation: 11309

First you can copy duplicate record into another table like as following way

Select fieldnames into #temp from table1 group by fieldnames having count (*) > 1

then remove that record from original table

delete from table1 where fieldname in (select fieldnames from #temp)

and finally copy record from temporary table to original source table.

insert table1 select * from #temp

through above mentioned steps you can eliminate duplicate record from table.

Upvotes: 0

franz976
franz976

Reputation: 333

If you just want to eliminate duplicate records from your result set, you can use the DISTINCT command:

SELECT DISTINCTI field1, field2 FROM...

If you want to delete those duplicate records, you can use COUNT to detect which records have more than one instance, and then deleting them with a subquery

Upvotes: 1

Related Questions