Ehsan Akbar
Ehsan Akbar

Reputation: 7281

Delete duplicate row with having

I want to remove my duplicate value in SQL Server. So I found a query that can find them, how can I append delete statement to this query to delete them?

(SELECT
    DocumentNumber, LineNumber, SheetNumber, Unit, COUNT(*)
FROM
    Lines
GROUP BY
    DocumentNumber, LineNumber, SheetNumber, Unit
HAVING 
    COUNT(*) > 1)

Upvotes: 1

Views: 60

Answers (3)

Pரதீப்
Pரதீப்

Reputation: 93694

Much appropriate approach using EXISTS

DELETE a 
FROM   lines a 
WHERE  EXISTS (SELECT 1 
               FROM   lines b 
               WHERE  a.documentnumber = b.documentnumber 
                      AND a.linenumber = b.linenumber 
                      AND a.sheetnumber = b.sheetnumber 
                      AND a.unit = b.unit 
               HAVING Count(*) > 1) 

Another approach using COUNT() OVER()

;WITH cte 
     AS (SELECT id, 
                documentnumber, 
                linenumber, 
                sheetnumber, 
                unit, 
                CNT = Count(1) OVER(partition BY documentnumber, linenumber, sheetnumber, unit)
         FROM   dbo.lines) 
DELETE FROM cte 
WHERE  cnt > 1 

Note : Both my approaches deletes all the records for this combination documentnumber, linenumber, sheetnumber, unit if it is duplicated

Upvotes: 1

Tim Schmelter
Tim Schmelter

Reputation: 460028

I would use a CTE and a ranking function like ROW_NUMBER:

;WITH CTE AS
(
   SELECT Id, DocumentNumber, LineNumber, SheetNumber, Unit,
          RN = ROW_NUMBER() OVER (PARTITION BY DocumentNumber, LineNumber, SheetNumber, Unit 
                                  ORDER BY Id DESC) -- keeps the newest Id-row
   FROM dbo.Lines
)
DELETE FROM CTE WHERE RN > 1

The great benefit, it's easier to read and to maintain and it's easy to change it to a SELECT * FROM CTE to see what you're going to delete.

Modify the Order By part to implement a custom logic which row to keep and which rows to delete. It can contain multiple columns(either ASC or DESC) or even conditional statements (f.e. with CASE).

Upvotes: 5

Roman Marusyk
Roman Marusyk

Reputation: 24569

Here are a few ways to do that:

DELETE FROM Lines L
INNER JOIN (SELECT
    DocumentNumber,LineNumber,SheetNumber,Unit, COUNT(*)
FROM
    Lines
GROUP BY
       DocumentNumber,LineNumber,SheetNumber,Unit
HAVING 
    COUNT(*) > 1) D ON L.DocumentNumber = D.DocumentNumber AND L.LineNumber = D.LineNumber AND L.SheetNumber = D.SheetNumber AND L.Unit = D.Unit

or you can also use table variable or CTE or IN and subquery if you have a column like ID

Upvotes: 1

Related Questions