Dea Ananda
Dea Ananda

Reputation: 115

Delete duplicate rows with an order by

I have this table:

table DAILYDATAWH

I want to delete duplicate rows in that table based on different STATUSIN and this is my query to duplicate rows:

;WITH CTE AS
(
   SELECT ID,NIP, ROW_NUMBER()OVER(PARTITION BY STATUSIN ORDER BY STATUSIN) AS RowNumber
   FROM DAILYDATAWH
), CTE2 AS
(
SELECT TOP (1000) * 
    FROM CTE
    ORDER BY RowNumber DESC
    )
    DELETE FROM CTE2 WHERE RowNumber > 1

and this is the output:

delete duplicate rows

how to delete duplicate rows and show the output like this:

output

Upvotes: 0

Views: 222

Answers (4)

Akash Patel
Akash Patel

Reputation: 339

In your particular scenario logic which has been written will not work work because if you closely look at the output of your CTE you will always have RowNumber as 1.

You query would be somewhat like this.

DECLARE @Temp AS TABLE
(
    ID          INT IDENTITY(1,1)
    ,NIP        VARCHAR(2)
    ,[NAME]     VARCHAR(10)
    ,DEPARTMENT VARCHAR(4)
    ,STATUSIN   DATETIME
)

INsERT INTO @Temp
(
     NIP
    ,[NAME]
    ,DEPARTMENT
    ,STATUSIN
)
VALUES
 ('A1','ARIA','BB',GETDATE())
,('A1','ARIA','BB',GETDATE())
,('A1','ARIA','BB',DATEADD(MINUTE,-1,GETDATE()))
,('A1','ARIA','BB',DATEADD(MINUTE,-1,GETDATE()))
,('A2','CHLOE','BB',DATEADD(MINUTE,-2,GETDATE()))
,('A2','CHLOE','BB',DATEADD(MINUTE,-3,GETDATE()))
,('A2','CHLOE','BB',DATEADD(MINUTE,-3,GETDATE()))
,('A3','Test','BB',DATEADD(MINUTE,-6,GETDATE()))

;WITH CTE AS
(
   SELECT 
         NIP
        ,[NAME]
        ,ID         = MAX(Id)
        ,STATUSIN
        ,ROW_NUMBER()OVER(PARTITION BY [Name] ORDER BY STATUSIN) AS RowNumber
   FROM @Temp
   GROUP BY
        NIP
        ,[NAME]     
        ,STATUSIN
)
SELECT * -- To do a delete change this line to DELETE T 
FROM 
    @Temp AS T
    LEFT OUTER JOIN CTE ON T.ID = CTE.ID
WHERE
    CTE.ID IS NULL
ORDER BY 
    T.[NAME]
    ,T.STATUSIN

I have only written select which will display the records needs to be deleted. You can verify the results by changing CTE.ID IS NULL to IS NOT NULL.

I hope this will help... Good Luck

Upvotes: 2

Yarner
Yarner

Reputation: 115

You missed ID in the partition Order by. This produces what you want as you are always deleting the 2nd , 3rd .. duplicate. Anchor your query on the first instance as below.

SELECT 
ROW_NUMBER()OVER(PARTITION BY STATUSIN ORDER BY ID, STATUSIN) AS RowNumber,
ID, NIP, Name,DEPARTMENT,STATUSIN,STATUSOUT FROM #DAILYDATAWH

I would also mention that you should probably strengthen how you are partitioning. What happens if you get multiple customer records with the exact same time stamp? EG is NIP + Name unique? Added name to the below example.

SELECT 
ROW_NUMBER()OVER(PARTITION BY Name,STATUSIN ORDER BY Name, STATUSIN) AS RowNumber,
ID, NIP, Name,DEPARTMENT,STATUSIN,STATUSOUT FROM #DAILYDATAWH

Solution for your query

;WITH CTE AS
(
  SELECT 
ROW_NUMBER()OVER(PARTITION BY STATUSIN ORDER BY ID, STATUSIN) AS RowNumber
,
ID, NIP, Name,DEPARTMENT,STATUSIN,STATUSOUT FROM #DAILYDATAWH
), CTE2 AS
(
SELECT TOP (1000) * 
    FROM CTE
    ORDER BY RowNumber DESC
    )
    DELETE FROM CTE2 WHERE RowNumber > 1

    SELECT * FROM #DAILYDATAWH

Upvotes: 1

George Menoutis
George Menoutis

Reputation: 7240

Your code includes:

PARTITION BY STATUSIN ORDER BY STATUSIN

Having the same columns in the partition by and order by makes no sense. You say:

I want to delete duplicate rows in that table based on different STATUSIN

Good, you have defined what the ORDER BY should be. This decides which row to keep among duplicates.

The PARTITION BY part must includes the columns that define which columns are duplicate. A guess would be NIP,NAME, but you have to decide for yourself. So try something like this:

ROW_NUMBER() OVER(PARTITION BY NIP,NAME ORDER BY STATUSIN) AS RowNumber

The rest of the code seems ok to me.

Upvotes: 0

Atk
Atk

Reputation: 752

  1. ID, i guess is a primary key and if you want to rearrange records in your main table, it is not possible because it is stored in ascending order.
  2. If you just want to show records like in the below picture, then after deleting the records, use
select * from DAILYDATAWH order by NIP,NAME,StatusIn

Upvotes: 0

Related Questions