Reputation: 115
I have this table:
I want to delete duplicate rows in that table based on different STATUSIN
and this is my query to duplicate rows:
;WITH CTE AS
(
SELECT ID,NIP, ROW_NUMBER()OVER(PARTITION BY STATUSIN ORDER BY STATUSIN) AS RowNumber
FROM DAILYDATAWH
), CTE2 AS
(
SELECT TOP (1000) *
FROM CTE
ORDER BY RowNumber DESC
)
DELETE FROM CTE2 WHERE RowNumber > 1
and this is the output:
how to delete duplicate rows and show the output like this:
Upvotes: 0
Views: 222
Reputation: 339
In your particular scenario logic which has been written will not work work because if you closely look at the output of your CTE you will always have RowNumber as 1.
You query would be somewhat like this.
DECLARE @Temp AS TABLE
(
ID INT IDENTITY(1,1)
,NIP VARCHAR(2)
,[NAME] VARCHAR(10)
,DEPARTMENT VARCHAR(4)
,STATUSIN DATETIME
)
INsERT INTO @Temp
(
NIP
,[NAME]
,DEPARTMENT
,STATUSIN
)
VALUES
('A1','ARIA','BB',GETDATE())
,('A1','ARIA','BB',GETDATE())
,('A1','ARIA','BB',DATEADD(MINUTE,-1,GETDATE()))
,('A1','ARIA','BB',DATEADD(MINUTE,-1,GETDATE()))
,('A2','CHLOE','BB',DATEADD(MINUTE,-2,GETDATE()))
,('A2','CHLOE','BB',DATEADD(MINUTE,-3,GETDATE()))
,('A2','CHLOE','BB',DATEADD(MINUTE,-3,GETDATE()))
,('A3','Test','BB',DATEADD(MINUTE,-6,GETDATE()))
;WITH CTE AS
(
SELECT
NIP
,[NAME]
,ID = MAX(Id)
,STATUSIN
,ROW_NUMBER()OVER(PARTITION BY [Name] ORDER BY STATUSIN) AS RowNumber
FROM @Temp
GROUP BY
NIP
,[NAME]
,STATUSIN
)
SELECT * -- To do a delete change this line to DELETE T
FROM
@Temp AS T
LEFT OUTER JOIN CTE ON T.ID = CTE.ID
WHERE
CTE.ID IS NULL
ORDER BY
T.[NAME]
,T.STATUSIN
I have only written select which will display the records needs to be deleted. You can verify the results by changing CTE.ID IS NULL to IS NOT NULL.
I hope this will help... Good Luck
Upvotes: 2
Reputation: 115
You missed ID in the partition Order by. This produces what you want as you are always deleting the 2nd , 3rd .. duplicate. Anchor your query on the first instance as below.
SELECT
ROW_NUMBER()OVER(PARTITION BY STATUSIN ORDER BY ID, STATUSIN) AS RowNumber,
ID, NIP, Name,DEPARTMENT,STATUSIN,STATUSOUT FROM #DAILYDATAWH
I would also mention that you should probably strengthen how you are partitioning. What happens if you get multiple customer records with the exact same time stamp? EG is NIP + Name unique? Added name to the below example.
SELECT
ROW_NUMBER()OVER(PARTITION BY Name,STATUSIN ORDER BY Name, STATUSIN) AS RowNumber,
ID, NIP, Name,DEPARTMENT,STATUSIN,STATUSOUT FROM #DAILYDATAWH
Solution for your query
;WITH CTE AS
(
SELECT
ROW_NUMBER()OVER(PARTITION BY STATUSIN ORDER BY ID, STATUSIN) AS RowNumber
,
ID, NIP, Name,DEPARTMENT,STATUSIN,STATUSOUT FROM #DAILYDATAWH
), CTE2 AS
(
SELECT TOP (1000) *
FROM CTE
ORDER BY RowNumber DESC
)
DELETE FROM CTE2 WHERE RowNumber > 1
SELECT * FROM #DAILYDATAWH
Upvotes: 1
Reputation: 7240
Your code includes:
PARTITION BY STATUSIN ORDER BY STATUSIN
Having the same columns in the partition by and order by makes no sense. You say:
I want to delete duplicate rows in that table based on different STATUSIN
Good, you have defined what the ORDER BY
should be. This decides which row to keep among duplicates.
The PARTITION BY
part must includes the columns that define which columns are duplicate. A guess would be NIP,NAME
, but you have to decide for yourself. So try something like this:
ROW_NUMBER() OVER(PARTITION BY NIP,NAME ORDER BY STATUSIN) AS RowNumber
The rest of the code seems ok to me.
Upvotes: 0
Reputation: 752
select * from DAILYDATAWH order by NIP,NAME,StatusIn
Upvotes: 0