Kempe
Kempe

Reputation: 23

Removing duplicate rows based on one column same values but keep one record

SQL Server Version

Remove all dupe rows (row 3 thru 18) with service_date = '2018-08-29 13:05:00.000' but keep the oldest row (row 2) and of course keep row 1 since its different service_date. Don't mind the create_timestamp or document_file since it's the same customer. Any idea?

Click here for example

Upvotes: 2

Views: 106

Answers (3)

Sahil Anand
Sahil Anand

Reputation: 139

You don't need to use Partition function.please use the below query for efficient performance.i have tested its working fine.

with result as
        (
        select *, row_number() over(order by create_timestamp) as Row_To_Delete from TableName
        )
        delete from result where result.Row_To_Delete>2

Upvotes: 1

Eralper
Eralper

Reputation: 6612

I think you will want to remove these data per customer basis

I mean, if customers are different you will want to keep the entries even on the same date

If you you will require the addition of Customer column in partition by clause used to identify duplicate rows in SQL

By copying and modifying Tim's solution, you can check following

;WITH cte AS (
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY customer, service_date ORDER BY create_timestamp) rn
    FROM yourTable
)    
DELETE
FROM cte
WHERE rn > 1;

Upvotes: 0

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521239

In SQL Server, we can try deleting using a CTE:

WITH cte AS (
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY service_date ORDER BY create_timestamp) rn
    FROM yourTable
)

DELETE
FROM cte
WHERE rn > 1;

The strategy here is to assign a row number to each group of records sharing the same service_date, with 1 being assigned to the oldest record in that group. Then, we can phrase the delete by just targeting all records which have a row number greater than 1.

Upvotes: 7

Related Questions