Raul
Raul

Reputation: 37

Delete duplicate rows but keep the earliest row (by date criteria) - SQL Server

I have this table and three columns (ID, Email, [Last Update].

I want to delete duplicate emails but keep the last updated row (by date criteria). In this example, aaa is duplicated. It has 1 row in 2011 and other in 2014. I want to keep just 2014.

ID       Email          Last update         
a-4        aaa           10/01/2011 
b-1        bbb           10/02/2012    
k-1        ccc           05/03/2013    
d-9        aaa           10/08/2014
t-7        bbb           02/09/2015

Upvotes: 3

Views: 6416

Answers (4)

Florent B.
Florent B.

Reputation: 42518

To delete the duplicated records for the email field:

DELETE a FROM MyTable a INNER JOIN (
  SELECT Email, MAX([Last Update]) [Last Update]
  FROM MyTable
  GROUP BY Email
) b ON a.Email = b.Email AND a.[Last Update] <> b.[Last Update];

Upvotes: 1

Giorgos Betsos
Giorgos Betsos

Reputation: 72165

In SQL Server you can use CTE to perform the DELETE:

;WITH ToDelete AS (
   SELECT ROW_NUMBER() OVER (PARTITION BY Email         
                             ORDER BY [Last update] DESC) AS rn
   FROM mytable
)
DELETE FROM ToDelete
WHERE rn > 1

Upvotes: 4

Alexander Bell
Alexander Bell

Reputation: 7918

Use the following SQL statement:

SELECT First(ID) AS Id, First(Email) AS Email, Max([Last update]) AS LastUpd
FROM YourTable
GROUP BY Email
ORDER BY Max([Last update]) DESC;

and MAKE TABLE upon necessity.

Upvotes: 1

user330315
user330315

Reputation:

You didn't tell us which DBMS you are using, but the following is ANSI SQL and should work on all (modern) DBMS:

delete from the_table
where exists  (select id
               from the_table t2
               where t2.email = the_table.email
                 and t2.id <> the_table.id
                 and t2.last_update > the_table.last_update);

SQLFiddle: http://sqlfiddle.com/#!15/ca442/1

Upvotes: 2

Related Questions