Reputation: 37
I have this table and three columns (ID, Email, [Last Update].
I want to delete duplicate emails but keep the last updated row (by date criteria). In this example, aaa
is duplicated. It has 1 row in 2011 and other in 2014. I want to keep just 2014.
ID Email Last update
a-4 aaa 10/01/2011
b-1 bbb 10/02/2012
k-1 ccc 05/03/2013
d-9 aaa 10/08/2014
t-7 bbb 02/09/2015
Upvotes: 3
Views: 6416
Reputation: 42518
To delete the duplicated records for the email field:
DELETE a FROM MyTable a INNER JOIN (
SELECT Email, MAX([Last Update]) [Last Update]
FROM MyTable
GROUP BY Email
) b ON a.Email = b.Email AND a.[Last Update] <> b.[Last Update];
Upvotes: 1
Reputation: 72165
In SQL Server you can use CTE
to perform the DELETE
:
;WITH ToDelete AS (
SELECT ROW_NUMBER() OVER (PARTITION BY Email
ORDER BY [Last update] DESC) AS rn
FROM mytable
)
DELETE FROM ToDelete
WHERE rn > 1
Upvotes: 4
Reputation: 7918
Use the following SQL statement:
SELECT First(ID) AS Id, First(Email) AS Email, Max([Last update]) AS LastUpd
FROM YourTable
GROUP BY Email
ORDER BY Max([Last update]) DESC;
and MAKE TABLE
upon necessity.
Upvotes: 1
Reputation:
You didn't tell us which DBMS you are using, but the following is ANSI SQL and should work on all (modern) DBMS:
delete from the_table
where exists (select id
from the_table t2
where t2.email = the_table.email
and t2.id <> the_table.id
and t2.last_update > the_table.last_update);
SQLFiddle: http://sqlfiddle.com/#!15/ca442/1
Upvotes: 2