Reputation: 159
We need to write a SQL query to delete all duplicate email entries in a table named Person
, keeping only unique emails based on its smallest Id.
+----+------------------+
| Id | Email |
+----+------------------+
| 1 | [email protected] |
| 2 | [email protected] |
| 3 | [email protected] |
+----+------------------+
Id
is the primary key column for this table.
For example, after running the query, the above Person
table should have the following rows:
+----+------------------+
| Id | Email |
+----+------------------+
| 1 | [email protected] |
| 2 | [email protected] |
+----+------------------+
Note: the output is the whole Person table after executing the SQL. Need to use delete statement.
And here is my answer:
delete Person
from Person p
left join
(select Id, Email
from
(select
Id, Email,
ROW_NUMBER() OVER(PARTITION BY Email ORDER BY Id) AS row_num
from Person) as temp
where row_num = 1) as temptwo on p.Id = temptwo.Id
where
temptwo.Id is null
But after I run my query, it keeps telling me that there is a syntax error in this line:
ROW_NUMBER() OVER(PARTITION BY Email ORDER BY Id) AS row_num
Can someone tell me what's wrong?
Upvotes: 0
Views: 1976
Reputation: 1330
Try This:
SELECT Main.ID,Main.EmaIl_ID FROM @Tab Main
INNER JOIN (
SELECT MIN(ID) AS MIN_ID,Email_ID From @Tab
GROUP BY Email_ID) SUB ON Main.ID=SUB.MIN_ID
ORDER BY Main.ID
Hope this Code works fine for your case.
Upvotes: 0
Reputation: 11
Try this in MySQL:
WITH cte AS (
SELECT p.*, ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS seq
FROM Person p)
DELETE FROM Person WHERE id IN (
SELECT id FROM cte
WHERE seq > 1);
Upvotes: 0
Reputation: 1269703
You can just use:
with todelete as (
select p.*,
ROW_NUMBER() OVER (PARTITION BY Email ORDER BY Id) AS seqnum
from Person p
)
delete from todelete
where seqnum > 1;
That is, SQL Server supports updatable CTEs (and subqueries and views), so you don't need to calculate and join back to the table.
EDIT:
In MySQL, you would use:
delete p
from p join
(select email, min(id) as min_id
from person pp
group by email
) pp
on p.email = pp.email and p.id > p.min_id;
Upvotes: 6