user3509426
user3509426

Reputation: 23

How to remove ALL duplicates in a database table and NOT KEEP any of them?

I have a database table with the columns: email, name, id

There are duplicate records with the same email and name (ids may or may not be different)

I want to remove all duplicates and not keep any of them.

Example

email            | name         | id
-------------------------------------------------
[email protected]          | a            | 2               --- remove
[email protected]          | a            | 4               --- remove
[email protected]          | b            | 3
[email protected]          | c            | 5

What sql query can I use in phpmyadmin to do this?

Upvotes: 1

Views: 105

Answers (3)

Gordon Linoff
Gordon Linoff

Reputation: 1269773

In MySQL, you should do this with a join:

delete t
   from example t join
        (select email, name
         from example
         group by email, name
         having count(*) > 1
        ) todelete
        on t.email = todelete.email and t.name = todelete.name;

Unfortunately, MySQL does not support simple subqueries on the table being modified in an update or delete statement.

EDIT:

Deleting a lot of records can be a performance bottleneck. To get around this, create a temporary table with the records you want, truncate the original table, and re-insert them.

So, try this:

create temporary table tempt as 
   select t.*
   from example t join
        (select email, name
         from example
         group by email, name
         having count(*) = 1
        ) tokeep
        on t.email = tokeep.email and t.name = tokeep.name;

truncate table example;

insert into example
    select * from tempt;

Try the select query first to be sure it has reasonable performance and does what you want.

Upvotes: 0

vignesh
vignesh

Reputation: 97

DELETE n1 FROM tablename n1, tablename n2 WHERE n1.email = n2.email

Upvotes: -1

Tim Schmelter
Tim Schmelter

Reputation: 460118

You could use EXISTS:

DELETE FROM TableName t1
WHERE EXISTS
(
    SELECT 1 FROM TableName t2
    WHERE t1.id <> t2.id
      AND COALESCE(t1.email,'') = COALESCE(t2.email,'')
      AND COALESCE(t1.name,'')  = COALESCE(t2.name,'')     
)

I've used COALESCE to also delete duplicates if the emails or names are null.

Upvotes: 2

Related Questions