nardowick
nardowick

Reputation: 111

Delete duplicate entries while keeping one

I have a table but it has no unique ID or primary key.

It has 3 columns in total.

name user_id role_id
ben 1 2
ben 1 2
sam 1 3

I'd like to remove one entry with the name Ben.

So output would look like this

name user_id role_id
ben 1 2
sam 1 3

Most of the examples shows deleting duplicate entries with ID or primary key. However how would I retain one entry whilest removing the other ones?

Using the following query I was able to get duplicated rows

SELECT name, user_id, role_id, count(*) FROM some_table
GROUP BY name, user_id, role_id
HAVING count(*) > 1

To clarify, I am looking to delete these rows.

Prefer not creating a new table.

Upvotes: 4

Views: 4144

Answers (3)

ysth
ysth

Reputation: 98508

Note that you are not saving anything by not having a primary key; mysql (at least with innodb) requires a primary key and will create a hidden one if you do not have one. So I would first add a primary key:

alter table some_table add id serial primary key;

Then you can easily remove duplicates with:

delete a from some_table a join some_table b on a.name=b.name and a.user_id=b.user_id and a.role_id=b.role_id and b.id < a.id;

Upvotes: 1

user1191247
user1191247

Reputation: 12973

If you don't have to worry about other users accessing the table -

CREATE TABLE `new_table` AS
SELECT DISTINCT `name`, `user_id`, `role_id`
FROM `old_table`;

RENAME TABLE
    `old_table` TO `backup`,
    `new_table` TO `old_table`;

Or you could use your duplicates query to output lots of single row delete queries -

SELECT
    `name`,
    `user_id`,
    `role_id`,
    COUNT(*),
    CONCAT('DELETE FROM some_table WHERE name=\'', `name`, '\' AND user_id=\'', `user_id`, '\' AND role_id=\'', `role_id`, '\' LIMIT 1;') AS `delete_stmt`
FROM `some_table`
GROUP BY `name`, `user_id`, `role_id`
HAVING COUNT(*) > 1;

Or you could temporarily add a SERIAL column and then remove it after the delete -

ALTER TABLE `some_table` ADD COLUMN `temp_id` SERIAL;

DELETE `t1`.* 
FROM `some_table` `t1`
LEFT JOIN (
    SELECT MIN(`temp_id`) `min_temp_id`
    FROM `some_table`
    GROUP BY `name`, `user_id`, `role_id`
) `t2` ON `t1`.`temp_id` = `t2`.`min_temp_id`
WHERE `t2`.`min_temp_id` IS NULL;

ALTER TABLE `some_table` DROP COLUMN `temp_id`;

Upvotes: 2

DetroitMike
DetroitMike

Reputation: 124

I would take the duplicate records and put them into another table.

SELECT  
   name, 
   user_id, 
   role_id
   INTO some_new_table
FROM some_table
GROUP BY name, user_id, role_id
HAVING count(*) > 1

Then you can delete those records from your source table

DELETE a
FROM some_table a
INNER JOIN some_new_table b
  ON a.name = b.name
    AND a.user_id = b.user_id
    AND a.role_id = b.role_id

Finally you can then insert the deduped records back into your table.

INSERT INTO some_table 
SELECT 
  name, 
  user_id, 
  role_id
FROM some_new_table 

If the volume of dupes is very large you could also just create a new table with the deduped data. Truncate \ Drop the old table and then Insert \ Rename from the new table.

Upvotes: 0

Related Questions