Avadhesh
Avadhesh

Reputation: 4703

Delete duplicate rows (don't delete all duplicate)

I am using postgres. I want to delete Duplicate rows. The condition is that , 1 copy from the set of duplicate rows would not be deleted.

i.e : if there are 5 duplicate records then 4 of them will be deleted.

Upvotes: 16

Views: 14988

Answers (5)

Xpie
Xpie

Reputation: 43

DELETE f1 from foo as f1, foo as f2 
       where f1.duplicate_column= f2.duplicate_column
             AND f1.id > f2.id;

Upvotes: 0

Sree Gottumukkala
Sree Gottumukkala

Reputation: 21

delete from table t1 
where rowid > (SELECT min(rowid) FROM table t2 group by 
               t2.id,t2.name );

Upvotes: 1

Denis Valeev
Denis Valeev

Reputation: 6015

Try the steps described in this article: Removing duplicates from a PostgreSQL database.

It describes a situation when you have to deal with huge amount of data which isn't possible to group by.

A simple solution would be this:

DELETE FROM foo
       WHERE id NOT IN (SELECT min(id) --or max(id)
                        FROM foo
                        GROUP BY hash)

Where hash is something that gets duplicated.

Upvotes: 23

baklarz2048
baklarz2048

Reputation: 10938

The fastest is is join to the same table. http://www.postgresql.org/docs/8.1/interactive/sql-delete.html

CREATE TABLE test(id INT,id2 INT);
CREATE TABLE
mapy=# INSERT INTO test VALUES(1,2);
INSERT 0 1
mapy=# INSERT INTO test VALUES(1,3);
INSERT 0 1
mapy=# INSERT INTO test VALUES(1,4);
INSERT 0 1

DELETE FROM test t1 USING test t2 WHERE t1.id=t2.id AND t1.id2<t2.id2;
DELETE 2
mapy=# SELECT * FROM test;
 id | id2 
----+-----
  1 |   4
(1 row)

Upvotes: 2

adopilot
adopilot

Reputation: 4500

delete from table
where not id in 
(select max(id) from table group by [duplicate row])

This is random (max Value) choice which row you need to keep. If you have aggre whit this please provide more details

Upvotes: 2

Related Questions