Reputation: 36640
I have a temp table created from a copy
from a CSV file and the result includes some duplicate ids. I need to delete any duplication. I have tried the following:
delete from my_table where id in
(select id from (select count(*) as count, id
from my_table group by id) as counts where count>1);
However this deletes both the duplicate records and I must keep one.
How can I delete only the 2nd record with a duplicated Id?
Thanks.
Upvotes: 2
Views: 4347
Reputation: 9211
Your query deletes all IDs that have a count greater than 1, so it removes everything that is duplicated. What you need to do is isolate one record from the list of duplicates and preserve that:
delete
from my_table
where id in (select id
from my_table
where some_field in (select some_field
from my_table
group by some_field
having count(id) > 1))
and id not in (select min(id)
from my_table
where some_field in (select some_field
from my_table
group by some_field
having count(id) > 1)
group by some_field);
EDIT Fixed :P
Upvotes: 2
Reputation: 26921
Assuming you don't have foreign key relations...
CREATE TABLE "temp"(*column definitions*);
insert into "temp" (*column definitions*)
select *column definitions*
from (
select *,row_number() over(PARTITION BY id) as rn from "yourtable"
) tm
where rn=1;
drop table "yourtable";
alter table "temp" rename to "yourtable";
Upvotes: 1