Reputation: 1669
I want to remove duplicated rows in sql. My table looks like that:
CREATE TABLE test_table
(
id Serial,
Date Date,
Time Time,
Open double precision,
High double precision,
Low double precision
);
DELETE FROM test_table
WHERE ctid IN (SELECT min(ctid)
FROM test_table
GROUP BY id
HAVING count(*) > 1);
with the below delete
statement I am searching in the secret column ctid
for duplicated entries and delete them. However this does not work correctly. The query gets executed properly, but does not delete anything.
I appreciate your answer!
UPDATE
This is some sample data(without the generated id
):
2013.11.07,12:43,1.35162,1.35162,1.35143,1.35144
2013.11.07,12:43,1.35162,1.35162,1.35143,1.35144
2013.11.07,12:44,1.35144,1.35144,1.35141,1.35142
2013.11.07,12:45,1.35143,1.35152,1.35143,1.35151
2013.11.07,12:46,1.35151,1.35152,1.35149,1.35152
Upvotes: 1
Views: 465
Reputation: 477
Not sure if you can use row_number with partiontions in postgresql but if so you can do this to find duplicates, you can add or substract columns from the partion by to define what duplicates are in the set
WITH cte AS
(
SELECT id,ROW_NUMBER() OVER(PARTITION BY Date, Time ORDER BY date, time) AS rown
FROM test_table
)
delete From test_table
where id in (select id from cte where rown > 1);
Upvotes: 0
Reputation: 3085
Get out of the habit of using ctid
, xid
, etc. - they're not advertised for a reason.
One way of dealing with duplicate rows in one shot, depending on how recent your postgres version is:
with unique_rows
as
(
select distinct on (id) *
from test_table
),
delete_rows
as
(
delete
from test_table
)
insert into test_table
select *
from unique_rows
;
Or break everything up in three steps and use temp tables:
create temp table unique_rows
as
select distinct on (id) *
from test_table
;
create temp table delete_rows
as
delete
from test_table
;
insert into test_table
select *
from unique_rows
;
Upvotes: 2