Reputation: 145
I have a table like below
Order_Id (PK), Customer_ID, Order_Amount
A01 AA01 100
A01 AA01 100 (duplicate)
A02 AA01 200
A03 AA02 300
I want to remove the duplicate and retain the remaining one
I'm able to select the duplicate using below
select *
from
(
select order_id, customer_id, order_amount, row_number () over (partition by order_id order by order_Id) as rnk
from mytable
) where rnk > 1
But what's the next step to delete this record from mytable? Do I need to create a interim table?
Thanks!
Upvotes: 0
Views: 3244
Reputation: 54
Please try this query below it should work as your requirements,
delete from CUSTOMER
where rowid not in
(select max(rowid) from CUSTOMER group by order_amount);
Upvotes: 1
Reputation: 595
It should be works.
SELECT *
FROM
(
SELECT
order_id,
customer_id,
order_amount,
row_number () OVER (partition by order_id order by order_Id) as rnk
FROM mytable
GROUP BY order_id
) WHERE rnk > 1
Upvotes: 1
Reputation: 1269763
I don't think Redshift has a way to identify rows, if all the data is the same. So, I think your best bet is to recreate the table:
create table temp_mytable as
select distinct *
from mytable;
truncate table mytable;
insert into mytable
select *
from distinct mytable;
If your table really did have a primary key, there would be alternative ways of deleting rows.
Upvotes: 1