user12562215
user12562215

Reputation: 145

SQL - Remove all duplicates and retain just one

I have a table like below

Order_Id (PK), Customer_ID, Order_Amount
  A01             AA01         100
  A01             AA01         100 (duplicate)
  A02             AA01         200
  A03             AA02         300

I want to remove the duplicate and retain the remaining one

I'm able to select the duplicate using below

select *
from 
(
select order_id, customer_id, order_amount, row_number () over (partition by order_id order by order_Id) as rnk
from mytable
) where rnk > 1

But what's the next step to delete this record from mytable? Do I need to create a interim table?

Thanks!

Upvotes: 0

Views: 3244

Answers (3)

Naresh
Naresh

Reputation: 54

Please try this query below it should work as your requirements,

delete from CUSTOMER
where rowid not in
(select max(rowid) from CUSTOMER group by order_amount);

Upvotes: 1

Widada
Widada

Reputation: 595

It should be works.

SELECT *
FROM 
(
  SELECT 
     order_id, 
     customer_id, 
     order_amount, 
     row_number () OVER (partition by order_id order by order_Id) as rnk
  FROM mytable
  GROUP BY order_id
) WHERE rnk > 1

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269763

I don't think Redshift has a way to identify rows, if all the data is the same. So, I think your best bet is to recreate the table:

create table temp_mytable as
    select distinct *
    from mytable;

truncate table mytable;

insert into mytable
    select *
    from distinct mytable;

If your table really did have a primary key, there would be alternative ways of deleting rows.

Upvotes: 1

Related Questions