Reputation: 904
I've a table Order
i my Oracle DB. Somehow the insert job ran twice and created duplicate records with only difference in timestamp:
ID | NAME | ITEM | DATE |
---|---|---|---|
1 | Ric | POTATO | 06-07-21 12:35:27.048000000 PM |
2 | Ric | POTATO | 06-07-21 12:35:27.048100000 PM |
3 | Ashley | KALE | 06-07-21 12:35:27.049000000 PM |
4 | Ashley | KALE | 06-07-21 12:35:27.049100000 PM |
5 | Adam | WATER | 06-07-21 12:36:27.050000000 PM |
5 | Adam | WATER | 06-07-21 12:36:27.050100000 PM |
Now I want to remove the most recent duplicate records.
How to create a query to delete the records where timestamp is latest for (Name, Item)
?
Upvotes: 0
Views: 847
Reputation: 21075
Provided that the column NAME
, ITEM
and "DATE"
(you'd better not use
reserved words) are unique keys you may use following query
DELETE tab
WHERE (NAME, ITEM, "DATE") IN
(
WITH del AS
(
SELECT ID,
NAME,
ITEM,
"DATE",
ROW_NUMBER() OVER ( PARTITION BY NAME, ITEM ORDER BY "DATE") AS rn
FROM tab
)
SELECT NAME,ITEM,"DATE"
FROM del
WHERE rn > 1
);
This guarantees that for each combination of NAME
, ITEM
only the row with the first "DATE" (rn = 1) preserves and all other will be deleted.
This query works even if there are keys with triplets or rows without duplication.
The unique key on NAME
, ITEM
and "DATE"
is important only to make the query deterministic.
Upvotes: 2
Reputation: 1269873
You can use:
delete from orders o
where o.date > (select min(o2.date)
from orders o2
where o2.name = o.name and o2.item = o.item
);
Or if you want to use not exists
, you can delete rows for which there is an earlier row:
delete from orders o
where exists (select 1
from orders o2
where o2.name = o.name and
o2.item = o.item and
o2.date < o.date
);
Only the "first" row doesn't have an earlier row.
Upvotes: 0