Reputation: 793
I have a data table in Oracle 8.1 with about a million rows. Many of those rows are duplicates on the same columns. I need to know the fastest way to clean up this data. For example, I have:
id name surname date
21 'john' 'smith' '2012 12 12';
21 'john' 'smith' '2012 12 13';
21 'john' 'smith' '2012 12 14';
....
Now I need to delete the first two rows, since they are duplicates on the first three columns, and keep only the row with the latest date.
Upvotes: 0
Views: 1105
Reputation: 8361
If there really are lots of duplicates, I'd recommend recreating the table with only the clean data:
CREATE TABLE tmp_table AS
SELECT id, name, surname, max(d) AS d
FROM your_table
GROUP BY id, name, surname;
and then replacing the original table with the new table:
RENAME your_table TO old_table;
RENAME tmp_table TO your_table;
Don't forget to move indexes, constraints and privileges...
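As a rough sketch of that last step: the index name `your_table_key_ix` and the grantee `app_user` below are hypothetical placeholders, and the real list depends on what the original table had. Note that existing indexes stay attached to the renamed `old_table` and keep their names, so a recreated index needs a different name until the old table is dropped.

```sql
-- Hypothetical index name and grantee; adjust to match the original table.
CREATE INDEX your_table_key_ix ON your_table (id, name, surname);
GRANT SELECT, INSERT, UPDATE, DELETE ON your_table TO app_user;

-- Only after the new table has been verified:
DROP TABLE old_table;
```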
Upvotes: 2
Reputation: 30765
If possible, I'd go for a CTAS (create table as select), truncate the original table, and copy the data back:
-- create the temp table (it contains only the latest row for a given (id, name, surname) triple)
CREATE TABLE tmp as
SELECT id, name, surname, date1 from
(select
t1.*,
row_number() over (partition by id, name, surname order by date1 desc) rn
from mytab t1)
where rn = 1;
-- clear the original table
TRUNCATE TABLE mytab;
-- copy the data back
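-- the hint is only recognized when the plus sign directly follows the comment opener
INSERT /*+ APPEND */ INTO mytab(id,name,surname,date1)
(SELECT id,name,surname,date1 from tmp);
-- a direct-path insert must be committed before the session can query the table again
COMMIT;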
Upvotes: 1
Reputation: 4231
-- mytab and date1 stand in for the real names (TABLE and DATE are reserved words in Oracle)
delete from mytab t where
exists (select 1 from mytab t2 where t2.id=t.id and t2.name=t.name and t2.surname=t.surname
and t2.date1 > t.date1);
How fast this is depends on your Oracle parameters. An index on (id,name,surname) might help.
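A minimal sketch of such an index, assuming the table is called `mytab` as in the previous answer; the index name `mytab_dedup_ix` is made up for illustration. Whether the optimizer actually uses it for the correlated subquery depends on your statistics and settings, so it is worth checking the execution plan.

```sql
-- Hypothetical index name; supports the lookup on (id, name, surname) in the EXISTS subquery.
CREATE INDEX mytab_dedup_ix ON mytab (id, name, surname);

-- If the index was only needed for the cleanup, it can be dropped afterwards.
DROP INDEX mytab_dedup_ix;
```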
Upvotes: 1