Reputation: 733
I have a very huge table with data collected from multiple systems. I now have to delete duplicate records based on conditions and multiple columns.
Here is an example:
+---------------+-------------+------+-----+--------
| System ID | Debt Num | Exp Dt | Account NO |
+---------------+-------------+------+-----+---------
| pay | 2222 | 0114 | 111 |
| pay | 2222 | 0214 | 111 |
| Online | 2222 | 0214 | 111 |
| Online | 3333 | 0115 | 222 |
| Online | 3333 | 0116 | 222 |
| ERP | 2222 | 0214 | 111 |
| ERP | 4444 | 0114 | 333 |
+---------------+-------------+------+-----+--------
From the above data , delete the duplicates with the following conditions satisfied.
System ID
. 1) pay 2) Online and 3) ERP. In the above for the account 111 we have records from all three systems and max(exp dt) of the debit card is 0214 of all three system. Only the record from Pay
with exp dt=0214 should be retained and rest should be deleted. I have tried multiple queries found online , like group by, row_number over but all fulfill just one condition.
Appreciate your help to contribute your thoughts and suggestions.
EDIT
:
Gordon's query works fine and fits my requirements, however when I ran the same on staging which contains 540K rows, it error out ORA-00600 Internal error.
Upvotes: 0
Views: 1191
Reputation: 1271051
I think you can do this with rowid
and a correlated subquery:
delete from payinfo_staging_db
where rowid <> (select rowid
from (select rowid
from payinfo_staging_db t2
where t2.debitNum = payinfo_staging_db.debitNum and
t2.accountNo = payinfo_staging_db.accountNo
order by t2.exp_dt,
(case when t2.SystemId = 'Pay' then 1
when t2.SystemId = 'Online' then 2
when t2.SystemId = 'ERP' then 3
end)
) r
where rownum = 1
);
EDIT:
There must be a problem with nested references in Oracle. The following works (at least in the sense that it parses and executes correctly):
delete from payinfo_staging_db
where rowid <> (select min(rowid) keep (dense_rank first order by exp_dt desc,
(case when t2.SystemId = 'Pay' then 1
when t2.SystemId = 'Online' then 2
when t2.SystemId = 'ERP' then 3
end)
) as therowid
from payinfo_staging_db t2
where t2.debitNum = payinfo_staging_db.debitNum and
t2.accountNo = payinfo_staging_db.accountNo
);
The SQL Fiddle is here.
Upvotes: 1