Reputation: 546
I have the following table:
ONBackup Table:
Contract FromDate Invoice Data
232 12/12/2017 123
232 14/02/2018 123
232 15/07/2018 123
232 14/02/2017 676
311 12/12/2017 881
There are lots of "duplicate" rows, a duplicate to me is where the Invoice numbers are the same, ie the other fields can be different.
The table has 1.4 million rows (and around a million duplicates), so not sure if the below would work or not as I am getting bored waiting for 3 hours and counting, it must be more CPU hungry than I though.
DELETE FROM ONBackup
WHERE Invoice NOT IN
(
SELECT MIN(Invoice)
FROM ONBackup
GROUP BY Invoice
)
Is there a quicker way of doing this that will work?
Upvotes: 0
Views: 57
Reputation: 50163
Use row_number()
function :
delete b
from (select b.*, row_number() over (partition by b.invoice order by b.fromdate desc) as seq
from ONBackup b
) b
where seq > 1;
This leaves latest fromdate
for each invoice
s.
Upvotes: 5
Reputation: 5893
DELETE A
FROM
(
select *,row_number() over (partition by invoice order by invoice)as rn from
table1
) A
WHERE A.rn > 1
Upvotes: 3
Reputation: 2475
I think a CTE is a good option here: (note that you must end the previous statement with a semi-colon).
WITH CTE AS
(
SELECT Invoice, ROW_NUMBER() OVER (PARTITION BY INVOICE ORDER BY SELECT '1') AS RowNumb
FROM ONBackup
)
DELETE FROM CTE WHERE RowNumb > 1
Upvotes: 4