Reputation: 139
I'm using Oracle 10g and want to delete approximately 5 million records from the Persons table (15 million records total) that have no reference in the Orders table (5 million records total).
Since deleting 5 million records in one go would cause undo log issues, I've decided to delete in batches of 100k. I'm using the query below:
DELETE FROM Persons p
WHERE City = 'ABC'
AND NOT EXISTS
(SELECT O_Id
FROM Orders o
WHERE p.P_Id = o.P_Id)
AND ROWNUM <= 100000
Now the problem is that this query takes as long to execute for 100k records as it would for all 5 million, because full table scans and joins on both tables still happen for every batch.
Is there an efficient way to rewrite this query for faster execution? Or to replace the NOT EXISTS clause with a better join condition? Or a better way to limit the records to 100k per batch?
P.S. This is a one-time operation only and I can't use any DDL operations for this; however, PL/SQL is fine.
Upvotes: 5
Views: 43113
Reputation: 21
Solution 1: The approach below deletes 1000 records at a time and commits each batch of 1000 deleted records.
--This collection will hold 1000 ids to be deleted at a time
CREATE OR REPLACE TYPE arrayNameType IS VARRAY(1000) OF NUMBER;
/
DECLARE
   v_ids              arrayNameType;
   recordsToBeDeleted NUMBER;
   noOfRecordsDeleted NUMBER;
   countDeleted       NUMBER;

   CURSOR cursor1
   IS
      SELECT id_column FROM table_name WHERE <conditions>;
BEGIN
   noOfRecordsDeleted := 0;
   countDeleted := 0;
   --Taking the count of the total no of records that are planned to be deleted
   SELECT COUNT(*) INTO recordsToBeDeleted FROM table_name WHERE <conditions>;
   OPEN cursor1;
   LOOP
      FETCH cursor1 BULK COLLECT INTO v_ids LIMIT 1000;
      EXIT WHEN v_ids.COUNT = 0;
      DELETE FROM table_name
       WHERE id_column IN (SELECT column_value FROM TABLE(v_ids));
      countDeleted := SQL%ROWCOUNT;
      noOfRecordsDeleted := countDeleted + noOfRecordsDeleted;
      COMMIT;
   END LOOP;
   CLOSE cursor1;
END;
Solution 2: If 70% or more of the data has to be deleted, the approach below can be considered:
1) Create a temp table and copy into it the data that is not to be deleted.
2) Truncate the original table.
3) Insert the records from the temp table back into the original table.
4) Drop the temp table.
5) Indexes need to be taken care of.
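Applied to the tables in the question, the steps above might look like the following sketch (table and column names are taken from the question; `persons_keep` is a hypothetical temp table name; note the asker ruled out DDL, so this fits the general case only):

```
-- 1) keep only the rows that should survive
CREATE TABLE persons_keep AS
  SELECT p.*
    FROM Persons p
   WHERE p.City <> 'ABC'
      OR EXISTS (SELECT 1 FROM Orders o WHERE o.P_Id = p.P_Id);

-- 2) empty the original table (fast, generates almost no undo)
TRUNCATE TABLE Persons;

-- 3) copy the surviving rows back (direct-path insert)
INSERT /*+ APPEND */ INTO Persons SELECT * FROM persons_keep;
COMMIT;

-- 4) drop the temp table
DROP TABLE persons_keep;

-- 5) rebuild or recreate any indexes on Persons as needed
```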
Upvotes: 0
Reputation: 624
From my experience, the fastest way to delete a lot of rows is:
solution 1 (recommended by Tom Kyte)
`SET TRANSACTION USE ROLLBACK SEGMENT <your_seg>;
DELETE FROM <tab1> WHERE <cond>;
COMMIT;`
OR
solution 2
`create table new_table unrecoverable as select * from old_table where ....;
drop table old_table;
rename new_table to old_table;
create index old_table_idx1 on old_table(c1,c2) unrecoverable parallel 5;
`
I used the second solution in different contexts: it has always been the fastest way to delete a huge number of rows.
An alternative is to put the data to delete in a partition and then drop the partition (each partition has its own rollback segment, can use parallelism, ...).
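As a sketch of that alternative (hypothetical layout, since Persons is not partitioned in the question, and partitioning requires the Partitioning option): if the doomed rows could be isolated in their own partition, dropping it discards them without row-level undo:

```
-- Hypothetical: Persons list-partitioned on City
CREATE TABLE Persons_part (
  P_Id NUMBER,
  City VARCHAR2(30)
)
PARTITION BY LIST (City) (
  PARTITION p_abc    VALUES ('ABC'),
  PARTITION p_others VALUES (DEFAULT)
);

-- Dropping the partition removes its rows as a DDL operation
ALTER TABLE Persons_part DROP PARTITION p_abc;
```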
Upvotes: 7
Reputation: 5565
One more way to delete:
begin
  dbms_errlog.create_error_log('PERSONS');
end;
/
-- index on foreign key is a useful thing in many cases, not only now
create index idx_orders_pid on orders(p_id);
declare
  min_id number;
  max_id number;
begin
  select min(p_id), max(p_id)
    into min_id, max_id
    from persons;
  -- step through the id range in 100k-wide windows
  -- (a plain FOR loop would visit every single id; WHILE lets us advance by 100000)
  while min_id <= max_id loop
    delete from persons where p_id between min_id and min_id + 99999
    log errors into err$_persons reject limit unlimited;
    commit; -- commit each window to keep undo small
    min_id := min_id + 100000;
  end loop;
end;
/
drop table err$_persons;
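Before dropping the error log it can be worth checking what was rejected (for instance, deletes blocked by a foreign key from Orders). `ORA_ERR_MESG$` is one of the standard columns DBMS_ERRLOG puts in the error table:

```
-- Summarize which deletes were rejected and why
SELECT ora_err_mesg$, COUNT(*)
  FROM err$_persons
 GROUP BY ora_err_mesg$;
```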
Upvotes: -1
Reputation: 6336
DECLARE
   v_limit PLS_INTEGER := 100000;

   CURSOR person_deleted_cur
   IS
      SELECT rowid
        FROM Persons p
       WHERE City = 'ABC'
         AND NOT EXISTS
                (SELECT O_Id
                   FROM Orders o
                  WHERE p.P_Id = o.P_Id);

   -- collection of ROWIDs (a collection of records cannot have its
   -- fields referenced inside FORALL on 10g)
   TYPE person_deleted_nt IS TABLE OF ROWID
      INDEX BY PLS_INTEGER;

   person_deleted   person_deleted_nt;
BEGIN
   OPEN person_deleted_cur;

   LOOP
      FETCH person_deleted_cur
         BULK COLLECT INTO person_deleted LIMIT v_limit;

      FORALL indx IN 1 .. person_deleted.COUNT
         DELETE FROM Persons WHERE rowid = person_deleted (indx);

      COMMIT; -- commit each batch so undo stays small
      EXIT WHEN person_deleted_cur%NOTFOUND;
   END LOOP;

   CLOSE person_deleted_cur;
END;
/
Upvotes: 2
Reputation: 1269843
If you want this query to run faster, add the following two indexes:
create index idx_persons_city_pid on persons(city, p_id);
create index idx_orders_pid on orders(p_id);
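To confirm the optimizer actually uses them for the anti-join, the statement's plan can be inspected with DBMS_XPLAN (available in 10g); a quick check might look like:

```
EXPLAIN PLAN FOR
DELETE FROM Persons p
 WHERE City = 'ABC'
   AND NOT EXISTS (SELECT 1 FROM Orders o WHERE o.P_Id = p.P_Id)
   AND ROWNUM <= 100000;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```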
Upvotes: 5