Mayank Tripathi
Mayank Tripathi

Reputation: 523

Deleting billions of records from Oracle Table containing trillions of records

I need to delete approx. of 30 billion of rows from an Oracle table that has about 100 + billion of rows. I have all the IDs of the rows I need to delete in a temporary table. Currently I am using single delete statement as below, also using SUBPARTITION and created Index on the temp_table. However this took 4+ hrs to complete in PRODUCTION.

DELETE FROM table_name SUBPARTITION(subpartition_name) WHERE id IN (SELECT id FROM temp_table);
COMMIT;

Is there a way I can optimize this to run bit fast.

Just for a note :

  1. The oracle table I am referring is common for multiple clients, so the below option is not suitable here. Creating new table and move the required data into it and drop the old table followed by renaming the new table to old table.
  2. Delete in Batch : Looping over the temp table and deleting something like below, is taking more time in non-prod environment, and not sure how it goes in production env.
DECLARE
  vCT NUMBER(38) := 0;

BEGIN
  FOR t IN (SELECT id FROM temp_table) LOOP
    DELETE FROM table_name WHERE id = t.id;
    ......
    ......
      COMMIT;
    END IF;
  END LOOP;
  COMMIT;
END;
  1. Option to create individual DELETE statement is also not feasible here as the record count is in Billions.

I did checked, there are partitions and sub-partitions on the table which I am utilizing it, and there is no child table dependent on it.

Please suggest.

Upvotes: 3

Views: 727

Answers (1)

Marmite Bomber
Marmite Bomber

Reputation: 21073

Delete (or update) large number of rows is pain and take lot of time.

The most effective way to manage it is to use the updatable join views

What is requested, but should be no problem is a unique index on your delta table containing the id to be deleted.

Also while using this approach be sure, that the large table has no unique index on the id column. There are some technical issues if both tables have unique index on the join column - see below.

Than use following query (if you want to use sub-partition wise) delete

delete from 
(
select delta.id, big.id big_id 
from delta 
join big subpartition (SYS_SUBP220880)
on delta.id = big.id  
)

Oracle uses the hash join of both tables which is in your case the only possibility to manage your scale. You may dedploy parallel option, do not forget to enable it.

This is the expected execution plan:

-------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT              |       |   100K|  1757K|   875   (2)| 00:00:01 |       |       |
|   1 |  DELETE                       | BIG   |       |       |            |          |       |       |
|*  2 |   HASH JOIN                   |       |   100K|  1757K|   875   (2)| 00:00:01 |       |       |
|   3 |    TABLE ACCESS FULL          | DELTA |   100K|   488K|    47   (3)| 00:00:01 |       |       |
|   4 |    PARTITION COMBINED ITERATOR|       |   783K|  9947K|   825   (1)| 00:00:01 |   KEY |   KEY |
|   5 |     TABLE ACCESS FULL         | BIG   |   783K|  9947K|   825   (1)| 00:00:01 |    65 |    65 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("DELTA"."ID"="BIG"."ID")

Note - in case that more than one joined tables are key preserved, see the documentation Note on Updatable Views

For a DELETE statement, if the join results in more than one key-preserved table, then Oracle Database deletes from the first table named in the FROM clause, whether or not the view was created WITH CHECK OPTION.

Upvotes: 3

Related Questions