Reputation: 481
I have to select a set of values by joining several large tables and then update another large table with the selected values. I currently follow the approach below, but I see a performance drawback. What alternative ways are there to do this kind of work?
DB server: Oracle EE
DECLARE
   CURSOR c1 IS
      SELECT abc, cdf
        FROM update_data
         FOR UPDATE OF abc;
BEGIN
   FOR update_data IN c1
   LOOP
      UPDATE klm
         SET klm.xyz = update_data.cdf
       WHERE klm.abc = update_data.abc;
   END LOOP;
   COMMIT;
END;
Upvotes: 0
Views: 283
Reputation: 10693
If you have performance issues with looping over each record, but the table is too big for a single update, you may consider updating in batches using BULK COLLECT INTO ... LIMIT and FORALL.
CREATE TABLE klm (abc INTEGER, xyz INTEGER);
CREATE TABLE update_data (abc INTEGER, cdf INTEGER);
-- Have pairs of numbers (1000 rows)
INSERT INTO klm SELECT rownum, rownum FROM dual CONNECT BY level <= 1000;
-- Update every second row with 9999
INSERT INTO update_data SELECT rownum * 2, 9999 FROM dual CONNECT BY level <= 500;
DECLARE
   CURSOR c1 IS
      -- Select the key to be updated and the new value
      SELECT abc, cdf FROM update_data;

   -- Table type and table variable to store rows fetched from the cursor
   TYPE t_update IS TABLE OF c1%ROWTYPE;
   update_tab t_update;
BEGIN
   OPEN c1;
   LOOP
      -- Fetch next 30 rows into update table
      FETCH c1 BULK COLLECT INTO update_tab LIMIT 30;

      -- Exit when there were no more rows fetched
      EXIT WHEN update_tab.COUNT = 0;

      -- This is the key point; uses update_tab to bulk-bind UPDATE statement
      -- and run it for 30 rows in a single context switch
      FORALL i IN 1 .. update_tab.COUNT
         UPDATE klm
            SET klm.xyz = update_tab(i).cdf
          WHERE klm.abc = update_tab(i).abc;

      COMMIT;
   END LOOP;
   CLOSE c1;
END;
/
The rationale behind this is that Oracle actually has separate engines for running SQL statements and PL/SQL programs. Whenever a PL/SQL block encounters a SQL statement, it hands it over to the SQL engine for execution. This hand-off is called a "context switch" and takes a significant amount of time, especially when done in a loop.
Bulk-binding aims to reduce this overhead by doing the context switch only once per [bulk size] records. Again, this is certainly not as effective as a single DML statement, but for large tables or complex queries it may be the best feasible solution.
I've used the above method to update tables with 100M-500M records with batch sizes of 10K-100K, and it worked fine. But you need to experiment with the batch size in your environment for best performance.
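If you want to experiment with the batch size, you could pull the LIMIT out into a constant and time each run with DBMS_UTILITY.GET_TIME (which returns hundredths of a second). Something like this sketch of the block above, where c_batch_size is just a placeholder name:
DECLARE
   c_batch_size CONSTANT PLS_INTEGER := 10000;  -- placeholder; tune this value

   CURSOR c1 IS
      SELECT abc, cdf FROM update_data;

   TYPE t_update IS TABLE OF c1%ROWTYPE;
   update_tab t_update;

   t_start PLS_INTEGER;
BEGIN
   t_start := DBMS_UTILITY.GET_TIME;

   OPEN c1;
   LOOP
      FETCH c1 BULK COLLECT INTO update_tab LIMIT c_batch_size;
      EXIT WHEN update_tab.COUNT = 0;

      FORALL i IN 1 .. update_tab.COUNT
         UPDATE klm
            SET klm.xyz = update_tab(i).cdf
          WHERE klm.abc = update_tab(i).abc;

      COMMIT;
   END LOOP;
   CLOSE c1;

   -- GET_TIME is in hundredths of a second, so divide by 100 for seconds
   DBMS_OUTPUT.PUT_LINE('Elapsed: ' || (DBMS_UTILITY.GET_TIME - t_start) / 100 || ' s');
END;
/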
Upvotes: 0
Reputation: 3499
Most likely a simple update will perform much better.
You can try:
update klm t1
   set xyz = ( select cdf from update_data t2 where t2.abc = t1.abc )
 where exists ( select 1 from update_data t2 where t2.abc = t1.abc );
commit;
or, if possible (there is a PK or unique index on update_data.abc):
update ( select t1.xyz, t2.cdf
           from klm t1, update_data t2
          where t1.abc = t2.abc )
   set xyz = cdf;
commit;
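Note that the second form only works if the join is key-preserved, i.e. each klm row matches at most one update_data row; otherwise Oracle raises ORA-01779 ("cannot modify a column which maps to a non key-preserved table"). If the constraint is not already in place, something like this sketch (with a made-up constraint name) adds it:
-- constraint name is just an example; any unique constraint/index on abc works
alter table update_data add constraint update_data_abc_pk primary key (abc);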
Upvotes: 1