Reputation: 188
I have X million records in a table TABLE_A
and want to process these records one by one.
How can I divide the population equally among 10 instances of same PL/SQL scripts to process in parallel?
See below query
SELECT CBR.CUSTOMER_ID, CBR.NAME, CBR.DEPT_NAME
FROM
(
SELECT CUSTOMER_ID, NAME, HOME_TELNO, DEPT_NAME, ROWNUM AS RNUM
FROM TABLE_A ORDER BY CUSTOMER_ID ASC
) CBR
WHERE CBR.RNUM < :sqli_end_rownum AND CBR.RNUM >= :sqli_start_rownum ;
Values will be incremented in each iteration of loop. In next iteration sqli_start_rownum
will become sqli_end_rownum
.
This query is taking much time. Does someone has better way to do it
Upvotes: 0
Views: 194
Reputation: 3445
You could look into DBMS_PARALLEL_EXECUTE: http://docs.oracle.com/cd/E11882_01/appdev.112/e40758/d_parallel_ex.htm#ARPLS67331
For example: https://oracle-base.com/articles/11g/dbms_parallel_execute_11gR2
The poor man's version of this is basically to run a query to generate ranges of rowids. You can then access the rows in the table within a given range.
Step1: create the number of "buckets" you want to divide the table into and get a range of rowids for each bucket. Here's an 8-bucket example:
select bucket_num, min(rid) as start_rowid, max(rid) as end_rowid, count(*)
from (select rowid rid
, ntile(8) over (order by rowid) as bucket_num
from table_a
)
group by bucket_num
order by bucket_num;
You'd get an output that looks like this (I'm using 12c - rowids may look different in 11g):
BUCKET_NUM START_ROWID END_ROWID COUNT(*)
1 AABetTAAIAAB8GCAAA AABetTAAIAAB8u5AAl 82792
2 AABetTAAIAAB8u5AAm AABetTAAIAAB9RrABi 82792
3 AABetTAAIAAB9RrABj AABetTAAIAAB96vAAU 82792
4 AABetTAAIAAB96vAAV AABetTAAIAAB+gKAAs 82792
5 AABetTAAIAAB+gKAAt AABetTAAIAAB+/vABv 82792
6 AABetTAAIAAB+/vABw AABetTAAIAAB/hbAB1 82791
7 AABetTAAIAAB/hbAB2 AABetTAAIAACARDABf 82791
8 AABetTAAIAACARDABg AABetTAAIAACBGnABq 82791
(The sum of the counts will be the total number of rows in the table at the time of the query.)
Step2: can grab a set of rows from the table for a given range:
SELECT <whatever you need>
FROM <table>
WHERE rowid BETWEEN 'AABetTAAIAAB8GCAAA' and 'AABetTAAIAAB8u5AAl'
...
Step3: repeat step2 for the given ranges.
so instead of this:
SELECT CBR.CUSTOMER_ID, CBR.NAME, CBR.DEPT_NAME
FROM
(
SELECT CUSTOMER_ID, NAME, HOME_TELNO, DEPT_NAME, ROWNUM AS RNUM
FROM TABLE_A ORDER BY CUSTOMER_ID ASC
) CBR
WHERE CBR.RNUM < :sqli_end_rownum AND CBR.RNUM >= :sqli_start_rownum ;
you'll just have this:
SELECT CBR.CUSTOMER_ID, CBR.NAME, CBR.DEPT_NAME
FROM table_a
WHERE rowid BETWEEN :start_rowid and :end_rowid
You can use this to run the same job in parallel but you'll need a separate session for each run (e.g. multiple SQL Plus sessions. You can also use something like DBMS_JOBS/DBMS_SCHEDULER to launch background jobs.
(Note: always be aware if your table is being updated between the time the buckets are calculated and the time you access the tables as you can miss rows.)
Upvotes: 3