Reputation: 21
I am executing the below MERGE statement for Insert Update operation.
It is working fine for 1 to 2 million records but for more than 4 to 5 billion records it takes 6 to 7 hours to complete.
Can anyone suggest some alternative or performance tips for Merge Statement
merge into employee_payment ep
using (
select
p.pay_id vista_payroll_id,
p.pay_date pay_dte,
c.client_id client_id,
c.company_id company_id,
case p.uni_ni when 0 then null else u.unit_id end unit_id,
p.pad_seq pay_dist_seq_nbr,
ph.payroll_header_id payroll_header_id,
p.pad_id vista_paydist_id,
p.pad_beg_payperiod pay_prd_beg_dt,
p.pad_end_payperiod pay_prd_end_d
from
stg_paydist p
inner join company c on c.vista_company_id = p.emp_ni
inner join payroll_header ph on ph.vista_payroll_id = p.pay_id
left outer join unit u on u.vista_unit_id = p.uni_ni
where ph.deleted = '0'
) ps
on (ps.vista_paydist_id = ep.vista_paydist_id)
when matched then
update
set ep.vista_payroll_id = ps.vista_payroll_id,
ep.pay_dte = ps.pay_dte,
ep.client_id = ps.client_id,
ep.company_id = ps.company_id,
ep.unit_id = ps.unit_id,
ep.pay_dist_seq_nbr = ps.pay_dist_seq_nbr,
ep.payroll_header_id = ps.payroll_header_id
when not matched then
insert (
ep.employee_payment_id,
ep.vista_payroll_id,
ep.pay_dte,
ep.client_id,
ep.company_id,
ep.unit_id,
ep.pay_dist_seq_nbr,
ep.payroll_header_id,
ep.vista_paydist_id
) values (
seq_employee_payments.nextval,
ps.vista_payroll_id,
ps.pay_dte,
ps.client_id,
ps.company_id,
ps.unit_id,
ps.pay_dist_seq_nbr,
ps.payroll_header_id,
ps.vista_paydist_id
) log errors into errorlog (v_batch || 'EMPLOYEE_PAYMENT') reject limit unlimited;
Upvotes: 0
Views: 16424
Reputation: 109
I was working on a similar problem and a good solution i found was to break the query up. The primary reason big table merges are a bad idea is because of the in memory storage of the result of the using query. Because the PGA gets filled up pretty quickly so the database starts using the temporary table space of sort operations and joins. The temp tablespace being on disk is excruciatingly slow. The use of excessive temp table space can be easily avoided by splitting the query into two queries. So the below query
merger into emp e
using (
select a,b,c,d from (/* big query here */)
) ec
on /*conditions*/
when matched then
/* rest of merge logic */
can become
create table temp_big_query as select a,b,c,d from (/* big query here */);
merger into emp e
using (
select a,b,c,d from temp_big_query
) ec
on /*conditions*/
when matched then
/* rest of merge logic */
if the using query also has CTEs and sub queries try breaking that query up to use more temp tables like the one shown above. Also avoid using parallel hints because they mostly tend to slow the query down unless the query itself has something that can be done in parallel, try using indexes instead instead as much as possible parallel should be used as the last option for optimization. I know some references are missing please feel free to comment and add references or point out mistakes in my answer.
Upvotes: 0
Reputation: 41
Try using the Oracle hints:
MERGE /*+ append leading(PS) use_nl(PS EP) parallel (12) */
Try to using hints to optimize inner using query.
Upvotes: 3
Reputation: 12843
Processing lots of data takes lots of time... Here are some things that may help you (assuming there is not a probolem with bad execution plan):
Adding a where-clause in the UPDATE-part to only update records when the values are actually different. If you are merging the same data over and over again and only a smaller subset of the data is actually modified, this will improve performance.
If you indeed are processing the same data over and over again, investigate whether you can add some modification flag/date to only process new records since last time.
Depending on the kind of environment and when/who is updating your source tables, investigate whether a truncate-insert approach is beneficial. Remember to set the indexes unusuable on before hand.
I think your best bet here is to exploit the patterns in your data. This is something oracle does not know about, so you may have to get creative.
Upvotes: 1