Reputation: 86
I want to insert a number of rows into a child table based on the start and end dates in a parent table. I tried the following query, but it has been running for an hour.
INSERT INTO child_table
WITH num AS (
  SELECT level AS rnk
  FROM dual
  CONNECT BY level <= 300
)
SELECT data, start_date, end_date, rnk
FROM parent_table
JOIN num
  ON num.rnk <= end_date - start_date;
The parent table has more than a million rows.
Upvotes: 1
Views: 117
Reputation: 36922
With a CARDINALITY hint, direct-path inserts, parallelism, and removing the common table expression, a generic version of your problem can be improved from 6 minutes to 22 seconds. There may be some other factor involved that explains why your version runs in 60 minutes and mine runs in 6 minutes. If that's still an issue after making these changes, use SQL monitoring to find out which operation and wait event is the problem.
Sample Schema
--drop table child_table;
--drop table parent_table;
create table parent_table(data varchar2(100), start_date date, end_date date);
create table child_table(data varchar2(100), start_date date, end_date date, rank number);

--Insert 1 million rows, gather statistics.
begin
    for i in 1 .. 10 loop
        insert into parent_table
        select level, sysdate, sysdate + 50
        from dual
        connect by level <= 100000;
    end loop;
end;
/

begin
    dbms_stats.gather_table_stats(user, 'parent_table');
end;
/
Original version - runs in 6-10 minutes on my old desktop
insert into child_table
with num as
(
select level as rnk
from dual
connect by level<=300
)
select data, start_date, end_date, rnk
from parent_table
join num
on num.rnk <= end_date-start_date;
commit;
truncate table child_table;
New version - runs in 22 seconds
insert /*+ append parallel */ into child_table
select data, start_date, end_date, rnk
from parent_table
join
(
select /*+ cardinality(300) */ level as rnk
from dual
connect by level<=300
) num
on num.rnk <= end_date-start_date;
Description of Changes
The CARDINALITY hint informs the optimizer that the inline view returns 300 rows, whereas the original estimate is 1 row. The improved cardinality estimate changes the plan from a NESTED LOOP to a MERGE JOIN.
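If you want to see the effect yourself, a quick sketch (assuming you can run EXPLAIN PLAN against these tables) is to display the plan for the hinted query and compare it with the unhinted one; the Rows column shows the estimate for the CONNECT BY inline view and the plan shows the join method.
-- Display the plan for the hinted query; rerun without the hint to compare
-- the join method and the Rows estimate for the inline view.
explain plan for
select data, start_date, end_date, rnk
from parent_table
join ( select /*+ cardinality(300) */ level as rnk from dual connect by level <= 300 ) num
on num.rnk <= end_date - start_date;

select * from table(dbms_xplan.display);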
Direct-path inserts avoid most REDO generation, although with the simple sample table the INSERT part of the statement is not the main cost. It's possible that your real-world example spends a huge amount of time in the INSERT if there are many indexes or foreign keys that need to be validated. Note that without REDO the table changes are not automatically backed up.
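As a rough sketch (assuming child_table can simply be rebuilt from parent_table, so losing recoverability for the load is acceptable), the redo cost can be reduced further by marking the table NOLOGGING around the direct-path load:
-- Assumption: child_table is disposable and can be reloaded if the database is restored.
alter table child_table nologging;

-- Direct-path insert; with NOLOGGING, minimal redo is generated for the table data.
insert /*+ append */ into child_table
select data, start_date, end_date, rnk
from parent_table
join ( select /*+ cardinality(300) */ level as rnk from dual connect by level <= 300 ) num
on num.rnk <= end_date - start_date;

-- A direct-path insert must be committed before the same session can query the table again.
commit;

-- Restore logging (and take a backup) once the load is finished.
alter table child_table logging;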
Parallelism makes use of multiple CPU and I/O resources and can be a real game-changer for performance. It does put more strain on a system and can be very "unfair" to other processes, and it requires Enterprise Edition, a sane configuration, etc.
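If the parallel hint doesn't seem to change anything, one thing worth checking (a sketch, assuming Enterprise Edition and default parallel settings) is whether parallel DML is enabled for the session; the hint alone only parallelizes the query side of the statement, not the insert itself.
-- Parallel DML is disabled by default and must be enabled per session.
alter session enable parallel dml;

insert /*+ append parallel */ into child_table
select data, start_date, end_date, rnk
from parent_table
join ( select /*+ cardinality(300) */ level as rnk from dual connect by level <= 300 ) num
on num.rnk <= end_date - start_date;

commit;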
Common table expressions (CTEs) are meant for query blocks that are referenced more than once. CTEs sometimes cause performance issues, but not in this case; it's more a matter of style. Inline views are easier to debug than CTEs, but that's difficult to explain here.
Finally, if none of the above works, you need to increase the granularity of your profiling. Most tuning methods only identify which SQL statements are slow and then guess at which operation within the statement is the problem. There's no need to guess: Real-Time SQL Monitoring will tell you exactly how long each operation takes and what it's waiting for. Find the SQL_ID and run a statement like this:
select dbms_sqltune.report_sql_monitor(sql_id => '13gdzd4w5fx4y', type => 'text') from dual;
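For example, the SQL_ID of the running insert can usually be found in V$SQL (a sketch; the LIKE filter is only an illustration and should be adjusted to match your actual statement text):
-- Find the SQL_ID and elapsed time of the slow insert statement.
select sql_id, elapsed_time / 1e6 as elapsed_seconds, sql_text
from v$sql
where lower(sql_text) like 'insert into child_table%';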
Upvotes: 1