Reputation: 86
I have a dimension which has around 11.000.000 records, 4 tables are joined to each other in ETL process to fill my dimension(dimtst).
insert into dimtst
select .... from tdpst left outer join fhist
left outer join dp2act
left outer join dp2cust
above query lasts alot(tdpst
has 11 records and above query lasts around 15 minutes), Hence I created a temp table which joined
tdpst
and fhist
and stored the result in tmpDpsInf
(another temporary table which I created).
tmpDpsInf
will join with C
and D
in another query which has adequate execution time.create table TMPFHIST
(
abrnchcod NUMBER(4) not null,
tbdptype NUMBER(3) not null,
cfcifno NUMBER(8) not null,
tdserial NUMBER(3) not null,
aistate NUMBER(1)
)
--------------
create table TMPDPSINF
(
abrnchcod NUMBER(4) not null,
tbdptype NUMBER(3) not null,
cfcifno NUMBER(8) not null,
tdserial NUMBER(3) not null,
ausrcode NUMBER(4) not null,
tdtitle VARCHAR2(82),
tdopndat DATE,
tdrnwdat DATE,
tdclsdat DATE,
acurrcode CHAR(3) not null,
abu_abrnchcod NUMBER(4) not null,
aistate NUMBER(1)
)
create table tdpst
(
abrnchcod NUMBER(4) not null,
tbdptype NUMBER(3) not null,
cfcifno NUMBER(8) not null,
tdserial NUMBER(3) not null,
ausrcode NUMBER(4) not null,
tdtitle VARCHAR2(82),
tdopndat DATE,
tdrnwdat DATE,
tdclsdat DATE,
acurrcode CHAR(3) not null,
abu_abrnchcod NUMBER(4) not null
)
--------
tmpDpsInf fills with below query:
insert into tmpDpsInf
select /*+parallel(12)*/
d.ABRNCHCOD,
d.TBDPTYPE,
d.CFCIFNO,
d.TDSERIAL,
d.AUSRCODE AUSRCODE,
d.tdtitle,
trunc(d.tdopndat) TDOPNDAT,
nvl(d.tdrnwdat, d.tdopndat),
nvl(d.tdclsdat, to_date('1500/01/01', 'yyyy/mm/dd')) tdclsdat,
d.acurrcode acurrcode,
d.ABU_ABRNCHCOD ABU_ABRNCHCOD,
tmp.aistate
from tdpst d
left outer join fhist tmp
on d.ABRNCHCOD = tmp.ABRNCHCOD
and d.TBDPTYPE = tmp.TBDPTYPE
and d.CFCIFNO = tmp.CFCIFNO
and d.TDSERIAL = tmp.TDSERIAL
where d.TDOPNDAT <= currdate
1 Plan hash value: 3720425100
2
3 --------------------------------------------------------------------------------------------------------------------------
4 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
5 --------------------------------------------------------------------------------------------------------------------------
6 | 0 | SELECT STATEMENT | | 12M| 1994M| 4248 (3)| 00:01:17 | | | |
7 | 1 | PX COORDINATOR | | | | | | | | |
8 | 2 | PX SEND QC (RANDOM) | :TQ10002 | 12M| 1994M| 4248 (3)| 00:01:17 | Q1,02 | P->S | QC (RAND) |
9 |* 3 | HASH JOIN RIGHT OUTER BUFFERED| | 12M| 1994M| 4248 (3)| 00:01:17 | Q1,02 | PCWP | |
10 | 4 | PX RECEIVE | | 3730K| 67M| 178 (3)| 00:00:04 | Q1,02 | PCWP | |
11 | 5 | PX SEND HASH | :TQ10000 | 3730K| 67M| 178 (3)| 00:00:04 | Q1,00 | P->P | HASH |
12 | 6 | PX BLOCK ITERATOR | | 3730K| 67M| 178 (3)| 00:00:04 | Q1,00 | PCWC | |
13 | 7 | TABLE ACCESS FULL | TMPFHIST | 3730K| 67M| 178 (3)| 00:00:04 | Q1,00 | PCWP | |
14 | 8 | PX RECEIVE | | 12M| 1774M| 4059 (2)| 00:01:14 | Q1,02 | PCWP | |
15 | 9 | PX SEND HASH | :TQ10001 | 12M| 1774M| 4059 (2)| 00:01:14 | Q1,01 | P->P | HASH |
16 | 10 | PX BLOCK ITERATOR | | 12M| 1774M| 4059 (2)| 00:01:14 | Q1,01 | PCWC | |
17 |* 11 | TABLE ACCESS FULL | TDPST | 12M| 1774M| 4059 (2)| 00:01:14 | Q1,01 | PCWP | |
18 --------------------------------------------------------------------------------------------------------------------------
19
20 Predicate Information (identified by operation id):
21 ---------------------------------------------------
22
23 3 - access("D"."TDSERIAL"="TMP"."TDSERIAL"(+) AND "D"."CFCIFNO"="TMP"."CFCIFNO"(+) AND
24 "D"."TBDPTYPE"="TMP"."TBDPTYPE"(+) AND "D"."ABRNCHCOD"="TMP"."ABRNCHCOD"(+))
25 11 - filter("D"."TDOPNDAT"<=TO_DATE('20130601','yyyy/mm/dd'))
26
27 Note
28 -----
29 - dynamic sampling used for this statement (level=4)
30 - Degree of Parallelism is 10 because of hint
I executed above query with /+leading (d,tmp)/ hint.the plan is shown below:
Plan hash value: 1033900074
| 0 | SELECT STATEMENT | | 12M| 1994M| 4255 (3)| 00:01:17 | | | | | 1 | PX COORDINATOR | | | | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10002 | 12M| 1994M| 4255 (3)| 00:01:17 | Q1,02 | P->S | QC (RAND) | |* 3 | HASH JOIN OUTER BUFFERED| | 12M| 1994M| 4255 (3)| 00:01:17 | Q1,02 | PCWP | | | 4 | PX RECEIVE | | 12M| 1774M| 4059 (2)| 00:01:14 | Q1,02 | PCWP | | | 5 | PX SEND HASH | :TQ10000 | 12M| 1774M| 4059 (2)| 00:01:14 | Q1,00 | P->P | HASH | | 6 | PX BLOCK ITERATOR | | 12M| 1774M| 4059 (2)| 00:01:14 | Q1,00 | PCWC | | |* 7 | TABLE ACCESS FULL | TDPST | 12M| 1774M| 4059 (2)| 00:01:14 | Q1,00 | PCWP | | | 8 | PX RECEIVE | | 3730K| 67M| 178 (3)| 00:00:04 | Q1,02 | PCWP | | | 9 | PX SEND HASH | :TQ10001 | 3730K| 67M| 178 (3)| 00:00:04 | Q1,01 | P->P | HASH | | 10 | PX BLOCK ITERATOR | | 3730K| 67M| 178 (3)| 00:00:04 | Q1,01 | PCWC | |
3 - access("D"."TDSERIAL"="TMP"."TDSERIAL"(+) AND "D"."CFCIFNO"="TMP"."CFCIFNO"(+) AND "D"."TBDPTYPE"="TMP"."TBDPTYPE"(+) AND "D"."ABRNCHCOD"="TMP"."ABRNCHCOD"(+)) 7 - filter("D"."TDOPNDAT"<=TO_DATE('20130601','yyyy/mm/dd'))
I created one index on d.TDOPNDAT
and one index on (tmp.ABRNCHCOD, tmp.TBDPTYPE, tmp.CFCIFNO,tmp.TDSERIAL)
.In addition to,Optimizer didn't use any index, I force optimizer to use created indexes,but query cost increases Exponentially!! after doing all of mentioned works, query time is still high!
dos any one have any suggestion to reduce this query time ? Thanks
Upvotes: 0
Views: 1251
Reputation: 3411
Creating the intermediate table shouldn't be necessary, as, given the correct information, the query optimiser should be able to make a correct decision on the fastest way to do this in one step without requiring you to work out the intermediate step. Likewise with indexes, generally the query optimiser will get the decision right on whether or not to use them - as long as it has the right inputs (in the form of stats / constraints etc).
Indexes are not always good - they're powerful where a small subset of the data in the table needs to be found, but in cases where you're reading the whole table they won't help you much.
As a starting point, ask Oracle what plan it is using the execute your long running query. This can be done with:
explain plan for
select .... from tdpst left outer join fhist
left outer join dp2act
left outer join dp2cust;
Then,
select * from table(dbms_xplan.display);
NOTE - I don't have a database on this laptop (it is the weekend), so there may be some typos in the commands above, see the docs if they don't work as expected.
Look at the results and consider: How is it joining the tables - is it using a hash, a merge join or nested loops? Is it joining the table in the right order.
My guess here is that it will be making an incorrect decision on the join order or join type. Working on that assumption, the next thing I would check would be the cardinality estimates. Run the query again with a /*+ GATHER_PLAN_STATISTICS */ hint (you'll need to actually run it with this hint in - not just explain plan it again) and check the difference between the estimated and actual row numbers using:
SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR(FORMAT=>'ALLSTATS LAST'));
Check for big differences between the estimated rows and actual rows. A big difference mean a problem in the statistics that is leading the optimizer astray. From this point you can consider.
Then repeat and see if you get a performance improvement
Good luck..
Upvotes: 1
Reputation: 196
First off, its typically best practice to let the engine handle index selection. If an index isnt being used, it's usually because criteria for the join doesn't allow the index to be utilised correctly. Thats why your execution costs increase.
Temp tables are a good alternative if you have minimal IO and your temp database on a SSD drive. Environment is a factor here. If your data selection is large, consider pagination.
Otherwise I would recommend an indexed view. Your columns will be specified with your join and allowing index to be pre-computed for optimal results. I think Oracle calls them materialized views. It should also force computation of the join if you index fields from each of the tables.
If my answer is wrong, feel free to correct me. Most of my experience is from MSSQL. My Oracle experience is a little limited
Upvotes: 0