Reputation: 43
I have two tables A & B. A has 6,760,636 records and B has 452,175,960 records. Here is the SELECT Statement I'm using:
SELECT /*+ parallel (T,1) */
T.*
FROM TABLE_A T,
TABLE_B P
WHERE T.DESTINATION = P.DESTINATION
AND T.SAIL_DATE = P.SAIL_DATE
AND T.PACKAGE_TYPE = P.PACKAGE_TYPE
AND T.CABIN_CATEGORY = P.CABIN_CATEGORY
AND T.BOOKING_SOURCE = P.BOOKING_SOURCE
AND T.FARE_TYPE = P.FARE_TYPE
AND T.POST_DATE = P.POST_DATE;
I tried creating the index on TABLE_A, but still it is not considering the INDEX and doing a FULL TABLE SCAN.
The EXPLAIN PLAN for above is
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| TQ |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6760K| 1805M| 747K (1)| | | |
| 1 | SORT AGGREGATE | | 1 | 48 | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| CL_PRICING_CONTROLS | 1 | 48 | 2 (0)| | | |
| 3 | INDEX RANGE SCAN | CL_PRICING_CONTROLS_IX3 | 1 | | 1 (0)| | | |
| 4 | PX COORDINATOR | | | | | | | |
| 5 | PX SEND QC (RANDOM) | :TQ10002 | 6760K| 1805M| 747K (1)| Q1,02 | P->S | QC (RAND) |
| 6 | HASH JOIN | | 6760K| 1805M| 747K (1)| Q1,02 | PCWP | |
| 7 | PX RECEIVE | | 6760K| 1437M| 1443 (1)| Q1,02 | PCWP | |
| 8 | PX SEND HASH | :TQ10001 | 6760K| 1437M| 1443 (1)| Q1,01 | P->P | HASH |
| 9 | PX BLOCK ITERATOR | | 6760K| 1437M| 1443 (1)| Q1,01 | PCWC | |
| 10 | TABLE ACCESS FULL | TMP_RES_PRICE_CONTROL_111 | 6760K| 1437M| 1443 (1)| Q1,01 | PCWP | |
| 11 | BUFFER SORT | | | | | Q1,02 | PCWC | |
| 12 | PX RECEIVE | | 450M| 23G| 746K (1)| Q1,02 | PCWP | |
| 13 | PX SEND HASH | :TQ10000 | 450M| 23G| 746K (1)| | S->P | HASH |
| 14 | INDEX FULL SCAN | CL_PRICING_CONTROLS_IX1 | 450M| 23G| 746K (1)| | | |
I tried parallel query on both tables as below
SELECT /*+ PARALLEL(T, 32) PARALLEL(P, 32)*/
T.*
FROM TABLE_A T,
TABLE_B P
WHERE T.DESTINATION = P.DESTINATION
AND T.SAIL_DATE = P.SAIL_DATE
AND T.PACKAGE_TYPE = P.PACKAGE_TYPE
AND T.CABIN_CATEGORY = P.CABIN_CATEGORY
AND T.BOOKING_SOURCE = P.BOOKING_SOURCE
AND T.FARE_TYPE = P.FARE_TYPE
AND T.POST_DATE = P.POST_DATE;
But this results in the EXPLAIN PLAN as below, where I see CPU with lower Cost, but still it is taking two hours.
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| TQ |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6760K| 1805M| 59345 (1)| | | |
| 1 | SORT AGGREGATE | | 1 | 48 | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| CL_PRICING_CONTROLS | 1 | 48 | 2 (0)| | | |
| 3 | INDEX RANGE SCAN | CL_PRICING_CONTROLS_IX3 | 1 | | 1 (0)| | | |
| 4 | PX COORDINATOR | | | | | | | |
| 5 | PX SEND QC (RANDOM) | :TQ10002 | 6760K| 1805M| 59345 (1)| Q1,02 | P->S | QC (RAND) |
| 6 | HASH JOIN BUFFERED | | 6760K| 1805M| 59345 (1)| Q1,02 | PCWP | |
| 7 | PX RECEIVE | | 6760K| 1437M| 1443 (1)| Q1,02 | PCWP | |
| 8 | PX SEND HASH | :TQ10000 | 6760K| 1437M| 1443 (1)| Q1,00 | P->P | HASH |
| 9 | PX BLOCK ITERATOR | | 6760K| 1437M| 1443 (1)| Q1,00 | PCWC | |
| 10 | TABLE ACCESS FULL | TMP_RES_PRICE_CONTROL_111 | 6760K| 1437M| 1443 (1)| Q1,00 | PCWP | |
| 11 | PX RECEIVE | | 450M| 23G| 57858 (1)| Q1,02 | PCWP | |
| 12 | PX SEND HASH | :TQ10001 | 450M| 23G| 57858 (1)| Q1,01 | P->P | HASH |
| 13 | PX BLOCK ITERATOR | | 450M| 23G| 57858 (1)| Q1,01 | PCWC | |
| 14 | TABLE ACCESS FULL | CL_PRICING_CONTROLS | 450M| 23G| 57858 (1)| Q1,01 | PCWP | |
Upvotes: 1
Views: 89
Reputation: 146209
Please look at your query, specifically at its WHERE clause. It consists entirely of join conditions. It could be rewritten using the ANSI-92 syntax to this:
SELECT /*+ parallel (T,1) */
T.*
FROM TABLE_A T
inner join TABLE_B P
on T.DESTINATION = P.DESTINATION
AND T.SAIL_DATE = P.SAIL_DATE
AND T.PACKAGE_TYPE = P.PACKAGE_TYPE
AND T.CABIN_CATEGORY = P.CABIN_CATEGORY
AND T.BOOKING_SOURCE = P.BOOKING_SOURCE
AND T.FARE_TYPE = P.FARE_TYPE
AND T.POST_DATE = P.POST_DATE;
So every row in each table has to be considered. Clearly full table scans are the only plausible access path.
Building a composite index on all columns in the JOIN clause is unlikely to change this. You are selecting all the columns from TABLE_A, so the database needs to visit the table anyway.
Unless the number of records in the intersection between the two tables is incredibly small it will be more efficient to read the tables in multi-block reads rather than index scans with table row look-ups. As it is, you're selecting almost one-in-six of all the rows in a half-billion row table. How would an index make that faster?
Incidentally, how did you decide on the degree of parallelism? How many CPUs does your server have? What other processes are running concurrently? What is the value of MAX_PARALLEL_SERVERS? Find out more.
Upvotes: 1