Haifisch
Haifisch

Reputation: 909

Slow inner join in Oracle

I have Oracle database with a main table contain 9 000 000 rows and a second with 19 000 000 rows.

When I do :

SELECT *
FROM main m
INNER JOIN second s ON m.id = s.fk_id AND s.cd = 'E' AND s.line = 1

It's take 45 seconds to get the first part of the result, even with all the index below :

CREATE INDEX IDX_1 ON SECOND (LINE, CD, FK_ID, ID);
CREATE INDEX IDX_1 ON SECOND (LINE, CD);
MAIN (ID) AS PRIMARY KEY

Any idea how to do it faster ? I try some index, rebuild but it's always take 45 seconds

Here is the execution plan :

------------------------------------------------------------------------------------------------
| Id  | Operation            | Name                 | Rows    | Bytes      | Cost   | Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                      | 8850631 | 2133002071 | 696494 | 00:00:28 |
| * 1 |   HASH JOIN          |                      | 8850631 | 2133002071 | 696494 | 00:00:28 |
| * 2 |    TABLE ACCESS FULL | SECOND               | 8850631 |  646096063 | 143512 | 00:00:06 |
|   3 |    TABLE ACCESS FULL | MAIN                 | 9227624 | 1550240832 | 153363 | 00:00:06 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 1 - access("M"."ID"="S"."FK_ID")
* 2 - filter("S"."CD"='D' AND "S"."LINE"=1)

Thanks

Upvotes: 0

Views: 1265

Answers (1)

Marmite Bomber
Marmite Bomber

Reputation: 21095

If you want to see the first line quickly you have to enable Oracle to use the NESTED LOOP join.

This will required an index on second with the two columns you constraint in your query and an index on main on the join column id

create index second_idx on second(line,cd);
create index main_idx on main(id);

You'll see an execution plan similar to one below

--------------------------------------------------------------------------------------------
| Id  | Operation                     | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |            |    87 |  8178 |   178   (0)| 00:00:03 |
|   1 |  NESTED LOOPS                 |            |       |       |            |          |
|   2 |   NESTED LOOPS                |            |    87 |  8178 |   178   (0)| 00:00:03 |
|   3 |    TABLE ACCESS BY INDEX ROWID| SECOND     |    87 |  2523 |     4   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | SECOND_IDX |     1 |       |     3   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN           | MAIN_IDX   |     1 |       |     1   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | MAIN       |     1 |    65 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("S"."LINE"=1 AND "S"."CD"='E')
   5 - access("M"."ID"="S"."FK_ID")

You will access via index all rows in second with requested lineand cd (plan line 4 and 3) and for each such row you'll access via index the main table (lines 5 and 6)

This will provide an instant access to the first few rows and will work fine if there are a low number of rows in second table with the selected line and cd. In other case (when there is a large number of rows with s.cd = 'E' AND s.line = 1 - say 10k+) you will still see the first result rows quickly, but you'll wait ages to see the last row (it will take much more that the 45 seconds to finish the query).

If this is a problem you have to use a HASH JOIN (which you probaly do now).

A hash join typically doesn not use indexes and produced following execution plan

-----------------------------------------------------------------------------
| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |        | 10182 |  1153K|   908   (1)| 00:00:11 |
|*  1 |  HASH JOIN         |        | 10182 |  1153K|   908   (1)| 00:00:11 |
|*  2 |   TABLE ACCESS FULL| SECOND | 10182 |    99K|   520   (2)| 00:00:07 |
|   3 |   TABLE ACCESS FULL| MAIN   | 90000 |  9316K|   387   (1)| 00:00:05 |
-----------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - access("M"."ID"="S"."FK_ID")
   2 - filter("S"."LINE"=1 AND "S"."CD"='E')

Summary

To use the nested loops the indexes must be available as described above

The switch between nested loopsand hash join is done by the Oracle database (CBO) - provided that your tables statistics and database configuration are fine.

Upvotes: 2

Related Questions