Reputation: 535
I have a problem with a piece of code, I can't understand why the below query is doing a full table scan on the works table when wrk.cre_surr_id is the primary key. The stats on both tables are both up to date below are the indexes on both tables.
TABLE INDEXES
WORKS
INDEX NAME UNIQUE LOGGING COLUMN NAME ORDER
WRK_I1 N NO LOGICALLY_DELETED_Y Asc
WRK_ICE_WRK_KEY N YES ICE_WRK_KEY Asc
WRK_PK Y NO CRE_SURR_ID Asc
WRK_TUNECODE_UK Y NO TUNECODE Asc
TLE_TITLE_TOKENS
INDEX NAME UNIQUE LOGGING COLUMN NAME ORDER
TTT_I1 N YES TOKEN_TYPE, Asc
SEARCH_TOKEN,
DN_WRK_CRE_SURR_ID
TTT_TLE_FK_1 N YES TLE_SURR_ID
Problem query below. It has a cost of 245,876 which seems high, it's doing a FULL TABLE SCAN of the WORKS table which has 21,938,384 rows in the table. It is doing an INDEX RANGE SCAN of the TLE_TITLE_TOKENS table which has 19,923,002 rows in it. On the explain plan also is an INLIST ITERATOR which I haven't a clue what it means but it I think it's to do with having an "in ('E','N')" in my sql query.
SELECT wrk.cre_surr_id
FROM works wrk,
tle_title_tokens ttt
WHERE ttt.dn_wrk_cre_surr_id = wrk.cre_surr_id
AND wrk.logically_deleted_y IS NULL
AND ttt.token_type in ('E','N')
AND ttt.search_token LIKE 'BELIEVE'||'%'
When I break the query down and do a simple select from the TLE_TITLE_TOKENS table I get 280,000 records back.
select ttt.dn_wrk_cre_surr_id
from tle_title_tokens ttt
where ttt.token_type in ('E','N')
and ttt.search_token LIKE 'BELIEVE'||'%'
How do I stop it doing a FULL TABLE scan on the WORKS table. I could put a hint on the query but I would have thought Oracle would be clever enough to know to use the index without a hint.
Also on TLE_TITLE_TOKENS table would it be better to create a fuction based index on the column SEARCH_TOKEN as users seem to do LIKE % searches on this field. What would that fuction based index look like.
I'm running on an Oracle 11g database.
Thanks in Advance to any answers.
Upvotes: 2
Views: 3620
Reputation: 21053
I'm assuming this index definition
create index works_idx on works (cre_surr_id,logically_deleted_y);
create index title_tokens_idx on tle_title_tokens(search_token,token_type,dn_wrk_cre_surr_id);
There are typically two possible scenarios to execute the join
NESTED LOOPS which access the inner table WORKS
using index, but repeatedly in a loop for each row in the outer table
HASH JOIN which access the WORKS
using FULL SCAN but only once.
It is not possible to say that one option is bad and the other good.
Nested loops is better if there are only few row in the outer table (few loops), but with increasing number of records in the outer table (TOKEN
) gets slower and
slower and at some number of row the HASH JOIN is bettwer.
How to see what execution plan is better? Simple force Oracle using hint to run both scanarios and compare the elapsed time.
In your case you should see those two execution plans
HASH JOIN
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 207K| 10M| | 2439 (1)| 00:00:30 |
|* 1 | HASH JOIN | | 207K| 10M| 7488K| 2439 (1)| 00:00:30 |
|* 2 | INDEX RANGE SCAN | TITLE_TOKENS_IDX | 207K| 5058K| | 29 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| WORKS | 893K| 22M| | 431 (2)| 00:00:06 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("TTT"."DN_WRK_CRE_SURR_ID"="WRK"."CRE_SURR_ID")
2 - access("TTT"."SEARCH_TOKEN" LIKE 'BELIEVE%')
filter("TTT"."SEARCH_TOKEN" LIKE 'BELIEVE%' AND ("TTT"."TOKEN_TYPE"='E' OR
"TTT"."TOKEN_TYPE"='N'))
3 - filter("WRK"."LOGICALLY_DELETED_Y" IS NULL)
NESTED LOOPS
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 207K| 10M| 414K (1)| 01:22:56 |
| 1 | NESTED LOOPS | | 207K| 10M| 414K (1)| 01:22:56 |
|* 2 | INDEX RANGE SCAN| TITLE_TOKENS_IDX | 207K| 5058K| 29 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN| WORKS_IDX | 1 | 26 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("TTT"."SEARCH_TOKEN" LIKE 'BELIEVE%')
filter("TTT"."SEARCH_TOKEN" LIKE 'BELIEVE%' AND
("TTT"."TOKEN_TYPE"='E' OR "TTT"."TOKEN_TYPE"='N'))
3 - access("TTT"."DN_WRK_CRE_SURR_ID"="WRK"."CRE_SURR_ID" AND
"WRK"."LOGICALLY_DELETED_Y" IS NULL)
My gues is the (with 280K loops) the hash join (i.e. FULLTABLE SCAN) will be bettwer, but it could be that you recognise that nested loops should be used. In this case the optimize doesn't correct recognise the switching point between nested loops and hash join. Common cause of this is wrong or missing system statistics or improper optimizer parameters.
Upvotes: 0
Reputation: 960
Out of the 19,923,002 rows in LE_TITLE_TOKENS,
How many records have TOKEN_TYPE 'E', how many have 'N'? Are there any other TokenTypes? If yes, then how many are they put together?
If E and N put together forms a small part of the total records, then check if histogram statistics are updated for that column.
The execution plan depends on how many records are being selected from LE_TITLE_TOKENS out of the 20M records for the given filters.
Upvotes: 0
Reputation: 1269533
First, rewrite the query using a join
:
SELECT wrk.cre_surr_id
FROM tle_title_tokens ttt JOIN
works wrk
ON ttt.dn_wrk_cre_surr_id = wrk.cre_surr_id
WHERE wrk.logically_deleted_y IS NULL AND
ttt.token_type in ('E', 'N') AND
ttt.search_token LIKE 'BELIEVE'||'%';
You should be able to speed this query by using indexes. It is not clear what the best index is. I would suggest either tle_title_tokens(search_token, toekn_type, dn_wrk_cre_surr_id)
and works(cre_surr_id, logically_deleted_y)
.
Another possibility is to write the query using EXISTS
, such as:
SELECT wrk.cre_surr_id
FROM works wrk
WHERE wrk.logically_deleted_y IS NULL AND
EXISTS (SELECT 1
FROM tle_title_tokens ttt
WHERE ttt.dn_wrk_cre_surr_id = wrk.cre_surr_id AND
ttt.token_type IN ('N', 'E') AND
ttt.search_token LIKE 'BELIEVE'||'%'
) ;
For this version, you want indexes on works(logically_deleted_y, cre_surr_id)
and tle_title_tokens(dn_wrk_cre_surr_id, token_type, search_token)
.
Upvotes: 1
Reputation: 96
try this:
SELECT /*+ leading(ttt) */ wrk.cre_surr_id
FROM works wrk,
tle_title_tokens ttt
WHERE ttt.dn_wrk_cre_surr_id = wrk.cre_surr_id
AND wrk.logically_deleted_y IS NULL
AND ttt.token_type in ('E','N')
AND ttt.search_token LIKE 'BELIEVE'||'%'
Upvotes: 0