Shaun Kinnair
Shaun Kinnair

Reputation: 535

Oracle SQL query running slow, full table scan on primary key, why?

I have a problem with a piece of code, I can't understand why the below query is doing a full table scan on the works table when wrk.cre_surr_id is the primary key. The stats on both tables are both up to date below are the indexes on both tables.

 TABLE INDEXES
 WORKS

 INDEX NAME         UNIQUE  LOGGING     COLUMN NAME             ORDER
 WRK_I1             N       NO          LOGICALLY_DELETED_Y     Asc
 WRK_ICE_WRK_KEY    N       YES         ICE_WRK_KEY             Asc
 WRK_PK             Y       NO          CRE_SURR_ID             Asc
 WRK_TUNECODE_UK    Y       NO          TUNECODE                Asc

TLE_TITLE_TOKENS

INDEX NAME          UNIQUE  LOGGING     COLUMN NAME             ORDER
TTT_I1              N       YES         TOKEN_TYPE,             Asc
                                        SEARCH_TOKEN,
                                        DN_WRK_CRE_SURR_ID
TTT_TLE_FK_1        N       YES         TLE_SURR_ID

Problem query below. It has a cost of 245,876 which seems high, it's doing a FULL TABLE SCAN of the WORKS table which has 21,938,384 rows in the table. It is doing an INDEX RANGE SCAN of the TLE_TITLE_TOKENS table which has 19,923,002 rows in it. On the explain plan also is an INLIST ITERATOR which I haven't a clue what it means but it I think it's to do with having an "in ('E','N')" in my sql query.

 SELECT wrk.cre_surr_id 
 FROM   works wrk, 
        tle_title_tokens ttt      
 WHERE ttt.dn_wrk_cre_surr_id = wrk.cre_surr_id
 AND wrk.logically_deleted_y IS NULL   
 AND ttt.token_type in ('E','N')  
 AND  ttt.search_token LIKE 'BELIEVE'||'%'

When I break the query down and do a simple select from the TLE_TITLE_TOKENS table I get 280,000 records back.

 select ttt.dn_wrk_cre_surr_id 
 from tle_title_tokens ttt
 where ttt.token_type in ('E','N') 
 and ttt.search_token LIKE 'BELIEVE'||'%'

How do I stop it doing a FULL TABLE scan on the WORKS table. I could put a hint on the query but I would have thought Oracle would be clever enough to know to use the index without a hint.

Also on TLE_TITLE_TOKENS table would it be better to create a fuction based index on the column SEARCH_TOKEN as users seem to do LIKE % searches on this field. What would that fuction based index look like.

I'm running on an Oracle 11g database.

Thanks in Advance to any answers.

Upvotes: 2

Views: 3620

Answers (4)

Marmite Bomber
Marmite Bomber

Reputation: 21053

I'm assuming this index definition

 create index works_idx on works (cre_surr_id,logically_deleted_y);
 create index title_tokens_idx on  tle_title_tokens(search_token,token_type,dn_wrk_cre_surr_id); 

There are typically two possible scenarios to execute the join

NESTED LOOPS which access the inner table WORKS using index, but repeatedly in a loop for each row in the outer table

HASH JOIN which access the WORKS using FULL SCAN but only once.

It is not possible to say that one option is bad and the other good.

Nested loops is better if there are only few row in the outer table (few loops), but with increasing number of records in the outer table (TOKEN) gets slower and slower and at some number of row the HASH JOIN is bettwer.

How to see what execution plan is better? Simple force Oracle using hint to run both scanarios and compare the elapsed time.

In your case you should see those two execution plans

HASH JOIN

-----------------------------------------------------------------------------------------------
| Id  | Operation          | Name             | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                  |   207K|    10M|       |  2439   (1)| 00:00:30 |
|*  1 |  HASH JOIN         |                  |   207K|    10M|  7488K|  2439   (1)| 00:00:30 |
|*  2 |   INDEX RANGE SCAN | TITLE_TOKENS_IDX |   207K|  5058K|       |    29   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| WORKS            |   893K|    22M|       |   431   (2)| 00:00:06 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("TTT"."DN_WRK_CRE_SURR_ID"="WRK"."CRE_SURR_ID")
   2 - access("TTT"."SEARCH_TOKEN" LIKE 'BELIEVE%')
       filter("TTT"."SEARCH_TOKEN" LIKE 'BELIEVE%' AND ("TTT"."TOKEN_TYPE"='E' OR 
              "TTT"."TOKEN_TYPE"='N'))
   3 - filter("WRK"."LOGICALLY_DELETED_Y" IS NULL)

NESTED LOOPS

--------------------------------------------------------------------------------------
| Id  | Operation         | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                  |   207K|    10M|   414K  (1)| 01:22:56 |
|   1 |  NESTED LOOPS     |                  |   207K|    10M|   414K  (1)| 01:22:56 |
|*  2 |   INDEX RANGE SCAN| TITLE_TOKENS_IDX |   207K|  5058K|    29   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN| WORKS_IDX        |     1 |    26 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("TTT"."SEARCH_TOKEN" LIKE 'BELIEVE%')
       filter("TTT"."SEARCH_TOKEN" LIKE 'BELIEVE%' AND 
              ("TTT"."TOKEN_TYPE"='E' OR "TTT"."TOKEN_TYPE"='N'))
   3 - access("TTT"."DN_WRK_CRE_SURR_ID"="WRK"."CRE_SURR_ID" AND 
              "WRK"."LOGICALLY_DELETED_Y" IS NULL)

My gues is the (with 280K loops) the hash join (i.e. FULLTABLE SCAN) will be bettwer, but it could be that you recognise that nested loops should be used. In this case the optimize doesn't correct recognise the switching point between nested loops and hash join. Common cause of this is wrong or missing system statistics or improper optimizer parameters.

Upvotes: 0

ArtBajji
ArtBajji

Reputation: 960

Out of the 19,923,002 rows in LE_TITLE_TOKENS,

How many records have TOKEN_TYPE 'E', how many have 'N'? Are there any other TokenTypes? If yes, then how many are they put together?

If E and N put together forms a small part of the total records, then check if histogram statistics are updated for that column.

The execution plan depends on how many records are being selected from LE_TITLE_TOKENS out of the 20M records for the given filters.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269533

First, rewrite the query using a join:

SELECT wrk.cre_surr_id 
FROM tle_title_tokens ttt JOIN
     works wrk 
     ON ttt.dn_wrk_cre_surr_id = wrk.cre_surr_id 
WHERE wrk.logically_deleted_y IS NULL  AND
      ttt.token_type in ('E', 'N')  AND
      ttt.search_token LIKE 'BELIEVE'||'%';

You should be able to speed this query by using indexes. It is not clear what the best index is. I would suggest either tle_title_tokens(search_token, toekn_type, dn_wrk_cre_surr_id) and works(cre_surr_id, logically_deleted_y).

Another possibility is to write the query using EXISTS, such as:

SELECT wrk.cre_surr_id 
FROM works wrk 
WHERE wrk.logically_deleted_y IS NULL AND
      EXISTS (SELECT 1
              FROM tle_title_tokens ttt
              WHERE ttt.dn_wrk_cre_surr_id = wrk.cre_surr_id AND
                    ttt.token_type IN ('N', 'E') AND
                    ttt.search_token LIKE 'BELIEVE'||'%'
             ) ;

For this version, you want indexes on works(logically_deleted_y, cre_surr_id) and tle_title_tokens(dn_wrk_cre_surr_id, token_type, search_token).

Upvotes: 1

saphsys
saphsys

Reputation: 96

try this:

SELECT /*+ leading(ttt) */ wrk.cre_surr_id 
 FROM   works wrk, 
        tle_title_tokens ttt      
 WHERE ttt.dn_wrk_cre_surr_id = wrk.cre_surr_id
 AND wrk.logically_deleted_y IS NULL   
 AND ttt.token_type in ('E','N')  
 AND  ttt.search_token LIKE 'BELIEVE'||'%'

Upvotes: 0

Related Questions