Reputation: 1784
Consider the following tables;
SQL> desc test;
Name Null? Type
----------------------------------------- -------- ----------------------------
NUM NOT NULL NUMBER
NUM2 NUMBER(10)
NUM3 NUMBER
NUM NUM2 NUM3
---------- ---------- ----------
1 1 1
2 2 2
SQL> desc test2;
Name Null? Type
----------------------------------------- -------- ----------------------------
NUM NOT NULL NUMBER
NUM2 NOT NULL NUMBER
NUM3 NUMBER
NUM NUM2 NUM3
---------- ---------- ----------
1 1 1
3 1 1
As per the book , if one were to replace a NOT IN condition to a LEFT Outer join ; the query performance would increase.
Explain plan 1
SQL> select *
from test
where num NOT IN (select num
from test2);
NUM NUM2 NUM3
--- --------- ----------
2 2 2
Execution Plan
----------------------------------------------------------
Plan hash value: 856752680
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 2 | 5 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| TEST | 2 | 4 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| TEST2 | 2 | 26 | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( NOT EXISTS (SELECT /*+ */ 0 FROM "TEST2" "TEST2" WHERE
LNNVL("NUM"<>:B1)))
3 - filter(LNNVL("NUM"<>:B1))
So I created indexes on num columns of both tables and re-wrote the query as :
Explain plan 2
SQL> select *
from test
left join test2 on (test.num = test2.num)
where test2.num is null;
NUM NUM2 NUM3 NUM NUM2 NUM3
---------- ---------- ---------- ---------- ---------- ----------
2 2 2
Execution Plan
----------------------------------------------------------
Plan hash value: 1525288557
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Tim
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 82 | 4 (0)| 00:
|* 1 | FILTER | | | | |
| 2 | NESTED LOOPS OUTER | | 2 | 82 | 4 (0)| 00:
| 3 | TABLE ACCESS FULL | TEST | 2 | 4 | 3 (0)| 00:
| 4 | TABLE ACCESS BY INDEX ROWID| TEST2 | 1 | 39 | 1 (0)| 00:
|* 5 | INDEX RANGE SCAN | ID2 | 1 | | 0 (0)| 00:
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TEST2"."NUM" IS NULL)
5 - access("TEST"."NUM"="TEST2"."NUM"(+))
I am clearly missing something as the suggested method is proving to be more expensive. Is this due to my data distribution?
Ver: Oracle 10g
Upvotes: 2
Views: 3362
Reputation: 146239
Your two queries return different result sets. The second version also returns data from the second table. I would expect a query which returns data from two tables to be more expensive than a query which returns data from just the one.
Try the second query with select test.*
instead of select *
and see whether it makes a difference.
Upvotes: 3