Reputation: 73
Considering two tables, joined in such a way :
select *
from table_1
left join table_2
on table_1.lnk_id = table_2.lnk_id
;
There is an index on "lnk_id" on table_1, and on table_2. This returns 80 Million rows.
I am using conditional where clause, with variables set by my front-end (APEX):
:all : If=1, should return all rows.
:desired_id : The object I want to return. Can be a null value, in which case I want to return null value rows only.
I coded this first :
select *
from table_1
left join table_2
on table_1.some_id = table_2.some_id
where (
case when :all = 1
then 1
when :desired_id is null and table_2.desired_id is null
then 1
when :desired_id = table_2.desired_id
then 1
else 0
end = 1
)
Given :all = 0 and :desired_id = some non-null value to select the rows the user desires, I experience terrible performance.
I learned I had to avoid "case" in "where" clause, so adapted to :
where (
:all = 1
or (:desired_id is null and table_2.desired_id is null)
or :desired_id = table_2.desired_id
)
No chance, this is as slow as the "case" solution.
I realized this :
where (:desired_id = table_2.desired_id);
--> 0.047s - Super fast
where (:desired_id = table_2.desired_id or 0 = 1);
--> 0.062s - Super fast
where (:desired_id = table_2.desired_id or :all = 1);
--> 235s - Super slow
So I can definitely find the desired object across my 80 M rows in no time, with the where( ... = ... or ... = 1) construction : the optimizer must take the wrong decision when I use :all.
Would anyone be able to guide ?
I would rather avoid the build-a-dynamic-query solution if possible as it makes it a lot more complex to implement and manage I believe and this really sounds like ... it should work with plain SQL.
-- Edit to add the plans --
select *
from table_2
left join table_1
on table_2.lnk_id = table_1.lnk_id
where (:desired_id = table_1.desired_id or 0 = 1);
Plan hash value: 1995399472
--------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 129 | 42183 | 45 (0)| 00:00:01 | | |
| 1 | NESTED LOOPS | | 129 | 42183 | 45 (0)| 00:00:01 | | |
| 2 | NESTED LOOPS | | 138 | 42183 | 45 (0)| 00:00:01 | | |
| 3 | TABLE ACCESS BY INDEX ROWID BATCHED| TABLE_1 | 3 | 435 | 7 (0)| 00:00:01 | | |
|* 4 | INDEX RANGE SCAN | TABLE_1_I1 | 3 | | 3 (0)| 00:00:01 | | |
|* 5 | INDEX RANGE SCAN | TABLE_2_I2 | 46 | | 3 (0)| 00:00:01 | | |
| 6 | TABLE ACCESS BY GLOBAL INDEX ROWID | TABLE_2 | 40 | 7280 | 21 (0)| 00:00:01 | ROWID | ROWID |
--------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("TABLE_1"."DESIRED_ID"=:DESIRED_ID)
5 - access("TABLE_2"."LNK_ID"="TABLE_1"."LNK_ID")
explain plan for
select *
from table_2
left join table_1
on table_2.lnk_id = table_1.lnk_id
where (:desired_id = table_1.desired_id or :p3070100_all = 1);
Plan hash value: 94704160
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 79M| 24G| | 1441K (1)| 00:00:57 | | |
|* 1 | FILTER | | | | | | | | |
|* 2 | HASH JOIN RIGHT OUTER| | 79M| 24G| 484M| 1441K (1)| 00:00:57 | | |
| 3 | TABLE ACCESS FULL | TABLE_1 | 3238K| 447M| | 19152 (1)| 00:00:01 | | |
| 4 | PARTITION RANGE ALL | | 79M| 13G| | 668K (1)| 00:00:27 | 1 |1048575|
| 5 | TABLE ACCESS FULL | TABLE_2 | 79M| 13G| | 668K (1)| 00:00:27 | 1 |1048575|
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("TABLE_1"."DESIRED_ID"=:DESIRED_ID OR TO_NUMBER(:P3070100_ALL)=1)
2 - access("TABLE_2"."LNK_ID"="TABLE_1"."LNK_ID"(+))
Thank you, David
Upvotes: 2
Views: 1822
Reputation: 356
When you add or
to your where
clause and the part of that condition is parameter or function, database cannot just simply use an index. It's because there is "another option" and cost based optimiser (CBO) often choose to "FULL TABLE SCAN". You can make your query more simple for CBO - for example:
at application level choose to create different where clause based on :all value,
use some tricks making condition look simple - like instead of using :all parameter use only :desired_id and for getting all results, just pass "null" as value, then you can do something like that:
where table_2.desired_id = nvl(:desired_id, table_2.desired_id)
If there is index on table_2.desired_id, CBO should choose "range scan" or "unique scan" (for unique index).
You should always generate explain plan for your queries and look for "full scans", "nested loops" and "cartesian joins" with large tables - those are things you should avoid.
Update (2019-02-01)
There is 3rd option, when you would like to have "all in one query", so without any additional logic at application level (choosing between 2 queries) or using dynamic SQL. There is an option to make 2 queries in one with union all
and write them in a way that always only one part will be run by database.
Below is improved version of query proposed by JPG:
select *
from table_2
left join table_1 on table_2.lnk_id = table_1.lnk_id
where nvl(:all,2) != 1
and :desired_id = table_2.desired_id
union all
select *
from table_2
left join table_1 on table_2.lnk_id = table_1.lnk_id
where :desired_id = table_2.desired_id
and :all = 1
;
Take into consideration that explain plain could "show" more work for this, than for the other plans, but database should in run-time resolved one of condition as "always false" and query only for one part.
-- Second Update --
Ok, I've now read one of your comments where you've written:
"All" returns every single row, "None" returns all rows with null values for that column, "specific value" returns rows matching that specific value...
Let's say all can take 3 values:
Then the query will be:
-- Specific row
select *
from table_2
left join table_1
on table_2.lnk_id = table_1.lnk_id
where :all = 0
and :desired_id = table_2.desired_id
union all
-- All rows with null
select *
from table_2
left join table_1
on table_2.lnk_id = table_1.lnk_id
where :all = 2
and table_2.desired_id is NULL
union all
-- All rows
select *
from table_2
left join table_1
on table_2.lnk_id = table_1.lnk_id
where :desired_id = table_2.desired_id
and :all = 1
;
But you should know that simple index doesn't work with NULLs. So if you have index on table_2.desired_id:
create index idx_table_2_desired_id on table_2(desired_id);
It will not work, but complex one might do, for ex.
create index idx_table_2_desired_id on table_2(desired_id, 1);
will allow database to search for NULLs in desired_id in such index.
Upvotes: 1
Reputation: 11
As robertus wrote, you should avoid using "OR" in a query which is working on indexed columns I suggest to replace the following query with "OR"
select *
from table_2
left join table_1
on table_2.lnk_id = table_1.lnk_id
where (:desired_id = table_2.desired_id or :all = 1);
by a much more powerful solution in terms of cost.
select *
from table_2
left join table_1
on table_2.lnk_id = table_1.lnk_id
where :desired_id = table_2.desired_id
union
select *
from table_2
left join table_1
on table_2.lnk_id = table_1.lnk_id
where :desired_id <> table_2.desired_id
and :all = 1
;
Upvotes: 1
Reputation: 36107
A glance at the wrong plan explains the cause of the problem, which is here:
1 - filter("TABLE_1"."DESIRED_ID"=:DESIRED_ID OR TO_NUMBER(:P3070100_ALL)=1)
The optimizer is unable to use TABLE_1_I1
index because of this OR condition.
Do not use this bind variable TO_NUMBER(:P3070100_ALL)=1
in the query at all, use dynamic SQL and two version of the query depending on the value of :P3070100
instead.
If :P3070100 <> 1
use this query, which will use the index and will be fast:
select *
from table_2
left join table_1
on table_2.lnk_id = table_1.lnk_id
where :desired_id = table_1.desired_id ;
and when :P3070100 = 1
use this query, which will be slow (because it joins all rows from both tables):
select *
from table_2
left join table_1
on table_2.lnk_id = table_1.lnk_id
Upvotes: 0