Bhaskar
Bhaskar

Reputation: 1690

Why using OR condition instead of Union caused a performance Issue

Hi I have below query in an SP

@CrmContactId is a parameter to the SP.

Select distinct A.PolicyBusinessId, A.PolicyDetailId 
from TPolicyBusiness A
inner join TPolicyOwner B on a.PolicyDetailId=b.PolicyDetailId
Left Join TAdditionalOwner C on c.PolicyBusinessId=A.PolicyBusinessId
where (b.CRMContactId = @CRMContactId)  

we made a new change and introduced an OR condition

Select distinct A.PolicyBusinessId, A.PolicyDetailId 
from TPolicyBusiness A
inner join TPolicyOwner B on a.PolicyDetailId=b.PolicyDetailId
Left Join TAdditionalOwner C on c.PolicyBusinessId=A.PolicyBusinessId
where (b.CRMContactId = @CRMContactId OR C.CRMContactId = @CRMContactId) 

Execution plan: enter image description here

Attachment : Execution Plan

but this change caused a huge performance issue in live server. TPolicyBusiness and TPolicyOwner are heavy tables having millions of records. TAdditionalOwner table is a light table with only few records.

To fix this issue Union all was instead of OR condition.

Select distinct A.PolicyBusinessId, A.PolicyDetailId 
        From
        (
            Select A.PolicyBusinessId, A.PolicyDetailId 
            from TPolicyBusiness A
            inner join TPolicyOwner B on a.PolicyDetailId=b.PolicyDetailId
            where b.CRMContactId = @CRMContactId
            union all
            Select A.PolicyBusinessId, A.PolicyDetailId 
            from TPolicyBusiness A
            Join TAdditionalOwner C on c.PolicyBusinessId=A.PolicyBusinessId
            where C.CRMContactId = @CRMContactId
        ) as A

Execution Plan: enter image description here

Attachement Execution Plan

Could someone please explain why the introduction of OR caused an issue and why using Union is better than OR in this case?

Upvotes: 3

Views: 4023

Answers (4)

Remus Rusanu
Remus Rusanu

Reputation: 294307

Using UNION ALL to replace OR is actually one of the well known optimization tricks. The best reference and explanation is in this article: Index Union.

The gist of it is that OR predicates that could be be satisfied by two index seeks cannot be reliably detected by the query optimizer (the reason being impossibility to predict the disjoint sets from the two sides of the OR). So when expressing the same condition as an UNION ALL then the optimizer has no problem creating a plan that does two short seeks and unions the results. The important thing is to realize that a=1 or b=2 can be different from a=1 union all b=2 because the first query returns rows that satisfy both conditions once, while the later returns them twice. When you write the query as UNION ALL you are telling the compiler that you understand that and you have no problem with it.

For further reference see How to analyse SQL Server performance.

Upvotes: 10

simo.3792
simo.3792

Reputation: 2236

The JOIN to TAdditionalOwner is using

TPolicyBusiness.PolicyBusinessId = TAdditionalOwner.PolicyBusinessId 

where the JOIN to TPolicyOwner is using the

TPolicyBusiness.PolicyDetailId = TPolicyOwner.PolicyDetailId

Check that there is a corresponding index for the PolicyBusinessId.

In the 2-way JOIN, that is part of the UNION, the smaller TAdditionalOwner table will be optimised if there is not an index for it to refer to in TPolicyBusiness, due to the small size. The server will still do a table scan, but use the values from the smaller table and see if they are in the big table somewhere. If there is no index, this optimisation will disappear quite quickly as the small table grows.

Given that you are not referring to either B or C in the SELECT you can simply to this

SELECT DISTINCT A.PolicyBusinessId, A.PolicyDetailId 
FROM TPolicyBusiness A
LEFT JOIN TPolicyOwner B ON a.PolicyDetailId = b.PolicyDetailId AND b.CRMContactId = @CRMContactId 
LEFT JOIN TAdditionalOwner C on c.PolicyBusinessId = A.PolicyBusinessId AND C.CRMContactId = @CRMContactId

This way it will JOIN to either table, same as per your UNION, but without the OUTER select.

Either way ensure that the fields used are indexed.

Upvotes: 0

Adi
Adi

Reputation: 232

Union operator, it will take lot of time if the data is huge.try to avoid union operator.

please try below try to filter the data while joining.it will be much better.

Select distinct A.PolicyBusinessId, A.PolicyDetailId 
from TPolicyBusiness A
inner join TPolicyOwner B on a.PolicyDetailId=b.PolicyDetailId
and b.CRMContactId = @CRMContactId 
Left Join TAdditionalOwner C on c.PolicyBusinessId=A.PolicyBusinessId
where  (1=1 OR C.CRMContactId = @CRMContactId) 

Upvotes: -1

Paul Maxwell
Paul Maxwell

Reputation: 35593

Query1, the left join and the where clause combine to mean that table C is effectively ignored (not referenced in select list either), so you basically have a 2 table inner join query.

Query 2, the left join in now almost an inner join because that table is referenced in the where clause without also permitting NULLs from that table - but due the OR, all conditions are being considered for the result. Hence a 3 table inner join query more or less.

Query 3, you have simplified the inner UNION ALL query to 2 simple inner joins

But I suspect there will be more to it because and A.IndigoClientId=@TenantId indicates you have not revealed the full queries anyway.

Upvotes: 0

Related Questions