Adam Levitt
Adam Levitt

Reputation: 10476

SQL AND Constraint Causing Performance Issue

I have a seemingly simple query case where adding a constraint causes a major performance decrease. The three columns that are being constrained with the AND clause, are all bigints. If I use any two of them (but not all three) the query runs instantaneously, but as soon as I add a third AND, it runs slow.

WITH tb AS (SELECT
   DISTINCT u.*  
FROM
   [user] u  
INNER JOIN
   user_personal up 
      ON up.user_id = u.user_id        
WHERE
   1=1  
   AND u.site_instance_id = 1      
   AND u.graduation_class_id = 27  
   AND u.graduation_term_id IN (76,75) 
   ) SELECT
   COUNT (*) AS count 
FROM
   ( SELECT
      ROW_NUMBER() OVER (
   ORDER BY
      last_name ASC) AS row,
      * 
   FROM
      tb) sub

Does this have to do with the fact that all three of those columns are bigint? Or does it have to do with table indexes on those columns? (I don't have any indexes set up for those columns). Or could it be something else?

Note - in this case the AND u.site_instance_id = 1 is redundant, but it shouldn't matter, should it?

EDIT After using SET showplan_all ON:

 |--Compute Scalar(DEFINE:([Expr1005]=CONVERT_IMPLICIT(int,[Expr1008],0)))
   |--Stream Aggregate(DEFINE:([Expr1008]=Count(*)))
        |--Nested Loops(Left Semi Join, WHERE:(.[dbo].[user].[user_id] as [u].[user_id]=.[dbo].[user_personal].[user_id] as [up].[user_id]))
             |--Clustered Index Scan(OBJECT:(.[dbo].[user].[PK__user__B9BE370F7F60ED59] AS [u]), WHERE:(.[dbo].[user].[site_instance_id] as [u].[site_instance_id]=(1) AND .[dbo].[user].[graduation_class_id] as [u].[graduation_class_id]=(27) AND (.[dbo].[user].[graduation_term_id] as [u].[graduation_term_id]=(75) OR .[dbo].[user].[graduation_term_id] as [u].[graduation_term_id]=(76))))
             |--Clustered Index Scan(OBJECT:(.[dbo].[user_personal].[PK__user_per__C701FAD641EDCAC5] AS [up]))

... and with only TWO AND clauses...

|--Compute Scalar(DEFINE:([Expr1005]=CONVERT_IMPLICIT(int,[Expr1008],0)))
       |--Stream Aggregate(DEFINE:([Expr1008]=Count(*)))
            |--Hash Match(Left Semi Join, HASH:([u].[user_id])=([up].[user_id]), RESIDUAL:(.[dbo].[user].[user_id] as [u].[user_id]=.[dbo].[user_personal].[user_id] as [up].[user_id]))
                 |--Clustered Index Scan(OBJECT:(.[dbo].[user].[PK__user__B9BE370F7F60ED59] AS [u]), WHERE:(.[dbo].[user].[graduation_class_id] as [u].[graduation_class_id]=(27) AND (.[dbo].[user].[graduation_term_id] as [u].[graduation_term_id]=(75) OR .[dbo].[user].[graduation_term_id] as [u].[graduation_term_id]=(76))))
                 |--Clustered Index Scan(OBJECT:(.[dbo].[user_personal].[PK__user_per__C701FAD641EDCAC5] AS [up]))

Upvotes: 1

Views: 130

Answers (2)

sgeddes
sgeddes

Reputation: 62831

I'm still not sure I understand your question -- @Hogan had a much simpler version of your query. Nonetheless, you should have Indexes on each of the fields you join on and any field you consistently search upon.

In your case, I'd make sure you have the following indexes:

  • user.user_id (presumably already has PK)
  • user_personal.user_id
  • user.site_instance_id, user.graduation_class_id and user.graduation_term_id

You may also consider adding indexes to each of the 3 fields in your WHERE criteria individually, but this should prove to yield a better performance. Plus, use @Hogan's query instead.

Good luck.

Upvotes: 2

Hogan
Hogan

Reputation: 70513

There must be more to the story. Clearly the following is the same as the code posted and simpler (and I expect faster):

  SELECT COUNT(DISTINCT u.user_id)) as count
  FROM [user] u  
  INNER JOIN  user_personal up ON up.user_id = u.user_id        
  WHERE
   u.site_instance_id = 1   AND   
   u.graduation_class_id = 27  AND
   u.graduation_term_id IN (76,75) 

Upvotes: 1

Related Questions