David
David

Reputation: 16120

Query with subquery in WHERE clause keeps timing out

I have the following query (slightly amended for clarity):

CREATE PROCEDURE Kctc.CaseTasks_GetCaseTasks
@CaseNumber int
... other parameters
,@ChangedBefore datetime
,@ChangedAfter datetime
AS
SELECT Kctc.CaseTasks.CaseTaskId
  ...blah blah blah
  FROM Kctc.CaseTasks
  ... some joins here
  WHERE  
  ... some normal where clauses
  AND 
  (
    (@ChangedAfter IS NULL AND @ChangedBefore IS NULL)
    OR
    EXISTS (SELECT *
               FROM Kctc.FieldChanges
               WHERE Kctc.FieldChanges.RecordId = Kctc.CaseTasks.CaseTaskId AND 
                     Kctc.FieldChanges.TableName = 'CaseTasks' AND 
                     Kctc.FieldChanges.DateOfChange BETWEEN
                         ISNULL(@ChangedAfter, '2000/01/01') AND
                         ISNULL(@ChangedBefore, '2050/01/01'))
  )

This query times out whenever the user specifies values for @ChangedBefore or @ChangedAfter, therefore invoking the subquery.

The subquery checks for the existence of a record in the table called FieldChanges (which effectively records changes to every field in the CaseTasks table).

Querying FieldChanges isn't very efficient because it involves filtering on the text field TableName which isn't indexed. And I know that subqueries are inherently inefficient.

So my question in general is, is there a way to redesign the query so that it performs better?

I can't think of a way to express the subquery as a join while still returning just one CaseTask row when there are multiple associated FieldChanges (i.e. preserving the EXISTS semantic). I haven't yet indexed the TableName field of the FieldChanges table because I'm hesitant about indexing text fields.

So what should I do?

Upvotes: 1

Views: 2594

Answers (4)

Mark Schultheiss
Mark Schultheiss

Reputation: 34168

My first instinct is to limit the result set

SELECT *
FROM Kctc.FieldChanges
WHERE Kctc.FieldChanges.RecordId = Kctc.CaseTasks.CaseTaskId AND
    Kctc.FieldChanges.TableName = 'CaseTasks' AND 
    Kctc.FieldChanges.DateOfChange BETWEEN
        ISNULL(@ChangedAfter, '2000/01/01') AND
        ISNULL(@ChangedBefore, '2050/01/01'

changed to

SELECT TOP 1 Kctc.FieldChanges.RecordId 
FROM Kctc.FieldChanges
WHERE Kctc.FieldChanges.RecordId = Kctc.CaseTasks.CaseTaskId AND
     Kctc.FieldChanges.TableName = 'CaseTasks' AND 
     Kctc.FieldChanges.DateOfChange BETWEEN
         ISNULL(@ChangedAfter, '2000/01/01') AND
         ISNULL(@ChangedBefore, '2050/01/01'

then look at the index on the fields in the where clause

EDIT: Regarding the TOP 1 - likely not giving that much of a benefit, but should not hurt, and might help avoid a table scan. Using a single field instead of * should only return that column (I assume it is NOT a NULL value column here)

Additional thoughts: Declare and Set a local value instead of the ISNULL thing which gets processed multiple times:

DECLARE @checkmyafter datetime; -- assumption on my part here on the type
SET @checkmyafter = ISNULL(@ChangedAfter, '2000/01/01');

do same with before, then use

...
 SELECT TOP 1 Kctc.FieldChanges.RecordId 
    FROM Kctc.FieldChanges
    WHERE Kctc.FieldChanges.RecordId = Kctc.CaseTasks.CaseTaskId AND
         Kctc.FieldChanges.TableName = 'CaseTasks' AND 
         Kctc.FieldChanges.DateOfChange BETWEEN
             @checkmybefore  AND  @checkmyafter 
...

ONE MORE THING: Check the sequence of the WHERE xxx AND - use the MOST LIKELY candidate to isolate FIRST in the sequence, whichever condition that is, so it can get out faster. If that is the RecordId, then find, if the TableName is better, use that first instead. If one column also has an index already consider that one all else being equal.

Upvotes: 1

David
David

Reputation: 16120

Adding SET ARITHABORT ON to the stored procedure got it to execute in less than 1 second.

I have no idea what it means. Presumably 'stop arsing about'.

Upvotes: 0

Tao
Tao

Reputation: 14006

This is not going to be a "good" solution, but it might be better than what's happening right now:

SELECT Kctc.CaseTasks.CaseTaskId
  ...blah blah blah
  FROM Kctc.CaseTasks
  ... some joins here
  LEFT JOIN (
     SELECT RecordID
     FROM Kctc.FieldChanges
     WHERE Kctc.FieldChanges.TableName = 'CaseTasks'
     AND Kctc.FieldChanges.DateOfChange BETWEEN
                         ISNULL(@ChangedAfter, '2000/01/01') AND
                         ISNULL(@ChangedBefore, '2050/01/01')
     GROUP BY RecordID
  ) AS MatchingChanges ON Kctc.CaseTasks.RecordId = MatchingChanges.RecordId
  WHERE  
  ... some normal where clauses
     AND (MatchingChanges.RecordID Is Not Null OR ((@ChangedAfter IS NULL AND @ChangedBefore IS NULL))

Depends on exactly what the query plan is - if it was performing the subquery repeatedly, this formulation might help.

Upvotes: 0

As a first cut you might try putting an index on the table Kctc.FieldChanges on the RecordId, TableName, and DateOfChange fields (one single index with all three fields) and see if that helps.

Share and enjoy.

Upvotes: 1

Related Questions