Dan Short
Dan Short

Reputation: 9616

Table Scan and Eager Spools on IN Predicate

I'm running what I think is a fairly simple query, and I'm having getting absolutely atrocious return times on the query. When running the query, a sub query as a filter in an IN predicate is performing an Index scan and an Index Spool when used as part of the query, but runs an Index Seek when used outside of the IN predicate. I have no idea why, but the query is taking nearly 30 seconds to return zero records...

Here's the query:

SELECT DISTINCT
    C.County
    , S.State
    , C.County_ID
FROM
    Leads L 
    INNER JOIN Inventory I ON L.Deleted = 0 AND L.Inv_ID = I.Inv_ID
    INNER JOIN County C ON C.County_ID = I.County_ID
    INNER JOIN State S ON C.State_ID = S.State_ID AND S.Active = 1
    INNER JOIN Contacts ON L.Contact_ID = Contacts.ID AND Contacts.Deleted = 0
WHERE
    L.Acct_ID = 204940
    OR L.Acct_ID IN (
        SELECT Accounts.Acct_ID FROM Accounts (NOLOCK) WHERE Accounts.Parent_Acct_ID = 204940
    )
ORDER BY
    S.State
    , C.County

Here's the relevant part of the execution plan of that query. It's running a scan on the Accounts table in the WHERE clause, with an estimated number of rows to be returned of over 260,000. The subselect actually returns 0 records.

enter image description here

When I run that simple select outside the above query, I get a straight index seek, against the exact same Index. The subselect returns 0 records. If I run the query without the OR statement I get ms response times, if I run the query with the OR statement, the query takes nearly 30 seconds to complete.

Here's the index that's being used for both the Scan and the Seek on the Accounts table (I know, it's a dta index, not my original database...)

CREATE NONCLUSTERED INDEX [_dta_index_Accounts_5_144719568__K32_50] ON [dbo].[Accounts] 
(
    [parent_Acct_ID] ASC
)
INCLUDE ( [CompanyState]) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

How do I get the subselect to run an Index Seek instead of a Scan, and return in any sort of reasonable amount of time?

Upvotes: 1

Views: 458

Answers (3)

Jodrell
Jodrell

Reputation: 35726

I've simplified your query a little and changed the sub-query into a left join.

Give it a go, it should be easier for the engine to optimize or, if not, easier for you to optimize.

Before optimizing ensure your statistics are up to date.

SELECT
            [C].[County],
            [S].[State],
            [C].[County_ID]
    FROM
            [dbo].[Leads] [L] 
        JOIN
            [dbo].[Inventory] [I]
                ON [I].[Inv_ID] = [L].[Inv_ID]
        JOIN
            [dbo].[County] [C]
                ON [C].[County_ID] = [I].[County_ID]
        JOIN
            [dbo].[State] [S]
                ON [S].[State_ID] = [C].[State_ID]
        JOIN
            [dbo].[Contacts] [Co]
                ON [Co].[ID] = [L].[Contact_ID]
        LEFT JOIN
            [dbo].[Accounts] [A] (NOLOCK)
                ON [A].[Acct_ID] = [L].[Acct_ID]
    WHERE
            [L].[Deleted] = 0
        AND
            [S].[Active] = 1
        AND
            [Co].[Deleted] = 0
        AND
            (
                [L].[Acct_ID] = 204940
            OR
                [A].[Parent_Acct_ID] = 204940
            )
    GROUP BY
            [C].[County],
            [S].[State],
            [C].[County_ID]
    ORDER BY
            [S].[State],
            [C].[County]

given the extra information in the OP's own answer, the query could be further simplified,

SELECT
            [C].[County],
            [S].[State],
            [C].[County_ID]
    FROM
            [dbo].[Leads] [L] 
        JOIN
            [dbo].[Inventory] [I]
                ON [I].[Inv_ID] = [L].[Inv_ID]
        JOIN
            [dbo].[County] [C]
                ON [C].[County_ID] = [I].[County_ID]
        JOIN
            [dbo].[State] [S]
                ON [S].[State_ID] = [C].[State_ID]
        JOIN
            [dbo].[Contacts] [Co]
                ON [Co].[ID] = [L].[Contact_ID]
        JOIN
            [dbo].[Accounts] [A] (NOLOCK)
                ON [A].[Acct_ID] = [L].[Acct_ID]
                    OR [A].[Parent_Acct_ID] = [L].[Acct_ID] 
    WHERE
            [L].[Deleted] = 0
        AND
            [S].[Active] = 1
        AND
            [Co].[Deleted] = 0
        AND
            [L].[Acct_ID] = 204940
    GROUP BY
            [C].[County],
            [S].[State],
            [C].[County_ID]
    ORDER BY
            [S].[State],
            [C].[County]

Upvotes: 1

Dan Short
Dan Short

Reputation: 9616

I upvoted all of you who gave me a hand, but this ended up being the solution:

SELECT DISTINCT
    C.County
    , S.State
    , C.County_ID
FROM
    Leads L 
    INNER JOIN Contacts ON L.Contact_ID = Contacts.ID AND Contacts.Deleted = 0
    INNER JOIN Inventory I ON L.Deleted = 0 AND L.Inv_ID = I.Inv_ID
    INNER JOIN Accounts ON L.Acct_ID = Accounts.Acct_ID
        AND ( Accounts.Acct_ID = 204940 OR Accounts.parent_Acct_ID = 204940 )
    INNER JOIN County C ON C.County_ID = I.County_ID
    INNER JOIN State S ON C.State_ID = S.State_ID AND S.Active = 1
ORDER BY
    S.State
    , C.County

Moving to a JOIN instead of a sub-select caused the query to run in 8ms instead of over 30 seconds. I know have straight index seeks in a nice cascade instead of a bunch of backwards filters and sorts to figure out who belongs in the query.

Upvotes: 0

Clevemayer
Clevemayer

Reputation: 75

Try this one:

   SELECT DISTINCT
        C.County
        , S.State
        , C.County_ID
    FROM
        Leads L 
        INNER JOIN Inventory I ON L.Deleted = 0 AND L.Inv_ID = I.Inv_ID (AND  L.Acct_ID = 204940
        OR L.Acct_ID IN (
            SELECT Accounts.Acct_ID FROM Accounts (NOLOCK) WHERE Accounts.Parent_Acct_ID = 204940
        ))
        INNER JOIN County C ON C.County_ID = I.County_ID
        INNER JOIN State S ON C.State_ID = S.State_ID AND S.Active = 1
        INNER JOIN Contacts ON L.Contact_ID = Contacts.ID AND Contacts.Deleted = 0
    ORDER BY
        S.State
        , C.County

Upvotes: 1

Related Questions