Reputation: 9616
I'm running what I think is a fairly simple query, and I'm having getting absolutely atrocious return times on the query. When running the query, a sub query as a filter in an IN
predicate is performing an Index scan and an Index Spool when used as part of the query, but runs an Index Seek when used outside of the IN
predicate. I have no idea why, but the query is taking nearly 30 seconds to return zero records...
Here's the query:
SELECT DISTINCT
C.County
, S.State
, C.County_ID
FROM
Leads L
INNER JOIN Inventory I ON L.Deleted = 0 AND L.Inv_ID = I.Inv_ID
INNER JOIN County C ON C.County_ID = I.County_ID
INNER JOIN State S ON C.State_ID = S.State_ID AND S.Active = 1
INNER JOIN Contacts ON L.Contact_ID = Contacts.ID AND Contacts.Deleted = 0
WHERE
L.Acct_ID = 204940
OR L.Acct_ID IN (
SELECT Accounts.Acct_ID FROM Accounts (NOLOCK) WHERE Accounts.Parent_Acct_ID = 204940
)
ORDER BY
S.State
, C.County
Here's the relevant part of the execution plan of that query. It's running a scan on the Accounts table in the WHERE
clause, with an estimated number of rows to be returned of over 260,000. The subselect actually returns 0 records.
When I run that simple select outside the above query, I get a straight index seek, against the exact same Index. The subselect returns 0 records. If I run the query without the OR
statement I get ms response times, if I run the query with the OR
statement, the query takes nearly 30 seconds to complete.
Here's the index that's being used for both the Scan and the Seek on the Accounts table (I know, it's a dta index, not my original database...)
CREATE NONCLUSTERED INDEX [_dta_index_Accounts_5_144719568__K32_50] ON [dbo].[Accounts]
(
[parent_Acct_ID] ASC
)
INCLUDE ( [CompanyState]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
How do I get the subselect to run an Index Seek instead of a Scan, and return in any sort of reasonable amount of time?
Upvotes: 1
Views: 458
Reputation: 35726
I've simplified your query a little and changed the sub-query into a left join.
Give it a go, it should be easier for the engine to optimize or, if not, easier for you to optimize.
Before optimizing ensure your statistics are up to date.
SELECT
[C].[County],
[S].[State],
[C].[County_ID]
FROM
[dbo].[Leads] [L]
JOIN
[dbo].[Inventory] [I]
ON [I].[Inv_ID] = [L].[Inv_ID]
JOIN
[dbo].[County] [C]
ON [C].[County_ID] = [I].[County_ID]
JOIN
[dbo].[State] [S]
ON [S].[State_ID] = [C].[State_ID]
JOIN
[dbo].[Contacts] [Co]
ON [Co].[ID] = [L].[Contact_ID]
LEFT JOIN
[dbo].[Accounts] [A] (NOLOCK)
ON [A].[Acct_ID] = [L].[Acct_ID]
WHERE
[L].[Deleted] = 0
AND
[S].[Active] = 1
AND
[Co].[Deleted] = 0
AND
(
[L].[Acct_ID] = 204940
OR
[A].[Parent_Acct_ID] = 204940
)
GROUP BY
[C].[County],
[S].[State],
[C].[County_ID]
ORDER BY
[S].[State],
[C].[County]
given the extra information in the OP's own answer, the query could be further simplified,
SELECT
[C].[County],
[S].[State],
[C].[County_ID]
FROM
[dbo].[Leads] [L]
JOIN
[dbo].[Inventory] [I]
ON [I].[Inv_ID] = [L].[Inv_ID]
JOIN
[dbo].[County] [C]
ON [C].[County_ID] = [I].[County_ID]
JOIN
[dbo].[State] [S]
ON [S].[State_ID] = [C].[State_ID]
JOIN
[dbo].[Contacts] [Co]
ON [Co].[ID] = [L].[Contact_ID]
JOIN
[dbo].[Accounts] [A] (NOLOCK)
ON [A].[Acct_ID] = [L].[Acct_ID]
OR [A].[Parent_Acct_ID] = [L].[Acct_ID]
WHERE
[L].[Deleted] = 0
AND
[S].[Active] = 1
AND
[Co].[Deleted] = 0
AND
[L].[Acct_ID] = 204940
GROUP BY
[C].[County],
[S].[State],
[C].[County_ID]
ORDER BY
[S].[State],
[C].[County]
Upvotes: 1
Reputation: 9616
I upvoted all of you who gave me a hand, but this ended up being the solution:
SELECT DISTINCT
C.County
, S.State
, C.County_ID
FROM
Leads L
INNER JOIN Contacts ON L.Contact_ID = Contacts.ID AND Contacts.Deleted = 0
INNER JOIN Inventory I ON L.Deleted = 0 AND L.Inv_ID = I.Inv_ID
INNER JOIN Accounts ON L.Acct_ID = Accounts.Acct_ID
AND ( Accounts.Acct_ID = 204940 OR Accounts.parent_Acct_ID = 204940 )
INNER JOIN County C ON C.County_ID = I.County_ID
INNER JOIN State S ON C.State_ID = S.State_ID AND S.Active = 1
ORDER BY
S.State
, C.County
Moving to a JOIN instead of a sub-select caused the query to run in 8ms instead of over 30 seconds. I know have straight index seeks in a nice cascade instead of a bunch of backwards filters and sorts to figure out who belongs in the query.
Upvotes: 0
Reputation: 75
Try this one:
SELECT DISTINCT
C.County
, S.State
, C.County_ID
FROM
Leads L
INNER JOIN Inventory I ON L.Deleted = 0 AND L.Inv_ID = I.Inv_ID (AND L.Acct_ID = 204940
OR L.Acct_ID IN (
SELECT Accounts.Acct_ID FROM Accounts (NOLOCK) WHERE Accounts.Parent_Acct_ID = 204940
))
INNER JOIN County C ON C.County_ID = I.County_ID
INNER JOIN State S ON C.State_ID = S.State_ID AND S.Active = 1
INNER JOIN Contacts ON L.Contact_ID = Contacts.ID AND Contacts.Deleted = 0
ORDER BY
S.State
, C.County
Upvotes: 1