Reputation: 1985
I have two queries both give the same result, but one is faster than the other
Query one
SELECT
MP.MilestonePlanningID, SUM(TS.UnitsUsed) TotalUnits
FROM
MilestonePlanning MP
INNER JOIN
Timesheet TS ON MP.MilestonePlanningID = TS.MilestonePlanningID
INNER JOIN
PlanningAction PA ON MP.LastPlanningActionID = PA.ActionID
WHERE
PA.ActionNameID = 4
AND PA.ActionDateTime >= DATEADD(MONTH,-1,GETDATE())
GROUP BY
MP.MilestonePlanningID
Query two
SELECT MP.MilestonePlanningID, SUM(TS.UnitsUsed) TotalUnits
FROM MilestonePlanning MP
INNER JOIN Timesheet TS ON MP.MilestonePlanningID = TS.MilestonePlanningID
INNER JOIN PlanningAction PA ON MP.LastPlanningActionID = PA.ActionID
WHERE PA.ActionNameID = 4
AND CAST(PA.ActionDateTime AS TIMESTAMP) >= CAST(DATEADD(MONTH,-1,GETDATE()) AS TIMESTAMP)
GROUP BY MP.MilestonePlanningID
Query one is the most obvious one to use and it is the one I started BUT it takes about 9 - 15 seconds to execute, however it remains instant when I remove the following
AND PA.ActionDateTime >= DATEADD(MONTH,-1,GETDATE())
Query two is instant with and without the following line
AND CAST(PA.ActionDateTime AS TIMESTAMP) >= CAST(DATEADD(MONTH,-1,GETDATE()) AS TIMESTAMP)
Query two is also not the most obvious way to look at data from last month to today. can anyone explain to me what is happening here, and what I am doing wrong in query one
Upvotes: 2
Views: 157
Reputation: 69759
Impossible to say for sure without the execution plans, but my guess would be that the statistics are a bit out of date, as can happen with ascending keys, so the cardinality estimation is wrong. When you include this line:
PA.ActionDateTime >= DATEADD(MONTH,-1,GETDATE())
I suspect SQL Server determines very few rows will be returned, so opts for a plan based on this, perhaps a nested loop join somewhere, or a bookmark lookup, where as a wider plan should be generated, using a hash/merge join or clustered index scan respectively.
The reason the second query works instantly with and without the date filter, is because the cast to timestamp, which is basically a cast to binary means SQL Server is unable to use the statistics on ActionDateTime
, and is therefore forced into a wider plan because it is unable to determine a reasonable estimate for the number of matching rows.
Edit
Having seen the plans my suspicions are confirmed. The cardinality estimate is out, SQL server assumes incorrectly that there are very few rows in PlanningAction
with ActionDateTime
in the last month, it therefore determines that the best
way to go about gettting the rows is to get all of the matching rows from PlanningAction
, then for each of these rows go into the other two tables and get the data. For a few rows this is more efficient than first getting the data from MilestonePlanning
and
Timesheet
then removing most of the data. For a lot of rows this is hugely inefficient.
With no where clause SQL Server realises that this will be a poor choice of plan, so uses hash matches instead of nested loop joins, which is much better suited to handling more rows of data.
The fix is to simply update your statistics, and probably create a maintenance plan to ensure the statistics are updated more frequently. Or to force the MERGE JOIN
:
SELECT MP.MilestonePlanningID, SUM(TS.UnitsUsed) TotalUnits
FROM MilestonePlanning MP
INNER MERGE JOIN Timesheet TS ON MP.MilestonePlanningID = TS.MilestonePlanningID
INNER MERGE JOIN PlanningAction PA ON MP.LastPlanningActionID = PA.ActionID
WHERE PA.ActionNameID = 4 AND PA.ActionDateTime >= DATEADD(MONTH,-1,GETDATE())
GROUP BY MP.MilestonePlanningID;
You should use this with caution, because someone may come along and only want to query the last hour, where the nested loop join may be more efficient, but only change the DATEADD
bit, and you may again have a performance problem.
You could also an index to assist with the queries, but if they are already instant it is probably not worth the extra maintenance cost of having an index.
Upvotes: 0
Reputation: 1269633
For this query:
SELECT MP.MilestonePlanningID, SUM(TS.UnitsUsed) TotalUnits
FROM MilestonePlanning MP INNER JOIN
Timesheet TS
ON MP.MilestonePlanningID = TS.MilestonePlanningID INNER JOIN
PlanningAction PA
ON MP.LastPlanningActionID = PA.ActionID
WHERE PA.ActionNameID = 4 AND PA.ActionDateTime >= DATEADD(MONTH,-1,GETDATE())
GROUP BY MP.MilestonePlanningID;
I would suggest indexes on PlanningAction(ActionNameId, ActionDateTime, ActionId)
, MilestonePlanning(LastPlanningActionID, MilestonePlanningID)
and Timesheet(MilestonePlanningID)
.
The best guess (without an execution plan) is that SQL Server is making a poor choice of indexes in the first query. Timestamp
is not really something you should be casting to. It is used for row versioning. No doubt, this prevents the use of an index on ActionDateTime
, which probably helps the query.
Upvotes: 2