Reputation: 5252
I have a job I need to automate to make sure a cache for some entities in my database is populated. I have the query below using CTE
and CROSS JOIN
but it doesn't run very quickly so I'm sure it can be improved.
The issue:
The goal:
I want to set up a job to take a group of component Ids for 'this months report' and pre-cache the data if it doesn't exist.
I need to get a list of employees and the components they are missing in the cache for this months report. I will then set up a CRON job to process the queue.
The Question
My query below is slow - Is there a more efficient way to return a list of employees and the component ids that are missing in the cache?
The current SQL:
declare @reportDate datetime2 = '2019-10-01'; //the report publish date
declare @componentIds table (id int); // store the ids of each cachable component
insert @componentIds(id) values(1),(2),(3),(4),(5);
;WITH cteCounts
AS (SELECT r.Id as reportId, cs.componentId,
COUNT(1) AS ComponentCount
FROM EmployeeReports r
LEFT OUTER JOIN CacheStore cs on r.Id = cs.reportId and cs.componentId in (SELECT id FROM @componentIds)
GROUP BY r.Id, cs.componentId)
SELECT e.Id, e.name, _c.id as componentId, r.Id as reportId
FROM Employees e
INNER JOIN EmployeeReports r on e.Id = r.employeeId and r.reportDate = @reportDate
CROSS JOIN @componentIds _c
LEFT OUTER JOIN cteCounts AS cn
ON _c.Id = cn.componentId AND r.Id = cn.reportId
WHERE cn.ComponentCount is null
Upvotes: 2
Views: 58
Reputation: 14199
2 things I can suggest doing:
Use NOT EXISTS
instead of a LEFT JOIN + IS NULL
. The execution plan is prone to be different when you tell the engine that you want records that don't have any occurrence in a particular set Vs. joining and making sure that the joined column is null.
SELECT e.Id, e.name, _c.id as componentId, r.Id as reportId
FROM Employees e
INNER JOIN EmployeeReports r on e.Id = r.employeeId and r.reportDate = @reportDate
CROSS JOIN @componentIds _c
WHERE
NOT EXISTS (SELECT 'no record' FROM cteCounts AS cn
WHERE _c.Id = cn.componentId AND r.Id = cn.reportId)
Use temporary tables instead of CTE and/or variable tables. If you have to handle many rows, variable tables don't actually have statistics on and some complex CTE's might actually make lousy execution plans. Try using temporary tables instead of these 2 and see if the performance boosts. Also try creating relevant indexes on them if your row count is high.
Upvotes: 2