scgough
scgough

Reputation: 5252

TSQL - Return all record ids and the missing id's we need to populate

I have a job I need to automate to make sure a cache for some entities in my database is populated. I have the query below using CTE and CROSS JOIN but it doesn't run very quickly so I'm sure it can be improved.

The issue:

The goal:

I want to set up a job to take a group of component Ids for 'this months report' and pre-cache the data if it doesn't exist.

I need to get a list of employees and the components they are missing in the cache for this months report. I will then set up a CRON job to process the queue.

The Question

My query below is slow - Is there a more efficient way to return a list of employees and the component ids that are missing in the cache?

The current SQL:

declare @reportDate datetime2 = '2019-10-01'; //the report publish date
declare @componentIds table (id int);         // store the ids of each cachable component
insert @componentIds(id) values(1),(2),(3),(4),(5);   

;WITH cteCounts 
AS (SELECT r.Id as reportId, cs.componentId,
   COUNT(1) AS ComponentCount
   FROM EmployeeReports r
   LEFT OUTER JOIN CacheStore cs on r.Id = cs.reportId and cs.componentId in (SELECT id FROM @componentIds)
   GROUP BY r.Id, cs.componentId)

SELECT e.Id, e.name, _c.id as componentId, r.Id as reportId
FROM Employees e
INNER JOIN EmployeeReports r on e.Id = r.employeeId and r.reportDate = @reportDate 
CROSS JOIN @componentIds _c
LEFT OUTER JOIN cteCounts AS cn
    ON _c.Id = cn.componentId AND r.Id = cn.reportId
WHERE cn.ComponentCount is null

Upvotes: 2

Views: 58

Answers (1)

EzLo
EzLo

Reputation: 14199

2 things I can suggest doing:

  • Use NOT EXISTS instead of a LEFT JOIN + IS NULL. The execution plan is prone to be different when you tell the engine that you want records that don't have any occurrence in a particular set Vs. joining and making sure that the joined column is null.

    SELECT e.Id, e.name, _c.id as componentId, r.Id as reportId
    FROM Employees e
    INNER JOIN EmployeeReports r on e.Id = r.employeeId and r.reportDate = @reportDate 
    CROSS JOIN @componentIds _c
    WHERE
        NOT EXISTS (SELECT 'no record' FROM cteCounts AS cn
                    WHERE _c.Id = cn.componentId AND r.Id = cn.reportId)
    
  • Use temporary tables instead of CTE and/or variable tables. If you have to handle many rows, variable tables don't actually have statistics on and some complex CTE's might actually make lousy execution plans. Try using temporary tables instead of these 2 and see if the performance boosts. Also try creating relevant indexes on them if your row count is high.

Upvotes: 2

Related Questions