Reputation: 7402
We utilize SSIS (SQL/BIDS 2008) to generate close to 1000 reports per day. The same package is scheduled 5 times in SQL Agent, on staggered schedules.
The SSIS queries a Report
table, and if the report hasn't run yet on the given date (check ReportLog
) and that report is not running currently (check WorkQueue
) the first (top) report is picked up, and ran.
That query is effectively:
SELECT TOP 1 R.ReportID
FROM Report R
LEFT JOIN ReportLog L ON R.ReportID=L.ReportID AND L.RunDate >= CONVERT(DATE,GETDATE())
LEFT JOIN WorkQueue Q ON R.ReportID=Q.ReportID
WHERE R.Active=1
AND L.ReportID IS NULL
AND Q.ReportID IS NULL
The SSIS package, upon selecting this TOP 1 ReportID
(stored in a SSIS variable), inserts the ReportID
into WorkQueue
, so no other instances of the package will try to use that ReportID
. The insert happens in the next step after the select.
Most times, most days, this works fine. However, every so often two SSIS packages end up running at (within a few milliseconds) the exact same time, and both return the same TOP 1 ReportID
causing issues with one another as they both execute the same procedures, and manipulate the same underlying data.
Right now, we're considering implementing a parent SSIS package which does the select
and places the ReportID
into a "buffer" table along with the Job# of one of the 5 child packages to pick up (rather than the child packages all doing their own select top 1
), but that seems very hacky.
Another option we considered was an instead of insert
trigger on the WorkQueue
table which would raise an error on duplicate insertions.
I am wondering if there is any other way within SSIS to prevent this situation without too much redesign.
Upvotes: 1
Views: 186
Reputation: 61249
How about avoid the trigger and make use of the OUTPUT clause?
INSERT INTO
WorkQueue
OUTPUT Inserted.ReportID
SELECT TOP 1 R.ReportID
FROM Report R
LEFT JOIN ReportLog L ON R.ReportID=L.ReportID AND L.RunDate >= CONVERT(DATE,GETDATE())
LEFT JOIN WorkQueue Q ON R.ReportID=Q.ReportID
WHERE R.Active=1
AND L.ReportID IS NULL
AND Q.ReportID IS NULL;
This should allow you to pop row row into the workQueue table to claim it for the current process and simultaneously retrieve the value for SSIS to use.
Upvotes: 2