Mark Struzinski
Mark Struzinski

Reputation: 33501

SSIS Job Monitoring and Reporting

Our shop relies heavily on SSIS to run our back end processes and database tasks. Overall, we have hundreds of jobs, and for the most part, everything runs efficiently and smoothly.

Most of the time, we have a job failure due to an external dependency failing (data not available, files not delivered, etc). Right now, our process is set up to email us every time a job fails. SSIS will generate an email sending us the name of the job and the step it failed on.

I'm looking at creating a dashboard of sorts to monitor this process more efficiently. I know that the same information available in the Job History window from SSIS is also available by querying the msdb database. I want to set up a central location to report failures (probably using SQL Reporting Services), and also a more intelligent email alert system.

Has anyone else dealt with this issue? If so, what kind of processes/reporting did you create around the SSIS procedures to streamline notification of job failures or alerts?

Upvotes: 4

Views: 6405

Answers (3)

user11011557
user11011557

Reputation: 1

Current SSRS Job monitoring Process: There are no SSRS job monitoring process. If any SSRS job is failed, user creates the incident, then TOPS Reporting and SSRS developer team are started to work on basis of incident. As a result, this process has taken huge turnaround time to resolve this issue.

Proposed SSRS Job monitoring Process: SSRS subscriptions monitoring job will help to TOPS Reporting and SSRS developer for proactively monitoring the SSRS job. This job will create the report to display the failed list of report along with generic execution log status and subscription errors log status. Initially, developer can understand report failure reason from this report and then developer can start working proactively to resolve the issue.

Upvotes: 0

Shane H
Shane H

Reputation: 3293

If you're not looking to do a total custom build out, you can use https://cronitor.io to monitor etl jobs.

Upvotes: 1

Registered User
Registered User

Reputation: 8395

We have a similar setup at our company. We primarily rely on letting the jobs notify us when there is a problem and we have employees who check job statuses at specific times to ensure that everything is working properly and nothing was overlooked.

My team receives a SQL Server Agent Job Activity Report HTML email every morning at 6am and 4pm that lists all failed jobs at the top, running jobs below that, and all other jobs below that grouped into daily, weekly, monthly, quarterly, and other categories. We essentially monitor SQL Server Agent jobs, not the SSIS packages themselves. We rely on job categories and job schedule naming conventions to automate grouping in the report.

We have a similar setup for monitoring our SSRS subscriptions. However, we only monitor this one once a day since most of our subscriptions are triggered around 3am-4am in the morning. The SSRS Subscription Activity Report goes one step further than the SQL Server Agent Job Activity Report in that it has links to the subscription screen for the report and has more exception handling built into it.

Aside from relying on reports, we also have a few jobs that are set to notify the operator via email upon job completion instead of upon job failure. This makes it easy to quickly check if all the major ETL processes have run successfully. It's sort of an early indicator of the health of the system. If we haven't received this email by the time the first team member gets into the office, then we know something is wrong. We also have a series of jobs that will fail with a job error if certain data sources haven't been loaded by a specific time. Before I had someone working an early shift, I use to check my iPhone for the email anytime I woke up in the middle of the night (which happened a lot back then since I had a newborn baby). On the rare occasion that I didn't receive an email indicating everything completed or I received an error regarding a job step, then I logged onto my machine via remote desktop to check the status of the jobs.

I considered having our data center guys check on the status of the servers by running a report each morning around 4am, but in the end I determined this wouldn't be necessary since we have a person who starts work at 6am. The main concern I had about implementing this process is that our ETL changes over time and it would have been necessary for me to maintain documentation on how to check the jobs properly and how to escalate the notifications to my team when a problem was detected. I would be willing to do this if the processes had to run in the middle of the night. However, our ETL runs every hour of the day, so if we had to kick off all our major ETL processes in the early morning, we would still complete loading our data warehouse and publishing reports before anyone made it into the office. Also, our office starts REALLY late for some reason, so people don't normally run our reports interactively until 9am onward.

Upvotes: 2

Related Questions