Michael

Reputation: 3366

Production-grade methodology for alerts

Background

Our code is written with:

  1. Unit tests
  2. End to end tests
  3. Code review
  4. Staging process
  5. Deployment process

In contrast, our alerts are simply written once and then occasionally modified by hand, with no quality process at all.

That is reasonable for simple threshold checks. However, some of our alerts are built on complicated queries, sometimes around 20 lines long.

If we accidentally break an alert, we are exposed to production instability, because we would no longer be notified when the logic or component it monitors fails.

The question

Is there a recommended methodology for validating the quality of complicated alerts?

P.S.

We're using Splunk alerts.

Upvotes: 0

Views: 42

Answers (1)

RichG

Reputation: 9926

Splunk does not have a documented practice for validating alerts, if that's what you are looking for. I suggest you follow a practice similar to the one you use for code. Unit testing is not possible, but you can test modified alerts on a non-production system, using either a sample of production data or synthesized data.
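As a rough illustration of the synthesized-data approach, here is a minimal Python sketch of a regression harness for a single alert. Everything in it is hypothetical: `AlertCase`, `check_alert`, and the `run_search` callable are names invented for this example, not part of Splunk. It also assumes the alert triggers when its search returns at least one result. `fake_runner` is only a stand-in so the sketch runs on its own; in practice you would replace it with whatever loads the events into your non-production system and runs the alert's search there.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Event = Dict[str, str]

# Hypothetical interface: given the alert's query and some synthesized events,
# return the rows the search produces on a non-production system.
SearchRunner = Callable[[str, List[Event]], List[Event]]


@dataclass
class AlertCase:
    name: str            # human-readable description of the scenario
    events: List[Event]  # synthesized input events
    should_fire: bool    # whether the alert is expected to trigger


def check_alert(query: str, cases: List[AlertCase],
                run_search: SearchRunner) -> List[str]:
    """Return the names of cases whose outcome differs from the expectation."""
    failures = []
    for case in cases:
        # Assumption: the alert fires when the search returns any rows.
        fired = len(run_search(query, case.events)) > 0
        if fired != case.should_fire:
            failures.append(case.name)
    return failures


def fake_runner(query: str, events: List[Event]) -> List[Event]:
    """Stand-in for a real search runner. It ignores the query text and mimics
    an alert that returns one row per host with more than 3 ERROR events."""
    counts: Dict[str, int] = {}
    for event in events:
        if event.get("level") == "ERROR":
            counts[event["host"]] = counts.get(event["host"], 0) + 1
    return [{"host": h, "count": str(c)} for h, c in counts.items() if c > 3]


cases = [
    AlertCase("error burst fires",
              [{"host": "web1", "level": "ERROR"}] * 5, True),
    AlertCase("quiet period stays silent",
              [{"host": "web1", "level": "INFO"}] * 5, False),
    AlertCase("below threshold stays silent",
              [{"host": "web1", "level": "ERROR"}] * 3, False),
]

failures = check_alert("<alert query under test>", cases, fake_runner)
print(failures or "all alert cases passed")
```

Keeping the cases next to the alert definition in version control would let you rerun them after every modification, which gets you closer to the review and staging steps you already apply to code.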

Upvotes: 2
