tomen

Reputation: 539

Prometheus: How to disable 1 rule for 1 specific job_name?

I'm setting up Prometheus alerts (using elasticsearch_exporter) for 2 Elasticsearch clusters, one with 8 nodes and one with 3 nodes. I want an alert to fire when either cluster loses a node, but currently every rule applies to both clusters, so a single threshold cannot work for both.

prometheus.yml file:

global:
  scrape_interval: 10s

rule_files:
  - alert.rules.yml

alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - localhost:9093

scrape_configs:
 - job_name: cluster1
   scrape_interval: 30s
   scrape_timeout:  30s
   metrics_path: "/metrics"
   static_configs:
   - targets: ['xxx1:9114' ]
     labels:
       service: cluster1
 - job_name: cluster2
   scrape_interval: 30s
   scrape_timeout:  30s
   metrics_path: "/metrics"
   static_configs:
   - targets: ['xxx2:9114' ]
     labels:
       service: cluster2

alert.rules.yml file:

groups:
- name: alert.rules
  rules:
    - alert: ElasticsearchLostNode
      expr: elasticsearch_cluster_health_number_of_nodes < 8
      for: 1m
      labels:
        severity: warning
      annotations:
        summary: Elasticsearch Healthy Nodes (instance {{ $labels.instance }})
        description: Number of healthy nodes is less than 8
...

Of course, number_of_nodes < 8 is always true for the small cluster, and if I set the threshold to < 3, the alert will not fire when the big cluster loses 1 node.

Is there a way to exempt one specific job_name from one specific rule, or to define rule set A that applies only to job_name A and rule set B that applies only to job_name B?

Upvotes: 0

Views: 2092

Answers (1)

Yes. Add a job label matcher to the expression and create one rule per job in the alert.rules.yml file:

groups:
- name: alert.rules
  rules:
    - alert: ElasticsearchLostNode1
      expr: elasticsearch_cluster_health_number_of_nodes{job="cluster1"} < 8
      ...
    - alert: ElasticsearchLostNode2
      expr: elasticsearch_cluster_health_number_of_nodes{job="cluster2"} < 3
      ...
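
For reference, here is a sketch of how the two rules might look once written out in full. The for, labels, and annotations values are simply carried over from the rule in the question and are an assumption about what you want; the descriptions were adjusted to match each threshold. Adapt them as needed.

```yaml
groups:
- name: alert.rules
  rules:
    # Evaluated only against series scraped by the cluster1 job (8 nodes expected).
    - alert: ElasticsearchLostNode1
      expr: elasticsearch_cluster_health_number_of_nodes{job="cluster1"} < 8
      for: 1m            # assumed: same hold time as in the question
      labels:
        severity: warning
      annotations:
        summary: Elasticsearch Healthy Nodes (instance {{ $labels.instance }})
        description: Number of healthy nodes is less than 8
    # Evaluated only against series scraped by the cluster2 job (3 nodes expected).
    - alert: ElasticsearchLostNode2
      expr: elasticsearch_cluster_health_number_of_nodes{job="cluster2"} < 3
      for: 1m            # assumed: same hold time as in the question
      labels:
        severity: warning
      annotations:
        summary: Elasticsearch Healthy Nodes (instance {{ $labels.instance }})
        description: Number of healthy nodes is less than 3
```

Since your scrape config also attaches a service label to each target, a matcher such as {service="cluster1"} should select the same series if you would rather not depend on the job name.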

Upvotes: 3
