Johnathan

Reputation: 875

alerting_rules.yml in helm values.yaml

I have installed Prometheus into an AWS EKS Kubernetes cluster using a Helm chart, and I am now trying to add an alert in the chart's values.yaml file.

There is already an example in the file that looks like this:

## Prometheus server ConfigMap entries
##
serverFiles:

  ## Alerts configuration
  ## Ref: https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/
  alerting_rules.yml: {}
  # groups:
  #   - name: Instances
  #     rules:
  #       - alert: InstanceDown
  #         expr: up == 0
  #         for: 5m
  #         labels:
  #           severity: page
  #         annotations:
  #           description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'
  #           summary: 'Instance {{ $labels.instance }} down'

When I uncomment this example and try to update the Helm deployment, I get this error: Error: cannot load values.yaml: error converting YAML to JSON: yaml: line 1282: did not find expected node content

The line it complains about is the groups: line in this block:

serverFiles:


  ## Alerts configuration
  ## Ref: https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/
  alerting_rules.yml: {
  groups:
  - name: Instances
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: page
        annotations:
          description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'
          summary: 'Instance {{ $labels.instance }} down'
  }

I'm not sure what I am doing wrong here.

I have tried another alert, but it gives the same error:

serverFiles:


  ## Alerts configuration
  ## Ref: https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/
  alerting_rules.yml: {
    groups:
      - name: pod restarted
        rules:
        - alert: PodRestarted
          expr: job:rate(kube_pod_container_status_restarts_total[1h]) * 3600 > 1
          for: 5s
          labels:
            severity: High
          annotations:
            summary: Pod restarted
  }

Upvotes: 1

Views: 2944

Answers (1)

Johnathan

Reputation: 875

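Removing the {} solved it. The braces start a YAML flow-style mapping, and indented block-style content such as the groups: list cannot be nested inside one, so the parser rejects the file.

Example: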

serverFiles:


  ## Alerts configuration
  ## Ref: https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/
  alerting_rules.yml:
    groups:
      - name: pod restarted
        rules:
        - alert: PodRestarted
          expr: kube_pod_container_status_restarts_total < 1
          for: 0s
          labels:
            severity: High
          annotations:
            summary: Pod restarted
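
For reference, the same fix applied to the chart's own InstanceDown example from the question might look like the sketch below. It assumes the rule text is exactly what was in the commented-out block; the only changes are dropping the {} after alerting_rules.yml: and uncommenting the rest so that it stays indented under the key.

```yaml
serverFiles:
  ## Alerts configuration
  ## Ref: https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/
  # No "{}" after the key: the indented lines below are its value.
  alerting_rules.yml:
    groups:
      - name: Instances
        rules:
          - alert: InstanceDown
            expr: up == 0
            for: 5m
            labels:
              severity: page
            annotations:
              description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'
              summary: 'Instance {{ $labels.instance }} down'
```

The {} in the chart's default is only an empty-mapping placeholder. Once real rules go under the key, it has to be removed rather than wrapped around them.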

Upvotes: 1
