Reputation: 65
I am trying to set up some alert rules in Prometheus so that I am alerted when an instance is down, but when I click on the rules icon in the Prometheus UI, no alerting rules are shown.
I am testing this locally on my computer, running Prometheus, Alertmanager, node_exporter and one other app in Docker, with the targets listed in the prometheus.yml file shown below.
Please help...
prometheus.yml PWD - /Users/spencer.ecas/ops/prometheus.yml
global:
  scrape_interval: 15s
  scrape_timeout: 10s
  evaluation_interval: 15s
  external_labels:
    monitor: 'spencer'
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093
rule_files:
  - alert.rules.yml
scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']
        labels:
          group: 'prometheus-server'
  - job_name: 'bis'
    scrape_interval: 5s
    metrics_path: /actor/prometheus
    static_configs:
      - targets: ['host.docker.internal:8790']
        labels:
          group: 'prometheus-bi-sanbox'
  - job_name: "node"
    scrape_interval: 5s
    static_configs:
      - targets: ['host.docker.internal:9100']
        labels:
          group: 'nodeexporter-server'
alert.rules.yml PWD - /Users/spencer.ecas/ops/prometheus/alert.rules.yml
groups:
  - name: alert.rules
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: "critical"
        annotations:
          summary: "Endpoint {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."
      - alert: HostOutOfMemory
        expr: node_memory_MemAvailable / node_memory_MemTotal * 100 < 25
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Host out of memory (instance {{ $labels.instance }})"
          description: "Node memory is filling up (< 25% left)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
      - alert: HostOutOfDiskSpace
        expr: (node_filesystem_avail{mountpoint="/"} * 100) / node_filesystem_size{mountpoint="/"} < 50
        for: 1s
        labels:
          severity: warning
        annotations:
          summary: "Host out of disk space (instance {{ $labels.instance }})"
          description: "Disk is almost full (< 50% left)\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
      - alert: HostHighCpuLoad
        expr: (sum by (instance) (irate(node_cpu{job="node_exporter_metrics",mode="idle"}[5m]))) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Host high CPU load (instance {{ $labels.instance }})"
          description: "CPU load is > 80%\n VALUE = {{ $value }}\n LABELS: {{ $labels }}"
alertmanager.yml PWD - /Users/spencer.ecas/ops/alertmanager/alertmanager.yml
Here I am trying to forward the alerts to my slack channel
global:
  resolve_timeout: 5m
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'slack-notifications'
receivers:
  - name: 'slack-notifications'
    slack_configs:
      - api_url: "https://hooks.slack.com/services/T06J2AUUR/B03CYRJPBPC/HcgsYeG1jjbduwb"
        channel: '#alertmanager'
        send_resolved: true
Upvotes: 0
Views: 2231
Reputation: 486
Everything seems properly configured, so the issue is probably how you spun up the Prometheus and Alertmanager servers referenced in the prometheus.yml file.
Secondly, in your prometheus.yml file, are you sure that Prometheus is actually reading the alert rules from this entry?
rule_files:
  - alert.rules.yml
Please edit the prometheus.yml file and use this path under rule_files instead:
rule_files:
  - "/etc/prometheus/alert.rules.yml"
I suggest that you remove both the Alertmanager and Prometheus containers and start Prometheus again with the command below. The reason for mounting alert.rules.yml into the container together with prometheus.yml is so that the rules file is present inside the Prometheus container, since the Prometheus server is what evaluates the rules and triggers the alerts.
Before using the command, make sure the directory /Users/spencer.ecas/ops/prometheus exists and that the prometheus.yml file is inside it.
The command uses $(pwd), so run it from that directory.
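Before starting the container, it may be worth confirming that both files really are in the directory you run the command from. With -v, Docker creates a missing host path as an empty directory instead of reporting an error, so a misplaced file can go unnoticed. A minimal sketch, where check_mounts is a hypothetical helper name and not part of Docker or Prometheus:

```shell
# check_mounts DIR: succeed only if both files that the docker run command
# mounts exist as regular files in DIR. (Hypothetical helper for illustration.)
check_mounts() {
  dir="$1"
  for f in prometheus.yml alert.rules.yml; do
    if [ ! -f "$dir/$f" ]; then
      # Report the first missing file and stop.
      echo "missing: $dir/$f"
      return 1
    fi
  done
  echo "ok"
}
```

Call it as check_mounts "$(pwd)" from /Users/spencer.ecas/ops/prometheus. If it reports a missing file, move that file into the directory before running docker run.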
docker run -d --name prometheus_ops -p 9191:9090 -v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml -v $(pwd)/alert.rules.yml:/etc/prometheus/alert.rules.yml prom/prometheus
This is just a better display of the command above, split across lines. Treat them as the same.
docker run -d --name prometheus_ops -p 9191:9090 \
  -v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml \
  -v $(pwd)/alert.rules.yml:/etc/prometheus/alert.rules.yml \
  prom/prometheus
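Once the container is running, you can check whether the rules were picked up without relying on the UI. The sketch below assumes the container name prometheus_ops and host port 9191 from the command above, that promtool is on the PATH inside the prom/prometheus image, and that the API returns compact JSON. The function names are hypothetical.

```shell
# Assumed from the docker run command above: container name and host port.
PROM_URL="${PROM_URL:-http://localhost:9191}"

# Validate prometheus.yml and the rule files it references, inside the container.
check_config() {
  docker exec prometheus_ops promtool check config /etc/prometheus/prometheus.yml
}

# Read a /api/v1/rules JSON response on stdin and print every "name" value
# (rule group names and rule names), one per line.
extract_names() {
  grep -o '"name":"[^"]*"' | cut -d'"' -f4
}

# Ask the running server which rule groups and rules it has loaded.
# Empty output means no rules were loaded.
list_loaded_rules() {
  curl -s "$PROM_URL/api/v1/rules" | extract_names
}
```

Run check_config first, then list_loaded_rules. If the second prints nothing, the rule file is most likely not being read from the path given under rule_files.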
Upvotes: 3