Reputation: 2135
I am using Prometheus and Grafana to collect and display metrics for a Kubernetes cluster. While collecting memory information, I discovered that one of the worker nodes does not appear in the results for certain metrics, although it does for others. The only thing I can see that might explain this is that the node has a taint applied.
Here is the node taint:
nodeType=runner-node:NoExecute
The rest of the worker nodes have no (obvious) taint. Could this be the reason why nothing is being scraped?
Here is an example of a metric that has information for this node (arc-worker-4):
Query:
machine_memory_bytes{node="arc-worker-4"}
Result:
metric | value |
---|---|
machine_memory_bytes{boot_id="3b6af3e8-d3ae-457a-92be-f7da2adededf", endpoint="https-metrics", instance="172.20.32.14:10250", job="kubelet", machine_id="6c59590e61484bfca6f8da38897d7760", metrics_path="/metrics/cadvisor", namespace="kube-system", node="arc-worker-4", service="prometheus-kube-prometheus-kubelet", system_uuid="c7874d56-2d9d-ce1a-986f-1f549f1784b6"} | 135090417664 |
If I run a query on another metric, I get no result:
Query:
node_memory_MemTotal_bytes{node="arc-worker-4"}
Result:
Empty query result
In the group of metrics named node_memory_..._bytes (of which there are about 50), none have any data for this node. Why? I get data for all other nodes, including the master node.
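To see at a glance which nodes are missing from one metric but present in another, the two query results can be compared by their node label. The following is only a minimal sketch: it assumes the responses have already been fetched from the Prometheus HTTP API (/api/v1/query) and parsed into dicts, and the helper names nodes_in and missing_nodes are hypothetical, not part of any library.

```python
def nodes_in(response, label="node"):
    """Collect the distinct values of `label` from a parsed instant-query response."""
    results = response.get("data", {}).get("result", [])
    return {s["metric"][label] for s in results if label in s.get("metric", {})}


def missing_nodes(reference, candidate, label="node"):
    """Return nodes that appear in `reference` but have no series in `candidate`."""
    return sorted(nodes_in(reference, label) - nodes_in(candidate, label))


if __name__ == "__main__":
    # Hand-written sample data in the shape of a Prometheus query response.
    # "arc-worker-1" is a made-up node name used only for illustration.
    machine = {"data": {"result": [
        {"metric": {"node": "arc-worker-1"}, "value": [0, "1"]},
        {"metric": {"node": "arc-worker-4"}, "value": [0, "1"]},
    ]}}
    memtotal = {"data": {"result": [
        {"metric": {"node": "arc-worker-1"}, "value": [0, "1"]},
    ]}}
    print(missing_nodes(machine, memtotal))
```

Passing the machine_memory_bytes result as the reference and the node_memory_MemTotal_bytes result as the candidate lists the nodes that are being scraped through the kubelet but have no node_memory series.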
Upvotes: 0
Views: 482
Reputation: 2135
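I was able to resolve this problem by adding a toleration to the Prometheus (kube-prometheus-stack) configuration. This allows the node-exporter bundled with the chart to be scheduled onto the node with that taint. The node_memory_..._bytes metrics come from node-exporter, whereas machine_memory_bytes is scraped from the kubelet, which is why only the former were missing. I am now getting results from the node_memory_..._bytes family of metrics.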
What was done:
In the Prometheus Helm chart values.yaml, the following was added:
prometheus-node-exporter:
  tolerations:
    - effect: NoSchedule
      operator: Exists
    - key: nodeType
      operator: Equal
      value: runner-node
      effect: NoExecute
The first toleration is the chart's default, but it has to be repeated here because a tolerations list set in values.yaml replaces the default list rather than being merged with it. I needed to keep it so that the master node would still be scraped.
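To illustrate why the default toleration alone was not enough, here is a rough Python sketch of how Kubernetes matches a toleration against a taint: an empty key or empty effect acts as a wildcard, Exists ignores the value, and Equal compares it. This is a simplified model rather than the scheduler's actual code, the Taint and Toleration classes and the tolerates function are names made up for the example, and the control-plane taint is an assumption about what the master node carries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # empty key matches any taint key
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty effect matches any taint effect


def tolerates(tol, taint):
    """Return True if the toleration matches the taint (simplified model)."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key and tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    if tol.operator == "Equal":
        return tol.value == taint.value
    return False


# The taint from the question: nodeType=runner-node:NoExecute
runner_taint = Taint(key="nodeType", value="runner-node", effect="NoExecute")
# Assumed taint on the master node (not shown in the question).
master_taint = Taint(key="node-role.kubernetes.io/control-plane", effect="NoSchedule")

default_tol = Toleration(operator="Exists", effect="NoSchedule")
runner_tol = Toleration(key="nodeType", operator="Equal",
                        value="runner-node", effect="NoExecute")

if __name__ == "__main__":
    for name, tol in (("default", default_tol), ("runner", runner_tol)):
        print(name, tolerates(tol, runner_taint), tolerates(tol, master_taint))
```

In this model the default toleration covers any NoSchedule taint but not the NoExecute one, and the nodeType toleration covers only the runner node, so both entries are needed to reach every node.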
Upvotes: 0