Reputation: 110
I've been attempting to use Prometheus to monitor pod statistics such as `http_request_rate` and/or `packets_per_second`. To do so, I was planning on using the Prometheus Adapter, which, from what I've read, requires the Prometheus Operator.
I've had issues installing the Prometheus Operator from the Helm stable charts. When running the installation command `helm install prom stable/prometheus-operator`, the following warning is displayed six times:

manifest_sorter.go:192 info: skipping unknown hook: "crd-install"
The installation continues and the pods are deployed; however, the prometheus-node-exporter pod goes into `CrashLoopBackOff`. I can't see a detailed reason for this, as the only message when describing the pod is "Back-off restarting failed container".
I'm running Minikube 1.7.2 and Helm 3.1.1.
>>>Update<<<
Output of describing the problematic pod:
$ kubectl describe pod prom-oper-prometheus-node-exporter-2m6vm -n default

Name:           prom-oper-prometheus-node-exporter-2m6vm
Namespace:      default
Priority:       0
Node:           max-ubuntu/10.2.40.198
Start Time:     Wed, 04 Mar 2020 18:06:44 +0000
Labels:         app=prometheus-node-exporter
                chart=prometheus-node-exporter-1.8.2
                controller-revision-hash=68695df4c5
                heritage=Helm
                jobLabel=node-exporter
                pod-template-generation=1
                release=prom-oper
Annotations:    <none>
Status:         Running
IP:             10.2.40.198
IPs:
  IP:           10.2.40.198
Controlled By:  DaemonSet/prom-oper-prometheus-node-exporter
Containers:
  node-exporter:
    Container ID:  docker://50b2398f72a0269672c4ac73bbd1b67f49732362b4838e16cd10e3a5247fdbfe
    Image:         quay.io/prometheus/node-exporter:v0.18.1
    Image ID:      docker-pullable://quay.io/prometheus/node-exporter@sha256:a2f29256e53cc3e0b64d7a472512600b2e9410347d53cdc85b49f659c17e02ee
    Port:          9100/TCP
    Host Port:     9100/TCP
    Args:
      --path.procfs=/host/proc
      --path.sysfs=/host/sys
      --web.listen-address=0.0.0.0:9100
      --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
      --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 04 Mar 2020 18:10:10 +0000
      Finished:     Wed, 04 Mar 2020 18:10:10 +0000
    Ready:          False
    Restart Count:  5
    Liveness:       http-get http://:9100/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:9100/ delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /host/proc from proc (ro)
      /host/sys from sys (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from prom-oper-prometheus-node-exporter-token-n9dj9 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  proc:
    Type:          HostPath (bare host directory volume)
    Path:          /proc
    HostPathType:
  sys:
    Type:          HostPath (bare host directory volume)
    Path:          /sys
    HostPathType:
  prom-oper-prometheus-node-exporter-token-n9dj9:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  prom-oper-prometheus-node-exporter-token-n9dj9
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     :NoSchedule
                 node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/network-unavailable:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/pid-pressure:NoSchedule
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
Events:
  Type     Reason     Age                    From                 Message
  ----     ------     ----                   ----                 -------
  Normal   Scheduled  5m26s                  default-scheduler    Successfully assigned default/prom-oper-prometheus-node-exporter-2m6vm to max-ubuntu
  Normal   Started    4m28s (x4 over 5m22s)  kubelet, max-ubuntu  Started container node-exporter
  Normal   Pulled     3m35s (x5 over 5m24s)  kubelet, max-ubuntu  Container image "quay.io/prometheus/node-exporter:v0.18.1" already present on machine
  Normal   Created    3m35s (x5 over 5m24s)  kubelet, max-ubuntu  Created container node-exporter
  Warning  BackOff    13s (x30 over 5m18s)   kubelet, max-ubuntu  Back-off restarting failed container
Output of the problematic pod's logs:
$ kubectl logs prom-oper-prometheus-node-exporter-2m6vm -n default

time="2020-03-04T18:18:02Z" level=info msg="Starting node_exporter (version=0.18.1, branch=HEAD, revision=3db77732e925c08f675d7404a8c46466b2ece83e)" source="node_exporter.go:156"
time="2020-03-04T18:18:02Z" level=info msg="Build context (go=go1.12.5, user=root@b50852a1acba, date=20190604-16:41:18)" source="node_exporter.go:157"
time="2020-03-04T18:18:02Z" level=info msg="Enabled collectors:" source="node_exporter.go:97"
time="2020-03-04T18:18:02Z" level=info msg=" - arp" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - bcache" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - bonding" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - conntrack" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - cpu" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - cpufreq" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - diskstats" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - edac" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - entropy" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - filefd" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - filesystem" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - hwmon" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - infiniband" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - ipvs" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - loadavg" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - mdadm" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - meminfo" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - netclass" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - netdev" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - netstat" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - nfs" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - nfsd" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - pressure" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - sockstat" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - stat" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - textfile" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - time" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - timex" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - uname" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - vmstat" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - xfs" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg=" - zfs" source="node_exporter.go:104"
time="2020-03-04T18:18:02Z" level=info msg="Listening on 0.0.0.0:9100" source="node_exporter.go:170"
time="2020-03-04T18:18:02Z" level=fatal msg="listen tcp 0.0.0.0:9100: bind: address already in use" source="node_exporter.go:172"
Upvotes: 0
Views: 3638
Reputation: 110
This issue turned out to be caused by the fact that Minikube was being run with `--vm-driver=none`. To solve the issue, Minikube was rebuilt using `--vm-driver=kvm2` with `--memory=6g`. This allowed stable/prometheus-operator to install and all pods ran without crashing.
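For reference, a minimal sketch of the rebuild, assuming it's acceptable to throw away the existing cluster (any extra flags beyond the driver and memory are up to your environment):

minikube delete
minikube start --vm-driver=kvm2 --memory=6g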
Upvotes: 1
Reputation: 14102
This is one of the known issues related to Helm 3. It affected many charts, such as argo or ambassador. In the Helm docs you can find the information that the `crd-install` hook was removed:

> Note that the `crd-install` hook has been removed in favor of the `crds/` directory in Helm 3.
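For illustration only (the stable chart here still uses the old hook, so this is not something you need to change): a chart that ships CRDs the Helm 3 way would lay them out roughly like this, with `mychart` and the file names being hypothetical:

mychart/
  Chart.yaml
  crds/            # applied by Helm 3 before templates are rendered, no hooks required
    my-crd.yaml
  templates/
    deployment.yaml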
I've deployed this chart myself and also got the message that Helm skipped the unknown hook, but I didn't have any issue with the pods.
An alternative approach is to create the CRDs before installing the chart. Steps to do that can be found here. In the first step you have the commands to create the CRDs (a quick verification command follows the list below):
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.36/example/prometheus-operator-crd/monitoring.coreos.com_alertmanagers.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.36/example/prometheus-operator-crd/monitoring.coreos.com_podmonitors.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.36/example/prometheus-operator-crd/monitoring.coreos.com_prometheuses.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.36/example/prometheus-operator-crd/monitoring.coreos.com_prometheusrules.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.36/example/prometheus-operator-crd/monitoring.coreos.com_servicemonitors.yaml
kubectl apply -f https://raw.githubusercontent.com/coreos/prometheus-operator/release-0.36/example/prometheus-operator-crd/monitoring.coreos.com_thanosrulers.yaml
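Before installing the chart you can confirm the CRDs were registered, for example with something like this (a sketch, not part of the original steps):

kubectl get crd | grep monitoring.coreos.com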
The last step is to execute `helm install`:
helm install --name my-release stable/prometheus-operator --set prometheusOperator.createCustomResource=false
But Helm 3 will not recognize the `--name` flag:

Error: unknown flag: --name

You have to remove this flag, so the command should look like:
$ helm install prom-oper stable/prometheus-operator --set prometheusOperator.createCustomResource=false
NAME: prom-oper
LAST DEPLOYED: Wed Mar 4 14:12:35 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
The Prometheus Operator has been installed. Check its status by running:
kubectl --namespace default get pods -l "release=prom-oper"
$ kubectl get pods
NAME                                                      READY   STATUS    RESTARTS   AGE
alertmanager-prom-oper-prometheus-opera-alertmanager-0    2/2     Running   0          9m46s
...
prom-oper-prometheus-node-exporter-25b27                  1/1     Running   0          9m56s
If you have any issues with the repo itself, you just need to execute:
helm repo add stable https://kubernetes-charts.storage.googleapis.com
helm repo update
If this alternative approach doesn't help, please add to your question the output of:

kubectl describe pod <pod-name> -n <pod-namespace>
kubectl logs <pod-name> -n <pod-namespace>
Upvotes: 3