u123

Reputation: 16267

The node had condition: [DiskPressure] causing pod eviction in k8s on Azure/AKS

I am running k8s 1.14 on Azure (AKS).

Some of my pods in the cluster keep getting evicted.

As an example:

$ kubectl describe pod kube-prometheus-stack-prometheus-node-exporter-j8nkd
...
Events:
  Type     Reason     Age    From                             Message
  ----     ------     ----   ----                             -------
  Normal   Scheduled  3m22s  default-scheduler                Successfully assigned monitoring/kube-prometheus-stack-prometheus-node-exporter-j8nkd to aks-default-2678****
  Warning  Evicted    3m22s  kubelet, aks-default-2678****  The node had condition: [DiskPressure].

Which I can also confirm by:

$ kubectl describe node aks-default-2678****
...
Unschedulable:      false
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Wed, 27 Nov 2019 22:06:08 +0100   Wed, 27 Nov 2019 22:06:08 +0100   RouteCreated                 RouteController created a route
  MemoryPressure       False   Fri, 23 Oct 2020 15:35:52 +0200   Mon, 25 May 2020 18:51:40 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         True    Fri, 23 Oct 2020 15:35:52 +0200   Sat, 05 Sep 2020 14:36:59 +0200   KubeletHasDiskPressure       kubelet has disk pressure

Since this is a managed Azure k8s cluster, I don't have access to the kubelet on the nodes or to the master nodes. Is there anything I can do to investigate/debug this problem without SSH access to the nodes?

Also, I assume this comes from storage on the nodes themselves and not from the PVs/PVCs that have been mounted into the pods. So how do I get an overview of storage consumption on the worker nodes without SSH access?

Upvotes: 0

Views: 5117

Answers (1)

Matt

Reputation: 8132

So how do I get an overview of storage consumption on the worker nodes without SSH access?

You can create a privileged pod like the following:

apiVersion: v1
kind: Pod
metadata:
  labels:
    run: privileged-pod
  name: privileged-pod
spec:
  hostIPC: true        # share the node's IPC namespace
  hostNetwork: true    # share the node's network namespace
  hostPID: true        # share the node's PID namespace
  containers:
  - args:
    - sleep
    - "9999"
    image: centos:7
    name: privileged-pod
    volumeMounts:
    - name: host-root-volume
      mountPath: /host   # the node's root filesystem will be visible under /host
      readOnly: true
  volumes:
  - name: host-root-volume
    hostPath:
      path: /            # mount the node's root filesystem
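Assuming you save the manifest above to a file (the name privileged-pod.yaml is just an example), create the pod with:

kubectl apply -f privileged-pod.yaml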

Then exec into it:

kubectl exec -it privileged-pod -- chroot /host

and then you have access to the whole node's filesystem, just like you would using SSH.
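Once inside the chroot, the usual disk tools work against the node's filesystem, which answers the "overview of storage consumption" part of the question. A quick sketch of where to look (the paths below are typical suspects on a Kubernetes node, not guaranteed to be the culprits on your AKS nodes):

df -h                        # overall filesystem usage on the node
du -sh /var/lib/docker       # container images and writable layers (if the runtime is Docker)
du -sh /var/lib/kubelet      # emptyDir volumes and other pod ephemeral storage
du -sh /var/log              # node and container logs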

Note: if your k8s user has a PodSecurityPolicy attached, you may not be able to do this if setting hostIPC, hostNetwork and hostPID is disallowed.
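To check whether such a policy is in play (assuming you have permission to read PodSecurityPolicies), you can inspect them and look at the relevant fields:

kubectl get psp
kubectl get psp <psp-name> -o yaml    # check spec.hostIPC, spec.hostNetwork, spec.hostPID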

You also need to make sure that the pod gets scheduled on the specific node that you want to access. Use .spec.nodeName: <name> to achieve it, as shown below.
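In the manifest above that is a single extra field under spec, with <name> taken from kubectl get nodes (e.g. the node currently reporting DiskPressure); the rest of the spec stays as shown:

spec:
  nodeName: <name>   # bypass the scheduler and pin the pod to this exact node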

Upvotes: 3
