DarkSkull

Reputation: 1085

Pod in Kubernetes always in pending state

I have a problem with Kubernetes running in a CentOS virtual machine on CloudStack. My pods remain in the Pending state. I get the following error message when I print the logs for a pod:

    [root@kubernetes-master ~]# kubectl logs wildfly-rc-6a0fr
    Error from server: Internal error occurred: Pod "wildfly-rc-6a0fr" in namespace "default" : pod is not in 'Running', 'Succeeded' or 'Failed' state - State: "Pending"

If I run the describe command on the pod, this is the result:

    [root@kubernetes-master ~]# kubectl describe pod wildfly-rc-6a0fr
    Name:               wildfly-rc-6a0fr
    Namespace:          default
    Image(s):           jboss/wildfly
    Node:               kubernetes-minion1/
    Start Time:         Sun, 03 Apr 2016 15:00:20 +0200
    Labels:             name=wildfly
    Status:             Pending
    Reason:
    Message:
    IP:
    Replication Controllers:    wildfly-rc (2/2 replicas created)
    Containers:
      wildfly-rc-pod:
        Container ID:
        Image:      jboss/wildfly
        Image ID:
        QoS Tier:
          cpu:      BestEffort
          memory:       BestEffort
        State:      Waiting
        Ready:      False
        Restart Count:  0
        Environment Variables:
    Volumes:
      default-token-0dci1:
        Type:   Secret (a secret that should populate this volume)
        SecretName: default-token-0dci1
    Events:
      FirstSeen LastSeen    Count   From                SubobjectPath               Reason  Message
      ───────── ────────    ─────   ────                ─────────────               ──────  ───────
      8m        8m      1   {kubelet kubernetes-minion1}    implicitly required container POD   Pulled  Container image "registry.access.redhat.com/rhel7/pod-infrastructure:latest" already present on machine
      8m        8m      1   {kubelet kubernetes-minion1}    implicitly required container POD   Created Created with docker id 97c1a3ea4aa5
      8m        8m      1   {kubelet kubernetes-minion1}    implicitly required container POD   Started Started with docker id 97c1a3ea4aa5
      8m        8m      1   {kubelet kubernetes-minion1}    spec.containers{wildfly-rc-pod}     Pulling pulling image "jboss/wildfly"

The kubelet has some errors, which I print below. Could this be because the VM has only 5 GB of storage?

    systemctl status -l kubelet
    ● kubelet.service - Kubernetes Kubelet Server
       Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
       Active: active (running) since lun 2016-04-04 08:08:59 CEST; 9min ago
         Docs: https://github.com/GoogleCloudPlatform/kubernetes
     Main PID: 2112 (kubelet)
       Memory: 39.3M
       CGroup: /system.slice/kubelet.service
               └─2112 /usr/bin/kubelet --logtostderr=true --v=0 --api-servers=http://kubernetes-master:8080 --address=0.0.0.0 --allow-privileged=false --pod-infra-container-image=registry.access.redhat.com/rhel7/pod-infrastructure:latest

    apr 04 08:13:33 kubernetes-minion1 kubelet[2112]: W0404 08:13:33.877859    2112 kubelet.go:1690] Orphaned volume "167d0ead-fa29-11e5-bddc-064278000020/default-token-0dci1" found, tearing down volume
    apr 04 08:13:53 kubernetes-minion1 kubelet[2112]: W0404 08:13:53.887279    2112 kubelet.go:1690] Orphaned volume "9f772358-fa2b-11e5-bddc-064278000020/default-token-0dci1" found, tearing down volume
    apr 04 08:14:35 kubernetes-minion1 kubelet[2112]: I0404 08:14:35.341994    2112 provider.go:91] Refreshing cache for provider: *credentialprovider.defaultDockerConfigProvider
    apr 04 08:14:35 kubernetes-minion1 kubelet[2112]: E0404 08:14:35.397168    2112 manager.go:1867] Failed to create pod infra container: impossible: cannot find the mounted volumes for pod "wildfly-rc-oroab_default"; Skipping pod "wildfly-rc-oroab_default"
    apr 04 08:14:35 kubernetes-minion1 kubelet[2112]: E0404 08:14:35.401583    2112 pod_workers.go:113] Error syncing pod 167d0ead-fa29-11e5-bddc-064278000020, skipping: impossible: cannot find the mounted volumes for pod "wildfly-rc-oroab_default"
    apr 04 08:14:58 kubernetes-minion1 kubelet[2112]: E0404 08:14:58.076530    2112 manager.go:1867] Failed to create pod infra container: impossible: cannot find the mounted volumes for pod "wildfly-rc-1aimv_default"; Skipping pod "wildfly-rc-1aimv_default"
    apr 04 08:14:58 kubernetes-minion1 kubelet[2112]: E0404 08:14:58.078292    2112 pod_workers.go:113] Error syncing pod 9f772358-fa2b-11e5-bddc-064278000020, skipping: impossible: cannot find the mounted volumes for pod "wildfly-rc-1aimv_default"
    apr 04 08:15:23 kubernetes-minion1 kubelet[2112]: W0404 08:15:23.879138    2112 kubelet.go:1690] Orphaned volume "56257e55-fa2c-11e5-bddc-064278000020/default-token-0dci1" found, tearing down volume
    apr 04 08:15:28 kubernetes-minion1 kubelet[2112]: E0404 08:15:28.574574    2112 manager.go:1867] Failed to create pod infra container: impossible: cannot find the mounted volumes for pod "wildfly-rc-43b0f_default"; Skipping pod "wildfly-rc-43b0f_default"
    apr 04 08:15:28 kubernetes-minion1 kubelet[2112]: E0404 08:15:28.581467    2112 pod_workers.go:113] Error syncing pod 56257e55-fa2c-11e5-bddc-064278000020, skipping: impossible: cannot find the mounted volumes for pod "wildfly-rc-43b0f_default"

Could someone kindly help me?

Upvotes: 54

Views: 102730

Answers (7)

Rotem jackoby

Reputation: 22228

First thing to do

Restart your pod and see the first event that comes up.

There might be events that come after it, and those are usually not the main reason.

Trying to check all the events via kubectl or Lens/K9s might be harder and more time-consuming (and only works if the events are still visible).
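
A minimal sketch of that flow, using the pod name from the question (the controller will recreate the pod after deletion):

    # Delete the pod so that its replication controller recreates it
    kubectl delete pod wildfly-rc-6a0fr

    # Then watch the new pod's events, oldest first
    kubectl get events --sort-by='.lastTimestamp'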

Some common problems - quick checklist

  1. Resources (check both CPU and Memory).
  2. Wrong priority classes for daemon sets (described below).
  3. Taints/Tolerations.
  4. Node Affinity/Anti Affinity.
  5. Any kind of mismatch with the node pools in the cluster that are managed by the cluster autoscaler (NodePool is Karpenter's term).
  6. Volumes/PVC mount or attachment issues.
  7. The node that the pod is scheduled on is not ready due to networking issues, or the node could not join the cluster due to an IAM or other issue.

Are the pods controlled by a daemon set?

Check the priority class of all daemon sets - in most cases it should be the highest, system-node-critical, even if you have other workloads, managed by non-daemon-set controllers, that are more important to the system/business.

The reason is that the daemon set layer, in most cases, should be launched on all nodes, while the other types of workloads can be moved to another or a new node, because the cluster autoscaler will take them into account if resources are lacking. A sketch of setting that priority class follows below.
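
If a daemon set is missing that priority class, one way to set it could be a patch like the following (the daemon set name and namespace are placeholders, not taken from the question):

    # Hypothetical example: give a daemon set the highest node-level priority
    kubectl patch daemonset my-daemonset -n kube-system --type merge \
      -p '{"spec":{"template":{"spec":{"priorityClassName":"system-node-critical"}}}}'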

Upvotes: 0

Alex Robinson

Reputation: 13417

The Kubernetes application troubleshooting guide recommends running kubectl describe pod wildfly-rc-6a0fr, which should show why the pod hasn't been moved out of the pending state.
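
The full describe output is already shown in the question; the Events section at the end is usually the part that explains a Pending pod. If the output is noisy, one way to look at just that part might be:

    # Show only the Events section of the describe output (pod name from the question)
    kubectl describe pod wildfly-rc-6a0fr | grep -A 20 'Events:'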

Upvotes: 25

Fahima Mokhtari

Reputation: 2062

It can be due to a resource issue. Try increasing the capacities of the instances.
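
A rough way to check whether the nodes are actually short on CPU or memory (the second command assumes a metrics add-on such as metrics-server or, in older clusters, Heapster is installed):

    # Compare requested vs. allocatable resources on each node
    kubectl describe nodes | grep -A 8 'Allocated resources'

    # Current usage per node, if metrics are available
    kubectl top nodes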

Upvotes: 1

Putnik

Reputation: 6854

For me, neither kubectl get events nor kubectl describe provided enough info; what was helpful was kubectl logs pod_name (optionally with -n namespace_name).
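
For example (pod_name and namespace_name are placeholders):

    # Logs of the pod's current container, scoped to a namespace
    kubectl logs pod_name -n namespace_name

    # Logs of the previous container instance, if it crashed and restarted
    kubectl logs pod_name -n namespace_name --previous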

Upvotes: 1

Shambu

Reputation: 2852

Run the command below to get the events. This will show why the pod has not been scheduled (along with all other events).

    kubectl get events
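
On a busy cluster it may help to sort the events or filter them down to the affected pod (the pod name below is the one from the question; --field-selector requires a reasonably recent kubectl):

    # All events, sorted oldest first
    kubectl get events --sort-by='.lastTimestamp'

    # Only the events for the pending pod
    kubectl get events --field-selector involvedObject.name=wildfly-rc-6a0fr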

Upvotes: 70

Pankaj negi

Reputation: 57

This mostly happens when the pod is unable to connect to the master (API server). It's a quite common mistake when setting up an EKS cluster: people enable only public endpoint access and then face this issue. A few important aspects to cover:

  1. Enable private endpoint access, so that worker nodes and pods inside the VPC can connect (see the sketch after this list).

  2. Set up a security group and attach it to the EKS cluster during cluster setup. Ensure the pod and worker security groups are added to its ingress rules with port 443 access.
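
A hedged sketch of the first step with the AWS CLI (the cluster name and region are placeholders):

    # Enable the private API endpoint on an existing EKS cluster
    aws eks update-cluster-config \
        --region eu-west-1 \
        --name my-cluster \
        --resources-vpc-config endpointPrivateAccess=true,endpointPublicAccess=true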

Upvotes: 4

Jingpeng Wu

Reputation: 522

I got the same problem. I have micro instances as controllers and GPU instances for computation. I found that some DNS pods were pending, so I scaled up the controller node pool, and the pods started working.
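
A quick way to check for the same symptom (this assumes the standard kube-dns/CoreDNS deployment, which usually carries the k8s-app=kube-dns label):

    # See whether the cluster DNS pods are stuck in Pending
    kubectl get pods -n kube-system -l k8s-app=kube-dns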

Upvotes: 0
