elli

Reputation: 649

AWS EKS nodes creation failure

I have a cluster in AWS created by these instructions.

Then I tried to add nodes in this cluster according to this documentation.

The nodes appear to fail to be created: the vpc-cni and coredns add-ons both report a health issue of type insufficientNumberOfReplicas ("The add-on is unhealthy because it doesn't have the desired number of replicas.").

The status of the pods, from kubectl get pods -n kube-system:

NAME                       READY   STATUS             RESTARTS   AGE
aws-node-9cwkd             0/1     CrashLoopBackOff   13         42m
aws-node-h4qjt             0/1     CrashLoopBackOff   13         42m
aws-node-jrn5x             0/1     CrashLoopBackOff   13         43m
coredns-745979c988-25fcc   0/1     Pending            0          120m
coredns-745979c988-qvh7h   0/1     Pending            0          120m
kube-proxy-2bmlq           1/1     Running            0          42m
kube-proxy-hjcrw           1/1     Running            0          43m
kube-proxy-j9r9n           1/1     Running            0          42m

The logs of the aws-node-9cwkd pod:

{"level":"info","ts":"2021-11-30T14:11:14.156Z","caller":"entrypoint.sh","msg":"Validating env variables ..."}
{"level":"info","ts":"2021-11-30T14:11:14.157Z","caller":"entrypoint.sh","msg":"Install CNI binaries.."}
{"level":"info","ts":"2021-11-30T14:11:14.177Z","caller":"entrypoint.sh","msg":"Starting IPAM daemon in the background ... "}
{"level":"info","ts":"2021-11-30T14:11:14.179Z","caller":"entrypoint.sh","msg":"Checking for IPAM connectivity ... "}
{"level":"info","ts":"2021-11-30T14:11:16.189Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2021-11-30T14:11:18.198Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2021-11-30T14:11:20.205Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2021-11-30T14:11:22.215Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2021-11-30T14:11:24.226Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}

Running kubectl describe pod aws-node-h4qjt -n kube-system shows the following error:

Readiness probe failed: {"level":"info","ts":"2021-11-30T14:11:07.145Z","caller":"/usr/local/go/src/runtime/proc.go:225","msg":"timeout: failed to connect service \":50051\" within 5s"}

Any help getting the nodes created successfully would be highly appreciated.

Upvotes: 15

Views: 29041

Answers (1)

Daniel Robinson

Reputation: 465

It's most likely a problem with the node service role. You can get more information if you exec into the pod and view ipamd.log:

kubectl exec -it aws-node-9cwkd -n kube-system -- /bin/bash
cat /host/var/log/aws-routed-eni/ipamd.log
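
The log can be long, so it may help to show only the error-level entries. A minimal sketch, assuming ipamd.log uses the same one-JSON-object-per-line format as the sample in this answer; the helper name ipamd_errors is made up for illustration:

```shell
# Hypothetical helper: keep only error-level lines from JSON-lines input on stdin.
# Assumes each entry contains the literal text "level":"error", as in the sample log line.
ipamd_errors() {
  grep '"level":"error"'
}

# Usage against a live cluster (pod name taken from the question):
# kubectl exec aws-node-9cwkd -n kube-system -- \
#   cat /host/var/log/aws-routed-eni/ipamd.log | ipamd_errors
```

Filtering on the client side avoids depending on which tools are installed inside the aws-node container.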

Here's an example of the error I saw when I hit the same problem:

{"level":"error","ts":"2021-12-02T13:27:51.464Z","caller":"ipamd/ipamd.go:444","msg":"Failed to call ec2:DescribeNetworkInterfaces for [eni-0c01bd25ae6999ed5]: UnauthorizedOperation: You are not authorized to perform this operation.\n\tstatus code: 403, request id: 0438b84b-8052-4f31-9d63-c2ff7512f131"}

In my case, I had to add the AmazonEKS_CNI_Policy policy to the node IAM role.

https://docs.aws.amazon.com/eks/latest/userguide/cni-iam-role.html
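
For reference, attaching the policy from the AWS CLI might look roughly like the sketch below. The role name my-eks-node-role is a placeholder (substitute the IAM role your node group actually uses), the function names are invented for illustration, and the policy ARN assumes the standard commercial AWS partition.

```shell
# ARN of the AWS managed policy (assumes the standard "aws" partition).
CNI_POLICY_ARN="arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"

# Hypothetical helper: attach the CNI policy to the node IAM role named in $1.
attach_cni_policy() {
  aws iam attach-role-policy --role-name "$1" --policy-arn "$CNI_POLICY_ARN"
}

# Hypothetical helper: list the managed policies attached to the role, to confirm the change.
show_role_policies() {
  aws iam list-attached-role-policies --role-name "$1"
}

# Usage (my-eks-node-role is a placeholder):
# attach_cni_policy my-eks-node-role
# show_role_policies my-eks-node-role
```

Since the aws-node pods are already in CrashLoopBackOff, they should pick up the new permissions on their next restart; watching kubectl get pods -n kube-system afterwards shows whether they reach Running.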

Upvotes: 18
