Reputation: 1053
From time to time all my pods restart, and I'm not sure how to figure out why it's happening. Is there somewhere in Google Cloud where I can get that information, or a kubectl command to run? It happens every couple of months or so, maybe less frequently than that.
Upvotes: 2
Views: 3140
Reputation: 274
It's also a good idea to check your cluster and node-pool operations:
gcloud container operations list
kubectl get nodes
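For example (a minimal sketch; the zone is only a placeholder for wherever your cluster actually runs):
# List recent cluster/node-pool operations; look for TYPE values such as
# UPGRADE_MASTER, UPGRADE_NODES or AUTO_REPAIR_NODES around the time of the restarts.
gcloud container operations list --zone=us-central1-a
# Nodes that are much younger than the cluster usually mean the node pool was
# upgraded or recreated, which restarts every pod that was running on those nodes.
kubectl get nodes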
Please note that you have to fill in your cluster and node-pool names in the Cloud Logging queries below.
Control plane (master) upgraded:
resource.type="gke_cluster"
log_id("cloudaudit.googleapis.com/activity")
protoPayload.methodName:("UpdateCluster" OR "UpdateClusterInternal")
(protoPayload.metadata.operationType="UPGRADE_MASTER"
OR protoPayload.response.operationType="UPGRADE_MASTER")
resource.labels.cluster_name=""
Node-pool upgraded:
resource.type="gke_nodepool"
log_id("cloudaudit.googleapis.com/activity")
protoPayload.methodName:("UpdateNodePool" OR "UpdateClusterInternal")
protoPayload.metadata.operationType="UPGRADE_NODES"
resource.labels.cluster_name=""
resource.labels.nodepool_name=""
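If you prefer the command line over the Logs Explorer, the same filter can be passed to gcloud logging read (a sketch, assuming a cluster named my-cluster with a node pool named default-pool; substitute your own names and widen --freshness as needed):
gcloud logging read '
  resource.type="gke_nodepool"
  log_id("cloudaudit.googleapis.com/activity")
  protoPayload.metadata.operationType="UPGRADE_NODES"
  resource.labels.cluster_name="my-cluster"
  resource.labels.nodepool_name="default-pool"
' --limit=10 --freshness=90d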
Upvotes: 3
Reputation: 3301
Use the methods below to check the reason for the pod restarts:
Run kubectl describe deployment <deployment_name>
and kubectl describe pod <pod_name>;
the Events section at the end of the output contains the relevant information. For example:
# Events:
# Type Reason Age From Message
# ---- ------ ---- ---- -------
# Warning BackOff 40m kubelet, gke-xx Back-off restarting failed container
# ..
Here you can see that the pod was restarted because its container kept failing and went into back-off ("Back-off restarting failed container"), so that is the particular issue to troubleshoot.
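If you are not sure which pod or deployment to look at, a couple of cluster-wide views can narrow it down (a sketch using standard kubectl selectors and field paths):
# Recent warning events across all namespaces, oldest first
kubectl get events -A --field-selector type=Warning --sort-by=.lastTimestamp
# Pods ordered by how often their first container has restarted
kubectl get pods -A --sort-by='.status.containerStatuses[0].restartCount'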
Check the logs using: kubectl logs <pod_name>
To get the logs of the previous (restarted) container, use the --previous flag, like this:
kubectl logs your_pod_name --previous
You can also write a final message to /dev/termination-log, and it will show up in the pod's status, as described in the docs.
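A sketch of how to read that message back once the container has terminated (the field path below is the standard containerStatuses path; your_pod_name is a placeholder):
kubectl get pod your_pod_name \
  -o go-template='{{range .status.containerStatuses}}{{.lastState.terminated.message}}{{end}}'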
Attaching a troubleshooting doc for reference.
Upvotes: 2