Reputation: 622
I have Cilium installed in my test cluster (AWS, with the AWS CNI deleted because we use the Cilium CNI plugin), and whenever I delete the cilium namespace (or run helm delete), the hubble-ui pod gets stuck in the Terminating state. The pod has a couple of containers, but I notice that one container named backend exits with code 137 when the namespace is deleted, leaving the hubble-ui pod, and the namespace the pod is in, stuck in Terminating.
From what I have read, containers exit with 137 when they try to use more memory than they have been allocated. In my test cluster, no resource limits have been defined (spec.containers[*].resources = {}) on the pod or the namespace. No error message is shown as the reason for the failure. I am using the Cilium Helm chart v1.12.3, but this issue was happening even before we upgraded to that chart version.
I would like to know what is causing this, as it is breaking my CI pipeline. How can I ensure a graceful exit of the backend container (as opposed to clearing finalizers)?
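For reference, this is roughly how I am checking the exit code while the pod is stuck (the namespace is cilium in my setup, and I am assuming the k8s-app=hubble-ui label that the chart applies):

kubectl -n cilium get pods                              # hubble-ui is stuck in Terminating
kubectl -n cilium describe pod -l k8s-app=hubble-ui     # backend shows Last State: Terminated, Exit Code: 137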
Upvotes: 0
Views: 969
Reputation: 622
So it appears there is a bug in the backend application/container of the hubble-ui service. Kubernetes sends the container a SIGTERM signal and it fails to respond. I verified this by getting a shell into the container and sending it SIGTERM and SIGINT, the signals the application appears to listen for in order to exit, and it does not respond to either one.
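Roughly what I ran to check this (it assumes the Deployment is named hubble-ui in the cilium namespace; the backend process is PID 1 in its container and the image ships /bin/sh, which the preStop hook below also relies on):

kubectl -n cilium exec -it deploy/hubble-ui -c backend -- /bin/sh
# then, inside the container:
kill -TERM 1   # ignored, the process keeps running
kill -INT 1    # ignored as well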
Next, I added a preStop hook like the one below, and the pod now terminates cleanly:
...
lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "kill -SIGILL 1; true"]
Upvotes: 0