Reputation: 528
I am deploying a consul cluster on k8s version 1.9:
Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.6", GitCommit:"9f8ebd171479bec0ada837d7ee641dec2f8c6dd1", GitTreeState:"clean", BuildDate:"2018-03-21T15:21:50Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3+coreos.0", GitCommit:"f588569ed1bd4a6c986205dd0d7b04da4ab1a3b6", GitTreeState:"clean", BuildDate:"2018-02-10T01:42:55Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"linux/amd64"}
I am using hashicorp/consul-k8s:0.11.0 for syncCatalog.
Here is my SyncCatalog Deployment description:
Namespace: consul-presentation
CreationTimestamp: Sun, 29 Mar 2020 20:22:49 +0300
Labels: app=consul
chart=consul-helm
heritage=Tiller
release=consul-presentation
Annotations: deployment.kubernetes.io/revision=1
Selector: app=consul,chart=consul-helm,component=sync-catalog,release=consul-presentation
Replicas: 1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=consul
chart=consul-helm
component=sync-catalog
release=consul-presentation
Annotations: consul.hashicorp.com/connect-inject=false
Service Account: consul-presentation-consul-sync-catalog
Containers:
consul-sync-catalog:
Image: hashicorp/consul-k8s:0.11.0
Port: <none>
Command:
/bin/sh
-ec
consul-k8s sync-catalog \
-k8s-default-sync=true \
-consul-domain=consul \
-k8s-write-namespace=${NAMESPACE} \
-node-port-sync-type=ExternalFirst \
-log-level=debug \
-add-k8s-namespace-suffix \
Liveness: http-get http://:8080/health/ready delay=30s timeout=5s period=5s #success=1 #failure=3
Readiness: http-get http://:8080/health/ready delay=10s timeout=5s period=5s #success=1 #failure=5
Environment:
HOST_IP: (v1:status.hostIP)
NAMESPACE: (v1:metadata.namespace)
CONSUL_HTTP_ADDR: http://consul-presentation.test:8500
Mounts: <none>
Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Available False MinimumReplicasUnavailable
Progressing True ReplicaSetUpdated
OldReplicaSets: <none>
NewReplicaSet: consul-presentation-consul-sync-catalog-66b5756486 (1/1 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 1m deployment-controller Scaled up replica set consul-presentation-consul-sync-catalog-66b5756486 to 1
And here is the description of the unhealthy pod:
kubectl describe pod consul-presentation-consul-sync-catalog-66b5756486-2h2s6 -n consul-presentation
Name: consul-presentation-consul-sync-catalog-66b5756486-2h2s6
Namespace: consul-presentation
Node: k8s-k4.test/10.99.1.10
Start Time: Sun, 29 Mar 2020 20:22:49 +0300
Labels: app=consul
chart=consul-helm
component=sync-catalog
pod-template-hash=2261312042
release=consul-presentation
Annotations: consul.hashicorp.com/connect-inject=false
Status: Running
IP: 10.195.5.53
Controlled By: ReplicaSet/consul-presentation-consul-sync-catalog-66b5756486
Containers:
consul-sync-catalog:
Container ID: docker://4f0c65a7be5f9b07cae51d798c532a066fb0784b28a7610dfe4f1a15a2fa5a7c
Image: hashicorp/consul-k8s:0.11.0
Image ID: docker-pullable://hashicorp/consul-k8s@sha256:8be1598ad3e71323509727162f20ed9c140c8cf09d5fa3dc351aad03ec2b0b70
Port: <none>
Command:
/bin/sh
-ec
consul-k8s sync-catalog \
-k8s-default-sync=true \
-consul-domain=consul \
-k8s-write-namespace=${NAMESPACE} \
-node-port-sync-type=ExternalFirst \
-log-level=debug \
-add-k8s-namespace-suffix \
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 2
Started: Sun, 29 Mar 2020 20:28:19 +0300
Finished: Sun, 29 Mar 2020 20:28:56 +0300
Ready: False
Restart Count: 6
Liveness: http-get http://:8080/health/ready delay=30s timeout=5s period=5s #success=1 #failure=3
Readiness: http-get http://:8080/health/ready delay=10s timeout=5s period=5s #success=1 #failure=5
Environment:
HOST_IP: (v1:status.hostIP)
NAMESPACE: consul-presentation (v1:metadata.namespace)
CONSUL_HTTP_ADDR: http://consul-presentation.test:8500
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from consul-presentation-consul-sync-catalog-token-jxw26 (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
consul-presentation-consul-sync-catalog-token-jxw26:
Type: Secret (a volume populated by a Secret)
SecretName: consul-presentation-consul-sync-catalog-token-jxw26
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7m default-scheduler Successfully assigned consul-presentation-consul-sync-catalog-66b5756486-2h2s6 to k8s-k4.test
Normal SuccessfulMountVolume 7m kubelet, k8s-k4.test MountVolume.SetUp succeeded for volume "consul-presentation-consul-sync-catalog-token-jxw26"
Normal Pulled 6m (x2 over 7m) kubelet, k8s-k4.test Container image "hashicorp/consul-k8s:0.11.0" already present on machine
Normal Created 6m (x2 over 7m) kubelet, k8s-k4.test Created container
Normal Started 6m (x2 over 7m) kubelet, k8s-k4.test Started container
Normal Killing 6m kubelet, k8s-k4.test Killing container with id docker://consul-sync-catalog:Container failed liveness probe.. Container will be killed and recreated.
Warning Unhealthy 6m (x4 over 6m) kubelet, k8s-k4.test Liveness probe failed: HTTP probe failed with statuscode: 500
Warning Unhealthy 6m (x13 over 7m) kubelet, k8s-k4.test Readiness probe failed: HTTP probe failed with statuscode: 500
Warning BackOff 2m (x6 over 3m) kubelet, k8s-k4.test Back-off restarting failed container
I have tried the default setup as described in this Helm chart: https://github.com/hashicorp/consul-helm
The only difference is that I use ClusterIP services and ingresses, which should not have anything to do with the health of a pod.
Any ideas?
Upvotes: 0
Views: 827
Reputation: 46
The failing liveness probe is telling you that the sync-catalog process cannot talk to Consul; that is how the liveness/readiness probe is implemented in consul-k8s.
It looks like the Consul address you're providing to the sync-catalog process is http://consul-presentation.test:8500. Is this an external Consul server? Is it running and reachable from the pods on Kubernetes?
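A quick way to verify that from inside the cluster is to query the Consul status API from a throwaway pod. A minimal sketch, assuming a busybox image (with wget) can be pulled:

# Run a one-off pod in the question's namespace and hit the Consul HTTP API.
kubectl run consul-check --rm -it --restart=Never \
  --image=busybox:1.31 -n consul-presentation -- \
  wget -qO- http://consul-presentation.test:8500/v1/status/leader
# A reachable, healthy server prints its leader address, e.g. "10.99.1.10:8300".
# A timeout or "connection refused" here would explain the failing probes.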
Also, are you deploying Consul clients on k8s? In the official Helm chart, sync-catalog talks to the Consul clients (deployed as a daemonset) via hostIP.
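For comparison, with clients enabled the chart points sync-catalog at the node-local agent through the downward API, so the resulting pod environment looks roughly like this (a sketch, not a verbatim chart excerpt):

Environment:
  HOST_IP:           (v1:status.hostIP)
  CONSUL_HTTP_ADDR:  http://$(HOST_IP):8500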
Upvotes: 3
Reputation: 528
When using k8s ingresses with ClusterIP services, the Consul address should be set to the ingress host, since that is what is actually exposed, without the port. The corresponding part of the k8s deployment should then look like this:
Liveness: http-get http://:8080/health/ready delay=30s timeout=5s period=5s #success=1 #failure=3
Readiness: http-get http://:8080/health/ready delay=10s timeout=5s period=5s #success=1 #failure=5
Environment:
HOST_IP: (v1:status.hostIP)
NAMESPACE: (v1:metadata.namespace)
CONSUL_HTTP_ADDR: http://{INGRESS HOST}
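One way to apply that change without editing the manifest by hand is kubectl set env (a sketch; the ingress hostname below is a placeholder for your own):

# Point sync-catalog at the ingress host (replace consul.example.test).
kubectl -n consul-presentation set env deployment/consul-presentation-consul-sync-catalog \
  CONSUL_HTTP_ADDR=http://consul.example.test

The deployment then rolls out a new pod, whose probes should pass once the address is reachable.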
Upvotes: 0