brgsousa

Reputation: 333

Istio: Injected pod replicas on different nodes can't communicate with Istio

Using Istio 1.9.2

kubectl get nodes
NAME     STATUS   ROLES                  AGE    VERSION
k8s-d0   Ready    control-plane,master   7d2h   v1.20.5
k8s-d1   Ready    <none>                 7d2h   v1.20.5
k8s-d2   Ready    <none>                 7d2h   v1.20.5

kubectl get pods -n istio-system -o wide                   
NAME                                    READY   STATUS    RESTARTS   AGE     IP             NODE     NOMINATED NODE   READINESS GATES
istio-egressgateway-54658cd5f5-thj76    1/1     Running   0          7m6s    10.244.1.44    k8s-d1   <none>           <none>
istio-ingressgateway-7cc49dcd99-tfkgs   1/1     Running   0          7m6s    10.244.1.43    k8s-d1   <none>           <none>
istiod-db9f9f86-j2mjv                   1/1     Running   0          7m10s   10.244.2.100   k8s-d2   <none>           <none>

kubectl get pods -n sauron--desenvolvimento -o wide
NAME                      READY   STATUS    RESTARTS   AGE     IP             NODE     NOMINATED NODE   READINESS GATES
sauron-66dc8ff67d-kgsk2   1/2     Running   0          49s     10.244.2.101   k8s-d2   <none>           <none>
sauron-66dc8ff67d-rs8lv   2/2     Running   0          5m27s   10.244.1.46    k8s-d1   <none>           <none>

kubectl get services -n istio-system        
NAME                   TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)                                                                      AGE  
istio-egressgateway    ClusterIP      10.99.11.179     <none>        80/TCP,443/TCP,15443/TCP                                                     9m12s
istio-ingressgateway   LoadBalancer   10.100.239.142   <pending>     15021:30460/TCP,80:30453/TCP,443:31822/TCP,15012:30062/TCP,15443:31932/TCP   9m12s
istiod                 ClusterIP      10.104.84.82     <none>        15010/TCP,15012/TCP,443/TCP,15014/TCP                                        9m16s

When Kubernetes schedules Envoy-injected pods on a node other than the one running istio-ingressgateway, the Envoy sidecar logs the errors below and the pod stays unhealthy (Readiness probe failed: Get "http://10.244.2.101:15021/healthz/ready": dial tcp 10.244.2.101:15021: connect: connection refused):

2021-04-13T17:19:31.674447Z warn ca ca request failed, starting attempt 1 in 101.413465ms
2021-04-13T17:19:31.776221Z warn ca ca request failed, starting attempt 2 in 214.499657ms
2021-04-13T17:19:31.799366Z warning envoy config StreamAggregatedResources gRPC config stream closed: 14, connection error: desc = "transport: Error while dialing dial tcp 10.104.84.82:15012: i/o timeout"
2021-04-13T17:19:31.991122Z warn ca ca request failed, starting attempt 3 in 383.449139ms
2021-04-13T17:19:32.375004Z warn ca ca request failed, starting attempt 4 in 724.528493ms
2021-04-13T17:19:43.196955Z warn Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2021-04-13T17:19:45.195718Z warn Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2021-04-13T17:19:47.194598Z warn Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2021-04-13T17:19:49.194990Z warn Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2021-04-13T17:19:51.195400Z warn Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2021-04-13T17:19:52.214848Z warning envoy config StreamAggregatedResources gRPC config stream closed: 14, connection error: desc = "transport: Error while dialing dial tcp 10.104.84.82:15012: i/o timeout"
2021-04-13T17:19:52.674981Z warn sds failed to warm certificate: failed to generate workload certificate: create certificate: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 10.104.84.82:15012: i/o timeout"
2021-04-13T17:19:53.195269Z warn Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected

When I open a bash shell in the pod and run curl, the connection is refused:

curl 10.104.84.82:15012
curl: (7) Failed to connect to 10.104.84.82 port 15012: Connection refused

Upvotes: 2

Views: 5094

Answers (2)

Ming CHEN

Reputation: 41

I think the fix from the other answer, kubectl patch svc istio-ingressgateway -n istio-system -p '{"spec":{"externalIPs":["__YOUR_IP__"]}}', works only when the cluster has a load balancer provider (for example on a public cloud such as Google Cloud).

If it does not, as in a private cluster with no load balancer configured, change the service TYPE from LoadBalancer to NodePort.
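
The type change described above could be sketched as follows, assuming the default service name and namespace shown in the question. The helper name nodeport_patch and the APPLY switch are hypothetical, added only so the command can be previewed before it touches the cluster:

```shell
#!/bin/sh
# Build the JSON patch that switches the service type to NodePort.
# (nodeport_patch is a hypothetical helper, not part of kubectl or Istio.)
nodeport_patch() {
  printf '{"spec":{"type":"%s"}}' "NodePort"
}

# Service name and namespace as they appear in the question.
SVC=istio-ingressgateway
NS=istio-system

if [ "${APPLY:-0}" = "1" ]; then
  # Apply the change to the cluster.
  kubectl patch svc "$SVC" -n "$NS" -p "$(nodeport_patch)"
else
  # Preview only: print the command that would run.
  echo "kubectl patch svc $SVC -n $NS -p '$(nodeport_patch)'"
fi
```

Run it once without APPLY to inspect the command, then with APPLY=1 to patch the service.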

Upvotes: 0

brgsousa

Reputation: 333

I found out that the problem was with the istio-ingressgateway service, which didn't have an external IP. After setting one with the command below, it worked fine:

kubectl patch svc istio-ingressgateway -n istio-system -p '{"spec":{"externalIPs":["__YOUR_IP__"]}}'
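
As a rough sketch of how the placeholder might be filled in and the result checked: the helper external_ip_patch and the address 192.0.2.10 are assumptions for illustration (192.0.2.10 is a documentation-only address, so substitute an IP that actually routes to one of your nodes). The script only prints commands and does not modify the cluster:

```shell
#!/bin/sh
# Build the externalIPs patch for a given address.
# (external_ip_patch is a hypothetical helper, not part of kubectl or Istio.)
external_ip_patch() {
  printf '{"spec":{"externalIPs":["%s"]}}' "$1"
}

# Placeholder address; pass a real one as the first argument.
IP="${1:-192.0.2.10}"

# Preview the patch command from the answer with the address filled in.
echo "kubectl patch svc istio-ingressgateway -n istio-system -p '$(external_ip_patch "$IP")'"

# Read-only checks to run after patching: the first prints the configured
# external IPs, the second lists the application pods and their READY state.
echo "kubectl get svc istio-ingressgateway -n istio-system -o jsonpath='{.spec.externalIPs}'"
echo "kubectl get pods -n sauron--desenvolvimento -o wide"
```

If the fix took effect as described, the second check would presumably show both sauron replicas as 2/2.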

Upvotes: 4
