Reputation: 69
I have installed a Kubernetes cluster on Azure with kubespray 2.13.2.
But after installing some pods of my data platform components, I noticed that pods running on the same node cannot reach each other through a service.
For example, my Presto coordinator has to access the Hive metastore. Here are the services in my namespace:
kubectl get svc -n ai-developer
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
metastore ClusterIP 10.233.12.66 <none> 9083/TCP 4h53m
The Hive metastore service is called metastore, through which my Presto coordinator has to reach the Hive metastore pod.
Let's see the following pods in my namespace:
kubectl get po -n ai-developer -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
metastore-5544f95b6b-cqmkx 1/1 Running 0 9h 10.233.69.20 minion-3 <none> <none>
presto-coordinator-796c4c7bcd-7lngs 1/1 Running 0 5h32m 10.233.69.29 minion-3 <none> <none>
presto-worker-0 1/1 Running 0 5h32m 10.233.67.52 minion-1 <none> <none>
presto-worker-1 1/1 Running 0 5h32m 10.233.70.24 minion-4 <none> <none>
presto-worker-2 1/1 Running 0 5h31m 10.233.68.24 minion-2 <none> <none>
presto-worker-3 1/1 Running 0 5h31m 10.233.71.27 minion-0 <none> <none>
Note that the Hive metastore pod metastore-5544f95b6b-cqmkx is running on node minion-3, where the Presto coordinator pod presto-coordinator-796c4c7bcd-7lngs is also running.
I have configured the Hive metastore URL thrift://metastore:9083 in the Hive catalog properties of the Presto coordinator.
When a Presto pod runs on the same node as the Hive metastore pod, it cannot reach the metastore, but pods running on other nodes reach the metastore through the service just fine.
This is just one example; I have run into several other cases like it so far.
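One way to confirm the symptom (a sketch; it assumes nc is available in the Presto images, which may not be the case) is to probe the service both from the pod on the same node as the metastore and from a pod on another node:

```shell
# From the coordinator pod (same node as the metastore), probe the service port
kubectl -n ai-developer exec presto-coordinator-796c4c7bcd-7lngs -- \
  nc -vz -w 2 metastore 9083
# From a worker pod on a different node, the same probe should succeed
kubectl -n ai-developer exec presto-worker-0 -- \
  nc -vz -w 2 metastore 9083
```

If the first probe times out while the second succeeds, that matches the same-node-only failure described above.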
kubenet is installed as the network plugin in my Kubernetes cluster, which was installed with kubespray on Azure:
/usr/local/bin/kubelet --logtostderr=true --v=2 --node-ip=10.240.0.4 --hostname-override=minion-3 --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --config=/etc/kubernetes/kubelet-config.yaml --kubeconfig=/etc/kubernetes/kubelet.conf --pod-infra-container-image=k8s.gcr.io/pause:3.1 --runtime-cgroups=/systemd/system.slice --hairpin-mode=promiscuous-bridge --network-plugin=kubenet --cloud-provider=azure --cloud-config=/etc/kubernetes/cloud_config
Any idea?
Upvotes: 2
Views: 1942
Reputation: 888
I was using flannel as the CNI on Kubernetes v1.30.1, and it turned out that flannel needs masquerading to be enabled, while the kube-proxy default is masqueradeAll: false.
Changing it to true and restarting the kube-proxy pods solved the problem (AND I FINALLY GOT TO SOLVE IT AT 4 AM!!!).
The steps:
kubectl -n kube-system edit cm kube-proxy to set masqueradeAll: true
kubectl -n kube-system delete pod -l k8s-app=kube-proxy to restart all kube-proxy pods
Upvotes: 1
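If you prefer not to open an editor, the same two steps can be done non-interactively (a sketch; it assumes the flag appears in the kube-proxy ConfigMap exactly as masqueradeAll: false):

```shell
# Flip masqueradeAll from false to true in the kube-proxy ConfigMap
kubectl -n kube-system get cm kube-proxy -o yaml \
  | sed 's/masqueradeAll: false/masqueradeAll: true/' \
  | kubectl apply -f -
# Restart all kube-proxy pods so they pick up the new config
kubectl -n kube-system delete pod -l k8s-app=kube-proxy
```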
Reputation: 1310
In my case it was the br_netfilter module, which did not survive a reboot, so the vxlan overlay did not work.
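To check for the module and make it survive reboots, something like this should work (a sketch; the modules-load.d path is the conventional systemd location, and the filename is arbitrary):

```shell
# Check whether br_netfilter is currently loaded; load it if not
lsmod | grep br_netfilter || sudo modprobe br_netfilter
# Persist the module across reboots via systemd's modules-load.d
echo br_netfilter | sudo tee /etc/modules-load.d/br_netfilter.conf
```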
Upvotes: 0
Reputation: 69
After I changed the kube-proxy mode from ipvs to iptables, it works fine!
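A sketch of that change (it assumes the mode is stored in the kube-proxy ConfigMap, as it is in kubeadm/kubespray clusters):

```shell
# Check which proxy mode kube-proxy is configured with
kubectl -n kube-system get cm kube-proxy -o yaml | grep 'mode:'
# Edit the ConfigMap and set "mode: iptables"
kubectl -n kube-system edit cm kube-proxy
# Restart the kube-proxy pods so the change takes effect
kubectl -n kube-system delete pod -l k8s-app=kube-proxy
```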
Upvotes: 0
Reputation: 11
Please check whether the default policy of the iptables FORWARD chain is ACCEPT. In my case, changing the FORWARD chain default policy from DROP to ACCEPT made communication between nodes work again.
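A quick way to check and change it on the affected node (a sketch; requires root):

```shell
# Show the FORWARD chain default policy; the first line of output looks
# like "Chain FORWARD (policy DROP)" or "(policy ACCEPT)"
sudo iptables -L FORWARD -n | head -n 1
# Flip the default policy to ACCEPT
sudo iptables -P FORWARD ACCEPT
```

Note that this change is not persistent across reboots on its own; whatever set the policy to DROP (often Docker or a firewall service) may set it again.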
Upvotes: 1
Reputation: 6765
You might be able to overcome this issue by using the fully qualified name Kubernetes provides for resolving service IPs, as described in the k8s docs.
In your case it will probably mean changing your thrift://metastore:9083
property to thrift://metastore.ai-developer.svc.cluster.local:9083
(assuming, of course, your cluster domain is configured to be cluster.local
).
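You can verify that the fully qualified name resolves from inside a pod before changing the property (pod and namespace names taken from the question; this assumes nslookup is available in the image):

```shell
# Resolve the service FQDN from inside the coordinator pod
kubectl -n ai-developer exec presto-coordinator-796c4c7bcd-7lngs -- \
  nslookup metastore.ai-developer.svc.cluster.local
```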
Upvotes: 0