Paul

Reputation: 906

k3s - networking between pods not working

I'm struggling with cross-communication between pods, even though ClusterIP services are set up for them. All the pods are on the same master node and in the same namespace. In summary:

$ kubectl get pods -o wide
NAME                         READY   STATUS    RESTARTS   AGE    IP           NODE          NOMINATED NODE   READINESS GATES
nginx-744f4df6df-rxhph       1/1     Running   0          136m   10.42.0.31   raspberrypi   <none>           <none>
nginx-2-867f4f8859-csn48     1/1     Running   0          134m   10.42.0.32   raspberrypi   <none>           <none>

$ kubectl get svc -o wide
NAME             TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE    SELECTOR
nginx-service    ClusterIP   10.43.155.201   <none>        80/TCP                       136m   app=nginx
nginx-service2   ClusterIP   10.43.182.138   <none>        85/TCP                       134m   app=nginx-2

I can't curl http://nginx-service2:85 from within the nginx container (or vice versa), although I validated that this works on my Docker Desktop installation:

# docker desktop
root@nginx-7dc45fbd74-7prml:/# curl http://nginx-service2:85
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

# k3s
root@nginx-744f4df6df-rxhph:/# curl http://nginx-service2.pwk3spi-vraptor:85
curl: (6) Could not resolve host: nginx-service2.pwk3spi-vraptor
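To narrow down whether the failure is DNS or the service itself, a quick triage sketch (using the pod name and ClusterIP from the listings above; note that nslookup may not be present in the stock nginx image, so these are best-effort checks):

```shell
# Check what resolver the pod is pointed at; on a default k3s
# install this should be the kube-dns ClusterIP (10.43.0.10).
kubectl exec -it nginx-744f4df6df-rxhph -- cat /etc/resolv.conf

# Try resolving the service name through cluster DNS.
kubectl exec -it nginx-744f4df6df-rxhph -- nslookup nginx-service2

# Bypass DNS entirely: if curl against the ClusterIP works,
# the problem is name resolution, not the service or the network.
kubectl exec -it nginx-744f4df6df-rxhph -- curl -s http://10.43.182.138:85
```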

After googling the issue (and please correct me if I'm wrong), it seems like a CoreDNS issue, because the CoreDNS logs show timeout errors:

$ kubectl get pods -n kube-system
NAME                                     READY   STATUS      RESTARTS   AGE
helm-install-traefik-qr2bd               0/1     Completed   0          153d
metrics-server-7566d596c8-nnzg2          1/1     Running     59         148d
svclb-traefik-kjbbr                      2/2     Running     60         153d
traefik-758cd5fc85-wzjrn                 1/1     Running     20         62d
local-path-provisioner-6d59f47c7-4hvf2   1/1     Running     72         148d
coredns-7944c66d8d-gkdp4                 1/1     Running     0          3m47s

$ kubectl logs coredns-7944c66d8d-gkdp4 -n kube-system
.:53
[INFO] plugin/reload: Running configuration MD5 = 1c648f07b77ab1530deca4234afe0d03
CoreDNS-1.6.9
linux/arm, go1.14.1, 1766568
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:50482->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:34160->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:53485->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:46642->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:55329->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:44471->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:49182->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:54082->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:48151->192.168.8.109:53: i/o timeout
[ERROR] plugin/errors: 2 1898797220.1916943194. HINFO: read udp 10.42.0.38:48599->192.168.8.109:53: i/o timeout

where people recommended changing the Corefile's forward directive to point at the host's DNS server:

... other Corefile stuff

forward . <host server IP>

... other Corefile stuff

or adding search domains and nameservers to the pods' /etc/resolv.conf:

search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.42.0.38
nameserver 192.168.8.1
nameserver fe80::266:19ff:fea7:85e7%wlan0

however, I didn't find that these solutions worked.
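For reference, on k3s the forward directive lives in the coredns ConfigMap (editable with kubectl -n kube-system edit configmap coredns). A minimal sketch of what the suggested change looks like in context; the upstream address here is an assumption (the router at 192.168.8.1 from the nameserver lines above) and must be a resolver actually reachable from the node:

```
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
    }
    # upstream resolver for non-cluster names; replace with one
    # the node can actually reach
    forward . 192.168.8.1
    cache 30
}
```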

Details for reference:

$ kubectl get nodes -o wide
NAME          STATUS   ROLES    AGE    VERSION        INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION   CONTAINER-RUNTIME
raspberrypi   Ready    master   153d   v1.18.9+k3s1   192.168.8.109   <none>        Raspbian GNU/Linux 10 (buster)   5.10.9-v7l+      containerd://1.3.3-k3s2

$ kubectl get svc -n kube-system -o wide
NAME                 TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)                      AGE    SELECTOR
kube-dns             ClusterIP      10.43.0.10      <none>          53/UDP,53/TCP,9153/TCP       153d   k8s-app=kube-dns
metrics-server       ClusterIP      10.43.205.8     <none>          443/TCP                      153d   k8s-app=metrics-server
traefik-prometheus   ClusterIP      10.43.222.138   <none>          9100/TCP                     153d   app=traefik,release=traefik
traefik              LoadBalancer   10.43.249.133   192.168.8.109   80:31222/TCP,443:32509/TCP   153d   app=traefik,release=traefik

$ kubectl get ep kube-dns -n kube-system
NAME       ENDPOINTS                                     AGE
kube-dns   10.42.0.38:53,10.42.0.38:9153,10.42.0.38:53   153d

No idea where I'm going wrong, whether I'm focusing on the wrong thing, or how to continue. Any help would be much appreciated.

Upvotes: 3

Views: 6521

Answers (4)

Hieu nguyen

Reputation: 21

In my case, following the Rancher docs:

The nodes need to be able to reach other nodes over UDP port 8472 when Flannel VXLAN is used or over UDP ports 51820 and 51821 (when using IPv6) when Flannel Wireguard backend is used.

I just opened the UDP port in Oracle Cloud and it works.

Upvotes: 0

Linh Nguyen

Reputation: 3890

For anyone who doesn't want to waste 3 hours like I did: on CentOS with k3s, you need to disable the firewall for the services to call each other.

https://rancher.com/docs/k3s/latest/en/advanced/#additional-preparation-for-red-hat-centos-enterprise-linux

It is recommended to turn off firewalld:

systemctl disable firewalld --now

If enabled, it is required to disable nm-cloud-setup and reboot the node:

systemctl disable nm-cloud-setup.service nm-cloud-setup.timer
reboot

After I disabled it, the services were able to call each other through their DNS names in my config.

I'm still looking for a better option than disabling the firewall, but that depends on the developers of the k3s project.

Upvotes: 2

Paul

Reputation: 906

When all else fails... go back to the manual. I tried finding the 'issue' in all the wrong places, when I just had to follow Rancher's installation documentation for k3s (sigh).

Rancher's documentation is very good (you just have to actually follow it): it states that when installing k3s on Raspbian Buster environments,

check version:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Raspbian
Description:    Raspbian GNU/Linux 10 (buster)
Release:        10
Codename:       buster

you need to change to legacy iptables by running (link):

sudo iptables -F
sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
sudo reboot

Note: when switching iptables, do it directly on the Pi, not via SSH; you will be kicked out.

After doing this, all my services were happy and could curl each other from within the containers via their defined ClusterIP service names.
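To confirm the switch took effect before rebooting workloads, a quick check (on iptables 1.8+, the version string names the active backend):

```shell
# Show which alternative is currently selected for iptables;
# after the change it should point at iptables-legacy.
sudo update-alternatives --display iptables

# The version banner includes "(legacy)" or "(nf_tables)"
# depending on the active backend.
iptables --version
```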

Upvotes: 4

antaxify

Reputation: 342

Is there a reason why you try to curl this address:

curl http://nginx-service2.pwk3spi-vraptor:85

Shouldn't this be just:

curl http://nginx-service2:85
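For what it's worth, both forms are valid once cluster DNS works: within the same namespace the short name resolves via the pod's search list, and nginx-service2.pwk3spi-vraptor resolves through the svc.cluster.local search suffix (assuming pwk3spi-vraptor is the namespace). A small sketch of how the fully qualified name is composed, with the default cluster.local domain assumed:

```shell
# Compose the fully qualified service DNS name from its parts.
svc=nginx-service2
ns=pwk3spi-vraptor          # namespace, taken from the question
domain=svc.cluster.local    # default k3s cluster domain suffix
fqdn="${svc}.${ns}.${domain}"
echo "$fqdn"
```

So the original curl was not wrong in form; it failed because DNS resolution itself was broken.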

Upvotes: -1
