Archean

Reputation: 11

Cilium 1.12.1 not working on some nodes

Hi, I have an 11-node Kubernetes cluster with Cilium 1.12.1 (kubeProxyReplacement=strict) built on bare metal in our data center. Pods on 4 of the nodes (node05-node08) have issues communicating with pods or services that are not on the same node; the other 7 nodes don't have this problem. I can ping other pods' IPs, but when I telnet to a port, the packets never seem to arrive.

All 11 nodes run the same OS version and the same kernel, and the cluster was deployed with Kubespray; I made sure the 11 nodes had the same software environment as far as possible. (I'm not sure whether it has anything to do with hardware, but the 4 problematic nodes have gigabit NICs, while the others all have 10-gigabit NICs.)

This is the node list:

❯ kubectl get nodes -o wide
NAME       STATUS   ROLES           AGE   VERSION   INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION    CONTAINER-RUNTIME
master01   Ready    control-plane   39h   v1.24.4   10.252.55.22     <none>        CentOS Linux 7 (Core)   5.10.0-1.0.0.17   containerd://1.6.8
master02   Ready    control-plane   39h   v1.24.4   10.252.54.44     <none>        CentOS Linux 7 (Core)   5.10.0-1.0.0.17   containerd://1.6.8
master03   Ready    control-plane   39h   v1.24.4   10.252.55.39     <none>        CentOS Linux 7 (Core)   5.10.0-1.0.0.17   containerd://1.6.8
node05     Ready    <none>          39h   v1.24.4   10.252.34.27     <none>        CentOS Linux 7 (Core)   5.10.0-1.0.0.17   containerd://1.6.8
node06     Ready    <none>          39h   v1.24.4   10.252.33.44     <none>        CentOS Linux 7 (Core)   5.10.0-1.0.0.17   containerd://1.6.8
node07     Ready    <none>          39h   v1.24.4   10.252.33.52     <none>        CentOS Linux 7 (Core)   5.10.0-1.0.0.17   containerd://1.6.8
node08     Ready    <none>          39h   v1.24.4   10.252.33.45     <none>        CentOS Linux 7 (Core)   5.10.0-1.0.0.17   containerd://1.6.8
node01     Ready    <none>          39h   v1.24.4   10.252.144.206   <none>        CentOS Linux 7 (Core)   5.10.0-1.0.0.17   containerd://1.6.8
node02     Ready    <none>          39h   v1.24.4   10.252.145.13    <none>        CentOS Linux 7 (Core)   5.10.0-1.0.0.17   containerd://1.6.8
node03     Ready    <none>          39h   v1.24.4   10.252.145.163   <none>        CentOS Linux 7 (Core)   5.10.0-1.0.0.17   containerd://1.6.8
node04     Ready    <none>          39h   v1.24.4   10.252.145.226   <none>        CentOS Linux 7 (Core)   5.10.0-1.0.0.17   containerd://1.6.8

This is what happens in a pod on node05 when communicating with an nginx pod running on master01:

# ping works fine

bash-5.1# ping 10.233.64.103
PING 10.233.64.103 (10.233.64.103) 56(84) bytes of data.
64 bytes from 10.233.64.103: icmp_seq=1 ttl=63 time=0.214 ms
64 bytes from 10.233.64.103: icmp_seq=2 ttl=63 time=0.148 ms
--- 10.233.64.103 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1026ms
rtt min/avg/max/mdev = 0.148/0.181/0.214/0.033 ms

# curl not working

bash-5.1# curl 10.233.64.103
curl: (28) Failed to connect to 10.233.64.103 port 80 after 3069 ms: Operation timed out

# hubble observe logs (hubble observe --to-ip 10.233.64.103 -f):
Sep  6 03:15:16.100: cilium-test/testubuntu-g2gv6 (ID:9268) -> cilium-test/nginx-deployment-bpvnx (ID:4221) to-overlay FORWARDED (ICMPv4 EchoRequest)
Sep  6 03:15:16.100: cilium-test/testubuntu-g2gv6 (ID:9268) -> cilium-test/nginx-deployment-bpvnx (ID:4221) to-endpoint FORWARDED (ICMPv4 EchoRequest)
Sep  6 03:15:22.026: cilium-test/testubuntu-g2gv6:33722 (ID:9268) -> cilium-test/nginx-deployment-bpvnx:80 (ID:4221) to-overlay FORWARDED (TCP Flags: SYN)

This is what happens in a pod on node04 when communicating with the same nginx pod:

# ping works fine

bash-5.1# ping 10.233.64.103
PING 10.233.64.103 (10.233.64.103) 56(84) bytes of data.
64 bytes from 10.233.64.103: icmp_seq=1 ttl=63 time=2.33 ms
64 bytes from 10.233.64.103: icmp_seq=2 ttl=63 time=2.30 ms

# curl works fine as well

bash-5.1# curl  10.233.64.103
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

# hubble observe logs (hubble observe --to-ip 10.233.64.103 -f):
Sep  6 03:16:24.808: cilium-test/testubuntu-wcwfg (ID:9268) -> cilium-test/nginx-deployment-bpvnx (ID:4221) to-overlay FORWARDED (ICMPv4 EchoRequest)
Sep  6 03:16:24.810: cilium-test/testubuntu-wcwfg (ID:9268) -> cilium-test/nginx-deployment-bpvnx (ID:4221) to-endpoint FORWARDED (ICMPv4 EchoRequest)
Sep  6 03:16:27.043: cilium-test/testubuntu-wcwfg:57802 (ID:9268) -> cilium-test/nginx-deployment-bpvnx:80 (ID:4221) to-overlay FORWARDED (TCP Flags: SYN)
Sep  6 03:16:27.045: cilium-test/testubuntu-wcwfg:57802 (ID:9268) -> cilium-test/nginx-deployment-bpvnx:80 (ID:4221) to-endpoint FORWARDED (TCP Flags: SYN)
Sep  6 03:16:27.045: cilium-test/testubuntu-wcwfg:57802 (ID:9268) -> cilium-test/nginx-deployment-bpvnx:80 (ID:4221) to-overlay FORWARDED (TCP Flags: ACK)
Sep  6 03:16:27.045: cilium-test/testubuntu-wcwfg:57802 (ID:9268) -> cilium-test/nginx-deployment-bpvnx:80 (ID:4221) to-overlay FORWARDED (TCP Flags: ACK, PSH)
Sep  6 03:16:27.047: cilium-test/testubuntu-wcwfg:57802 (ID:9268) -> cilium-test/nginx-deployment-bpvnx:80 (ID:4221) to-endpoint FORWARDED (TCP Flags: ACK)
Sep  6 03:16:27.047: cilium-test/testubuntu-wcwfg:57802 (ID:9268) -> cilium-test/nginx-deployment-bpvnx:80 (ID:4221) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Sep  6 03:16:27.048: cilium-test/testubuntu-wcwfg:57802 (ID:9268) -> cilium-test/nginx-deployment-bpvnx:80 (ID:4221) to-overlay FORWARDED (TCP Flags: ACK, FIN)
Sep  6 03:16:27.050: cilium-test/testubuntu-wcwfg:57802 (ID:9268) -> cilium-test/nginx-deployment-bpvnx:80 (ID:4221) to-endpoint FORWARDED (TCP Flags: ACK, FIN)
Sep  6 03:16:27.050: cilium-test/testubuntu-wcwfg:57802 (ID:9268) -> cilium-test/nginx-deployment-bpvnx:80 (ID:4221) to-overlay FORWARDED (TCP Flags: ACK)
Sep  6 03:16:27.051: cilium-test/testubuntu-wcwfg:57802 (ID:9268) -> cilium-test/nginx-deployment-bpvnx:80 (ID:4221) to-endpoint FORWARDED (TCP Flags: ACK)

This is the cilium-health status, which also shows the port connectivity issue on the 4 nodes:

❯ kubectl exec -it -n kube-system ds/cilium -- cilium-health status
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), clean-cilium-state (init)
Probe time:   2022-09-06T03:10:24Z
Nodes:
  node01 (localhost):
    Host connectivity to 10.252.144.206:
      ICMP to stack:   OK, RTT=341.295µs
      HTTP to agent:   OK, RTT=100.729µs
    Endpoint connectivity to 10.233.67.53:
      ICMP to stack:   OK, RTT=334.224µs
      HTTP to agent:   OK, RTT=163.289µs
  master01:
    Host connectivity to 10.252.55.22:
      ICMP to stack:   OK, RTT=1.994728ms
      HTTP to agent:   OK, RTT=1.610932ms
    Endpoint connectivity to 10.233.64.235:
      ICMP to stack:   OK, RTT=2.100332ms
      HTTP to agent:   OK, RTT=2.489126ms
  master02:
    Host connectivity to 10.252.54.44:
      ICMP to stack:   OK, RTT=2.33033ms
      HTTP to agent:   OK, RTT=2.34166ms
    Endpoint connectivity to 10.233.65.225:
      ICMP to stack:   OK, RTT=2.101561ms
      HTTP to agent:   OK, RTT=2.067012ms
  master03:
    Host connectivity to 10.252.55.39:
      ICMP to stack:   OK, RTT=1.688641ms
      HTTP to agent:   OK, RTT=1.593428ms
    Endpoint connectivity to 10.233.66.74:
      ICMP to stack:   OK, RTT=2.210915ms
      HTTP to agent:   OK, RTT=1.725555ms
  node05:
    Host connectivity to 10.252.34.27:
      ICMP to stack:   OK, RTT=2.383001ms
      HTTP to agent:   OK, RTT=2.48362ms
    Endpoint connectivity to 10.233.70.87:
      ICMP to stack:   OK, RTT=2.194843ms
      HTTP to agent:   Get "http://10.233.70.87:4240/hello": dial tcp 10.233.70.87:4240: connect: connection timed out
  node06:
    Host connectivity to 10.252.33.44:
      ICMP to stack:   OK, RTT=2.091932ms
      HTTP to agent:   OK, RTT=1.724729ms
    Endpoint connectivity to 10.233.71.119:
      ICMP to stack:   OK, RTT=1.984056ms
      HTTP to agent:   Get "http://10.233.71.119:4240/hello": dial tcp 10.233.71.119:4240: connect: connection timed out
  node07:
    Host connectivity to 10.252.33.52:
      ICMP to stack:   OK, RTT=2.055482ms
      HTTP to agent:   OK, RTT=2.037437ms
    Endpoint connectivity to 10.233.72.47:
      ICMP to stack:   OK, RTT=1.853614ms
      HTTP to agent:   Get "http://10.233.72.47:4240/hello": dial tcp 10.233.72.47:4240: connect: connection timed out
  node08:
    Host connectivity to 10.252.33.45:
      ICMP to stack:   OK, RTT=2.461315ms
      HTTP to agent:   OK, RTT=2.369003ms
    Endpoint connectivity to 10.233.74.247:
      ICMP to stack:   OK, RTT=2.097029ms
      HTTP to agent:   Get "http://10.233.74.247:4240/hello": dial tcp 10.233.74.247:4240: connect: connection timed out
  node02:
    Host connectivity to 10.252.145.13:
      ICMP to stack:   OK, RTT=372.787µs
      HTTP to agent:   OK, RTT=168.915µs
    Endpoint connectivity to 10.233.73.98:
      ICMP to stack:   OK, RTT=360.354µs
      HTTP to agent:   OK, RTT=287.224µs
  node03:
    Host connectivity to 10.252.145.163:
      ICMP to stack:   OK, RTT=363.072µs
      HTTP to agent:   OK, RTT=216.652µs
    Endpoint connectivity to 10.233.68.73:
      ICMP to stack:   OK, RTT=312.153µs
      HTTP to agent:   OK, RTT=304.981µs
  node04:
    Host connectivity to 10.252.145.226:
      ICMP to stack:   OK, RTT=375.121µs
      HTTP to agent:   OK, RTT=185.484µs
    Endpoint connectivity to 10.233.69.140:
      ICMP to stack:   OK, RTT=403.752µs
      HTTP to agent:   OK, RTT=277.517µs

Any suggestions on where I should start troubleshooting?

Upvotes: 1

Views: 2840

Answers (2)

pchaigno

Reputation: 13133

It's hard to say for sure without at least the full config and a Cilium sysdump, but I suspect the issue is that some of your NIC drivers don't support XDP.

"I'm not sure whether it has anything to do with hardware, but the 4 problematic nodes have gigabit NICs, while the others all have 10-gigabit NICs."

That suggests an issue with the NICs or their drivers. The only feature in Cilium that depends on the NIC driver is XDP Acceleration.

If you have enabled that feature and the four problematic nodes have a NIC driver that doesn't support XDP (or doesn't fully support it), then it could explain why they fail to communicate with other nodes.
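
To check this, you could compare a working node with a problematic one. As a rough sketch (the interface name eth0 is an assumption, use your actual uplink interface):

# Check whether XDP acceleration is enabled in the agent:
kubectl -n kube-system exec ds/cilium -- cilium status --verbose | grep -i xdp

# On a problematic node, check which driver the NIC uses:
ethtool -i eth0

If XDP acceleration is enabled (the loadBalancer.acceleration Helm value, if I recall correctly), disabling it cluster-wide and retesting would confirm whether that's the cause.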

Upvotes: 0

Hardoman

Reputation: 292

Since version 1.12, they changed the routing heavily. Try enabling legacy host routing.

In your helm_values.yaml (if you are using Helm to deploy), you should add:

bpf:
  hostLegacyRouting: true

This setting configures whether direct routing mode should route traffic via the host stack (true) or directly and more efficiently out of BPF (false), if the kernel supports it. The latter has the implication that it will also bypass netfilter in the host namespace.

You can read more about BPF host routing in the official docs. Pay attention to the compatibility of the node OS and kernel with BPF.
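
If you deployed Cilium with Helm, something like the following should apply it (a sketch, assuming the release is named cilium and installed in kube-system; adjust to your setup):

helm upgrade cilium cilium/cilium --version 1.12.1 \
  --namespace kube-system \
  --reuse-values \
  --set bpf.hostLegacyRouting=true

# Restart the agents and check the reported routing mode:
kubectl -n kube-system rollout restart ds/cilium
kubectl -n kube-system exec ds/cilium -- cilium status | grep -i "host routing"

After the restart, cilium status should report "Host Routing: Legacy" instead of "BPF".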

Upvotes: -1
