Mauricio

Reputation: 3079

Unable to find the server at www.googleapis.com only within GCP

I know there have been a few questions similar to this one, but in my case the issue only happens on GCP. We ran our services on AKS (Azure) for almost a year without a single occurrence. Right after we moved to GKE on GCP, a few requests from our Python application started failing with the error: Unable to find the server at www.googleapis.com. Most requests succeed, so the failures seem random. I have already tried increasing the TCP timeouts and the Minimum ports per VM instance setting on our Cloud NAT. We run the services on GKE, and a Cloud NAT gateway is set up for the network.

Is there any GCP-specific setting that could be causing this?

Upvotes: 0

Views: 233

Answers (1)

Mauricio

Reputation: 3079

I figured out what the issue was. The kube-dns pods were being scheduled onto nodes under high memory pressure, which caused kube-dns to be evicted and restarted. While it was down, some DNS lookups could not be resolved. To fix it, I created a node pool dedicated to the kube-system services, then edited the kube-system deployments and set a nodeSelector so they are always scheduled onto those nodes. Since then, the issue has not recurred.
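For anyone wanting to reproduce the nodeSelector step, here is a minimal sketch of the patch body, built in Python so it can be printed as JSON. It is an illustration rather than the exact change I applied: the pool name system-pool is hypothetical, and it assumes the pool is matched through the cloud.google.com/gke-nodepool label that GKE puts on its nodes.

```python
import json


def node_selector_patch(label_key: str, label_value: str) -> dict:
    """Build a patch body that pins a Deployment's pods to nodes carrying the given label."""
    return {
        "spec": {
            "template": {
                "spec": {
                    # Pods are only scheduled onto nodes that have this label.
                    "nodeSelector": {label_key: label_value}
                }
            }
        }
    }


# "system-pool" is a hypothetical node pool name; substitute your own.
patch = node_selector_patch("cloud.google.com/gke-nodepool", "system-pool")
print(json.dumps(patch))
```

The printed JSON can then be passed to something like kubectl -n kube-system patch deployment kube-dns --patch '<json>'. Note that GKE manages some kube-system add-ons itself, so it is worth checking afterwards that the selector is still present on the deployment.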

Upvotes: 1
