glux

Reputation: 532

Google Cloud Load Balancer - 502 - Unmanaged instance group failing health checks

I currently have an HTTPS load balancer whose frontend, backend, and health check all use port 443, serving a single nginx host.

When navigating directly to the host via browser the page loads correctly with valid SSL certs.

When I access the site through the load balancer IP, I receive a 502 - Server error message. The Google logs show "failed_to_pick_backend" errors at the load balancer, and the instance is failing its health checks.

Some digging around leads me to these two links: https://cloudplatform.googleblog.com/2015/07/Debugging-Health-Checks-in-Load-Balancing-on-Google-Compute-Engine.html

https://github.com/coreos/bugs/issues/1195

Issue #1 - Not sure if google-address-manager is running on the server (RHEL 7). I do not see an entry for the HTTPS load balancer IP in the routes. The Google SDK is installed. This is a Google-provided image and if I update the IP address in the console, it also gets updated on the host. How do I check if google-address-manager is running on RHEL7?

[root@server]# ip route ls table local type local scope host
10.212.2.40 dev eth0 proto kernel src 10.212.2.40
127.0.0.0/8 dev lo proto kernel src 127.0.0.1
127.0.0.1 dev lo proto kernel src 127.0.0.1

Output of all google services

[root@server]# systemctl list-unit-files
google-accounts-daemon.service                enabled
google-clock-skew-daemon.service              enabled
google-instance-setup.service                 enabled
google-ip-forwarding-daemon.service           enabled
google-network-setup.service                  enabled
google-shutdown-scripts.service               enabled
google-startup-scripts.service                enabled

Issue #2: Not receiving a 200 OK response. The certificate is valid and the same on both the LB and server. When running curl against the app server I receive this response.

[email protected]  curl -I https://app-server.com
curl: (60) SSL certificate problem: unable to get local issuer certificate
More details here: https://curl.haxx.se/docs/sslcerts.html

Thoughts?

Upvotes: 2

Views: 1359

Answers (2)

glux

Reputation: 532

A couple of updates and lessons learned:

I found out that "google-address-manager" is deprecated and has been replaced by "google-ip-forwarding-daemon", which is running.

[root@server ~]# sudo service google-ip-forwarding-daemon status
Redirecting to /bin/systemctl status google-ip-forwarding-daemon.service
 google-ip-forwarding-daemon.service - Google Compute Engine IP Forwarding Daemon
   Loaded: loaded (/usr/lib/systemd/system/google-ip-forwarding-daemon.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2017-12-22 20:45:27 UTC; 17h ago
 Main PID: 1150 (google_ip_forwa)
   CGroup: /system.slice/google-ip-forwarding-daemon.service
           └─1150 /usr/bin/python /usr/bin/google_ip_forwarding_daemon

There is an active firewall rule allowing IP ranges 130.211.0.0/22 and 35.191.0.0/16 for port 443. The target is also properly set.
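For anyone who does not have that rule yet, here is a rough sketch of what creating it could look like. The rule name, network, and target tag are placeholders I made up, not values from this setup; only the two source ranges and port 443 come from the text above. The script only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# RULE_NAME, NETWORK and TARGET_TAG are placeholders, not values from this setup.
RULE_NAME="${RULE_NAME:-allow-lb-health-check}"
NETWORK="${NETWORK:-default}"
TARGET_TAG="${TARGET_TAG:-https-backend}"

# Print the gcloud command instead of running it, so it can be checked first.
firewall_cmd() {
    echo "gcloud compute firewall-rules create $RULE_NAME" \
         "--network=$NETWORK" \
         "--allow=tcp:443" \
         "--source-ranges=130.211.0.0/22,35.191.0.0/16" \
         "--target-tags=$TARGET_TAG"
}

firewall_cmd
# To apply the rule, run the printed command by hand.
```

The target tag has to match a network tag on the backend instance, otherwise the rule exists but never applies to it.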

Finally, the health check was using the default "/" path. The developers had put authentication in front of the site during development, so when I bypassed the SSL certificate error, curl returned a 401 Unauthorized. This was the root cause of the issue: the health check never received a 200. To remedy it, we modified the nginx basic authentication configuration to disable authentication for a new route (e.g. /health).
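As an illustration of that kind of change, the sketch below writes a small nginx snippet to a file. This is not our exact configuration: the file name, the plain-text "ok" body, and the assumption that basic auth is set at the server level are mine. The `auth_basic off;` directive is what exempts the route from authentication.

```shell
#!/bin/sh
# Write a hypothetical nginx snippet that exempts /health from basic auth.
# The SNIPPET path and the response body are assumptions for illustration.
SNIPPET="${SNIPPET:-./health-check.conf}"

cat > "$SNIPPET" <<'EOF'
# Include this inside the existing HTTPS server { } block.
location = /health {
    auth_basic off;          # do not apply the basic auth set at server level
    access_log off;          # keep health probes out of the access log
    default_type text/plain;
    return 200 "ok\n";
}
EOF

echo "wrote $SNIPPET"
# After including the snippet, validate and reload (requires nginx on the host):
#   nginx -t && systemctl reload nginx
```

If the health route is proxied to the application instead of answered by nginx directly, the `return` line would be replaced by the usual proxy directives, with `auth_basic off;` kept in place.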

Once the nginx configuration was updated and the health check path was changed to the new /health route, we received valid 200 responses. The health check then reported the instance as healthy, and the LB began passing traffic through.
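To check a path by hand before pointing the health check at it, something like the following can help. It is a minimal sketch that assumes, as far as I understand the HTTP(S) health check behavior, that only a 200 counts as healthy; the function names are made up for this example.

```shell
#!/bin/sh
# Classify an HTTP status code the way the health check is assumed to:
# only 200 passes, anything else (including 401) marks the backend unhealthy.
classify_status() {
    case "$1" in
        200) echo "healthy" ;;
        401) echo "unhealthy (authentication required)" ;;
        *)   echo "unhealthy (status $1)" ;;
    esac
}

# Probe a URL and print the verdict. -k skips certificate verification,
# which is acceptable only for this kind of manual diagnosis.
probe() {
    code=$(curl -k -s -o /dev/null -w '%{http_code}' --max-time 5 "$1")
    classify_status "$code"
}

# Example, using the host name from the question:
#   probe https://app-server.com/health
```

Running `probe` against "/" in our setup would have shown the 401 straight away, without having to read the load balancer logs.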

Upvotes: 0

Gal Ben-Haim

Reputation: 17803

You should add firewall rules for the health check service (https://cloud.google.com/compute/docs/load-balancing/health-checks#health_check_source_ips_and_firewall_rules) and make sure that your backend service listens on the load balancer IP (the easiest way is to bind to 0.0.0.0). This is definitely true for an internal load balancer; I am not sure about HTTPS with an external IP.

Upvotes: 1
