Reputation: 453
My health checks fail with the following setup.
nginx.conf
user root;
worker_processes auto;
error_log /var/log/nginx/error.log warn;

events {
    worker_connections 1024;
}

http {
    server {
        listen 80;
        server_name subdomain.domain.com;
        auth_basic "Restricted";
        auth_basic_user_file /etc/nginx/.htpasswd;
    }
    server {
        listen 80;
        auth_basic off;
    }
    server {
        listen 2222;
        auth_basic off;
        location /healthz {
            return 200;
        }
    }
}
Dockerfile
FROM nginx:alpine
COPY index.html /usr/share/nginx/html/index.html
VOLUME /usr/share/nginx/html
COPY /server/nginx.conf /etc/nginx/
COPY /server/htpasswd /etc/nginx/.htpasswd
CMD ["nginx", "-g", "daemon off;"]
EXPOSE 80
EXPOSE 2222
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: my-namespace
  labels:
    app: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: gcr.io/GOOGLE_CLOUD_PROJECT/my-app
          ports:
            - containerPort: 80
            - containerPort: 2222
          livenessProbe:
            httpGet:
              path: /healthz
              port: 2222
          readinessProbe:
            httpGet:
              path: /healthz
              port: 2222
It definitely works when I delete the server_name line in nginx.conf and delete the second server block. Could this be an issue with the ingress/load balancer? I do not know how long it takes to update (yesterday I watched a healthy pod go unhealthy after a few minutes). I am running on Google Kubernetes Engine (GKE) with Google's own ingress controller (not the NGINX ingress controller!).
What am I doing wrong?
Upvotes: 1
Views: 5823
Reputation: 453
The issue was that GKE's load balancer performs its own health checks. By default these request / and expect an HTTP 200 in return. The load balancer only picks up a different path when the liveness/readiness probes in the deployment/pod declare one.
The load balancer is provisioned once the ingress YAML is applied. While the load balancer is running, changes in the deployment or ingress that affect it are not picked up. This means I had to delete the load balancer first and then apply the deployment, service and ingress YAMLs (the ingress then provisions the load balancer again). Instead of deleting the load balancer, one can also enter the correct health check path manually in the Cloud Console (and wait a few minutes).
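I have not tried it myself, but GKE also supports declaring the health check path explicitly with a BackendConfig resource attached to the Service via the cloud.google.com/backend-config annotation, which should avoid both the manual edit and re-creating the load balancer. A minimal sketch (the resource name is a placeholder):

apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: my-app-backendconfig   # placeholder name
  namespace: my-namespace
spec:
  healthCheck:
    type: HTTP
    requestPath: /healthz   # path the load balancer health check should probe
    port: 80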
Since the load balancer appears to run health checks against every open port, I removed port 2222 and instead added a location /healthz with auth_basic off to each server block listening on port 80 in nginx.
See: https://cloud.google.com/load-balancing/docs/health-check-concepts and https://stackoverflow.com/a/61222826/2534357 and https://stackoverflow.com/a/38511357/2534357
New nginx.conf
user root;
worker_processes auto;
error_log /var/log/nginx/error.log warn;

events {
    worker_connections 1024;
}

http {
    server {
        listen 80;
        server_name subdomain1.domain.com;
        root /usr/share/nginx/html;
        index index.html;
        auth_basic "Restricted";
        auth_basic_user_file /etc/nginx/.htpasswd_subdomain1;
        location /healthz {
            auth_basic off;
            allow all;
            return 200;
        }
    }
    server {
        listen 80;
        server_name subdomain2.domain.com;
        root /usr/share/nginx/html;
        index index.html;
        auth_basic "Restricted";
        auth_basic_user_file /etc/nginx/.htpasswd_subdomain2;
        location /healthz {
            auth_basic off;
            allow all;
            return 200;
        }
    }
    server {
        listen 80;
        server_name domain.com www.domain.com;
        root /usr/share/nginx/html;
        index index.html;
        auth_basic "Restricted";
        auth_basic_user_file /etc/nginx/.htpasswd_domain;
        location /healthz {
            auth_basic off;
            allow all;
            return 200;
        }
    }
    ## next block probably not necessary
    server {
        listen 80;
        auth_basic off;
        location /healthz {
            return 200;
        }
    }
}
my new deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: my-namespace
  labels:
    app: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: gcr.io/GOOGLE_CLOUD_PROJECT/my-app
          ports:
            - containerPort: 80
          livenessProbe:
            httpGet:
              path: /healthz
              port: 80
          readinessProbe:
            httpGet:
              path: /healthz
              port: 80
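For completeness, the service and ingress YAMLs mentioned above are along these lines (names are placeholders, not my exact manifests; the backend-config annotation is only needed if you use the BackendConfig sketched earlier):

apiVersion: v1
kind: Service
metadata:
  name: my-app-service          # placeholder name
  namespace: my-namespace
  annotations:
    # only needed if using the BackendConfig approach sketched above
    cloud.google.com/backend-config: '{"default": "my-app-backendconfig"}'
spec:
  type: NodePort                # GKE ingress requires NodePort (or container-native NEGs)
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress          # placeholder name
  namespace: my-namespace
spec:
  rules:
    - host: subdomain1.domain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app-service
                port:
                  number: 80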
Upvotes: 5