Reputation: 3367
I'm setting up a Kubernetes cluster in AWS using Kops. I've got an nginx ingress controller, and I'm trying to use letsencrypt to setup tls. Right now I can't get my ingress up and running because my certificate challenges get this error:
Waiting for http-01 challenge propagation: failed to perform self check GET request 'http://critsit.io/.well-known/acme-challenge/[challengeId]': Get http://critsit.io/.well-known/acme-challenge/[challengeId]: EOF
I've got a LoadBalancer service that's taking public traffic, and the certificate issuer automatically creates 2 other services which don't have public IPs.
What am I doing wrong here? Is there some networking issue preventing the pods from finishing the acme flow? Or maybe something else?
Note: I have setup an A record in Route53 to direct traffic to the LoadBalancer.
> kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cm-acme-http-solver-m2q2g NodePort 100.69.86.241 <none> 8089:31574/TCP 3m34s
cm-acme-http-solver-zs2sd NodePort 100.67.15.149 <none> 8089:30710/TCP 3m34s
default-http-backend NodePort 100.65.125.130 <none> 80:32485/TCP 19h
kubernetes ClusterIP 100.64.0.1 <none> 443/TCP 19h
landing ClusterIP 100.68.115.188 <none> 3001/TCP 93m
nginx-ingress LoadBalancer 100.67.204.166 [myELB].us-east-1.elb.amazonaws.com 443:30596/TCP,80:30877/TCP 19h
Here's my ingress setup:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: critsit-ingress
namespace: default
annotations:
kubernetes.io/ingress.class: "nginx"
cert-manager.io/acme-challenge-type: "http01"
cert-manager.io/cluster-issuer: "letsencrypt-prod"
nginx.ingress.kubernetes.io/rewrite-target: /
spec:
tls:
- hosts:
- critsit.io
- app.critsit.io
secretName: letsencrypt-prod
rules:
- host: critsit.io
http:
paths:
- path: /
backend:
serviceName: landing
servicePort: 3001
And my certificate issuer:
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
name: letsencrypt-prod
spec:
acme:
# The ACME server URL
server: https://acme-v02.api.letsencrypt.org/directory
# Email address used for ACME registration
email: [email protected]
# Name of a secret used to store the ACME account private key
privateKeySecretRef:
name: letsencrypt-prod
# Enable the HTTP-01 challenge provider
solvers:
- http01:
ingress:
class: nginx
selector: {}
Update: I've noticed that my load balancer has all of the instances marked as OutOfOrder because they're failing health checks. I wonder if that's related to the issue.
Second update: I abandoned this route altogether, and rebuilt my networking/ingress system using Istio
Upvotes: 2
Views: 2015
Reputation: 51
first step to check is the ingress controller logs if you have something like broken header then check if you load balancer act as a proxy and terminates conncetion and then establish a new connection or not
in the first case , load balancer act as a proxy please add
use-proxy-protocol = "true"
in the ingress controller configuration
if not ensure that you set it to false
for more information refer to this doc : https://docs.digitalocean.com/support/how-do-i-enable-proxy-protocol-when-my-load-balancer-sends-requests-to-the-nginx-ingress-controller/
Upvotes: 0
Reputation: 13878
The error message you are getting can mean a wide variety of issues. However, there are few things you can check/do in order to make it work:
helm install my-nginx-ingress stable/nginx-ingress
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager --namespace cert-manager --version v0.15.0 --set installCRDs=true
Make sure your traffic allows HTTP or has HTTPS with a trusted cert.
Check if hairpin mode of your loadbalancer and make sure it is working.
Add: nginx.ingress.kubernetes.io/ssl-redirect: "false"
annotation to the Ingress rule. Wait a moment and see if valid cert will be created.
You can manually manually issue certificates in your Kubernetes cluster. To do so, please follow this guide.
The problem can solve itself in time. Currently if the self check fails, it updates the status information with the reason (like: self check failed) and than tries again later (to allow for propagation). This is an expected behavior.
This is an ongoing issue that is being tracked here and here.
Upvotes: 2