ZECTBynmo

Reputation: 3367

Kubernetes: Why are my acme challenges getting EOF/no response?

I'm setting up a Kubernetes cluster in AWS using Kops. I've got an nginx ingress controller, and I'm trying to use Let's Encrypt to set up TLS. Right now I can't get my ingress up and running because my certificate challenges fail with this error:

Waiting for http-01 challenge propagation: failed to perform self check GET request 'http://critsit.io/.well-known/acme-challenge/[challengeId]': Get http://critsit.io/.well-known/acme-challenge/[challengeId]: EOF

I've got a LoadBalancer service that's taking public traffic, and the certificate issuer automatically creates 2 other services which don't have public IPs.

What am I doing wrong here? Is there some networking issue preventing the pods from finishing the acme flow? Or maybe something else?

Note: I have set up an A record in Route53 to direct traffic to the LoadBalancer.

> kubectl get services
NAME                        TYPE           CLUSTER-IP       EXTERNAL-IP                                                               PORT(S)                      AGE
cm-acme-http-solver-m2q2g   NodePort       100.69.86.241    <none>                                                                    8089:31574/TCP               3m34s
cm-acme-http-solver-zs2sd   NodePort       100.67.15.149    <none>                                                                    8089:30710/TCP               3m34s
default-http-backend        NodePort       100.65.125.130   <none>                                                                    80:32485/TCP                 19h
kubernetes                  ClusterIP      100.64.0.1       <none>                                                                    443/TCP                      19h
landing                     ClusterIP      100.68.115.188   <none>                                                                    3001/TCP                     93m
nginx-ingress               LoadBalancer   100.67.204.166   [myELB].us-east-1.elb.amazonaws.com                                       443:30596/TCP,80:30877/TCP   19h

Here's my ingress setup:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: critsit-ingress
  namespace: default
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/acme-challenge-type: "http01"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  tls:
    - hosts:
      - critsit.io
      - app.critsit.io
      secretName: letsencrypt-prod
  rules:
    - host: critsit.io
      http:
        paths:
          - path: /
            backend:
              serviceName: landing
              servicePort: 3001

And my certificate issuer:

apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    # The ACME server URL
    server: https://acme-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: [email protected]
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-prod
    # Enable the HTTP-01 challenge provider
    solvers:
    - http01:
        ingress:
          class:  nginx
      selector: {}

Update: I've noticed that my load balancer has all of the instances marked as OutOfService because they're failing health checks. I wonder if that's related to the issue.

Second update: I abandoned this route altogether, and rebuilt my networking/ingress system using Istio

Upvotes: 2

Views: 2015

Answers (2)

zakariamansori

Reputation: 51

The first step is to check the ingress controller logs. If you see something like "broken header", check whether your load balancer acts as a proxy, that is, whether it terminates the incoming connection and then establishes a new one to the controller.

In the first case, where the load balancer acts as a proxy, add

use-proxy-protocol: "true"

to the ingress controller configuration (its ConfigMap).

If it does not, make sure the option is set to "false".

For more information, refer to this doc: https://docs.digitalocean.com/support/how-do-i-enable-proxy-protocol-when-my-load-balancer-sends-requests-to-the-nginx-ingress-controller/
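
As a rough sketch of where that setting lives: the nginx ingress controller reads it from the ConfigMap it was started with (the one named in its --configmap argument). The ConfigMap name and namespace below are assumptions and depend on how the controller was installed, so substitute your own.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  # Assumed name and namespace: use the ConfigMap referenced by the
  # controller's --configmap argument in your cluster.
  name: nginx-configuration
  namespace: ingress-nginx
data:
  # "true" only when the load balancer actually sends the PROXY protocol
  # header; otherwise keep this "false".
  use-proxy-protocol: "true"
```

The two sides have to agree. On a classic AWS ELB, proxy protocol is typically switched on with the service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*" annotation on the LoadBalancer Service. If only one side has it enabled, requests tend to fail in the way described above.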


Upvotes: 0

Wytrzymały Wiktor

Reputation: 13878

The error message you are getting can point to a wide variety of issues. However, there are a few things you can check or do to make it work:

  1. Delete the Ingress, the certificates and the cert-manager fully. After that add them all back to make sure it installs clean. Sometimes stale certs or bad/multi Ingress pathing might be the issue. For example you can use Helm:

helm install my-nginx-ingress stable/nginx-ingress
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager --namespace cert-manager --version v0.15.0 --set installCRDs=true

  2. Make sure your traffic allows HTTP or has HTTPS with a trusted cert.

  3. Check the hairpin mode of your load balancer and make sure it is working.

  4. Add the nginx.ingress.kubernetes.io/ssl-redirect: "false" annotation to the Ingress rule. Wait a moment and see if a valid cert is created.

  5. You can manually issue certificates in your Kubernetes cluster. To do so, please follow this guide.

  6. The problem can solve itself in time. Currently, if the self check fails, cert-manager updates the status information with the reason (like: self check failed) and then tries again later (to allow for propagation). This is expected behavior.
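
For the ssl-redirect suggestion in the list above, here is a minimal sketch of how the annotation might look on the Ingress from the question. Only the metadata is shown; the spec section is assumed to stay exactly as posted.

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: critsit-ingress
  namespace: default
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    # Added: stop nginx from redirecting the plain-HTTP challenge request
    # to HTTPS while no valid certificate exists yet.
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
# spec: unchanged from the question (tls hosts, rules, backend)
```

Once the certificate has been issued, the annotation can be removed again if you want HTTP requests redirected to HTTPS.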

This is an ongoing issue that is being tracked here and here.

Upvotes: 2
