Rob

Reputation: 7216

Waiting for HTTP-01 challenge propagation: wrong status code '404', expected '200'

I just set up cert-manager on Kubernetes in GCP, but when I check my logs I get this error:

cert-manager/challenges "msg"="propagation check failed" "error"="wrong status code '404', expected '200'" "dnsName"="api.lumiwealth.com" "resource_kind"="Challenge" "resource_name"="test-certificate-h4m8c-1804713970-576085961" "resource_namespace"="backend" "resource_version"="v1" "type"="HTTP-01"

From what I can tell, the issue is that the ingress that gets created does not have access to the external internet. I confirmed this by running the following in a terminal:

curl http://api.lumiwealth.com/.well-known/acme-challenge/vhoLg-lNAgXAwEJlknfBbRlYuKuHBakgeG_d40c09Zk

Which returns:

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Error</title>
</head>
<body>
<pre>Cannot GET /.well-known/acme-challenge/vhoLg-lNAgXAwEJlknfBbRlYuKuHBakgeG_d40c09Zk</pre>
</body>
</html>

Here are my YAML files:

Issuer:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata: 
  name: letsencrypt-prod
spec: 
  acme: 
    email: "[email protected]"
    privateKeySecretRef: 
      name: letsencrypt-prod
    server: "https://acme-v02.api.letsencrypt.org/directory"
    solvers:
      - http01:
          ingress:
            class: ingress-gce

Test certificate:

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: test-certificate
  namespace: backend
spec:
  secretName: certificate-test
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
  - api.lumiwealth.com

When I kubectl apply the certificate, it creates an ingress in GCP that looks like this (but it doesn't seem to have network access? I'm not sure how it could have gotten the IP address from my DNS):

[Screenshot of the generated ingress]

Any ideas what I'm missing?

Upvotes: 5

Views: 11410

Answers (4)

k_o_

Reputation: 6298

In my case the problem arose because my IP address was already bound to a different Nginx controller in a different GKE cluster. The ingress-nginx-controller could therefore never attach to the IP, so the cert-manager challenge always hit a 404 in the other Nginx environment. After I released the IP, it worked. I'm using a standard cluster.

Upvotes: 1

JGutierrezC

Reputation: 4533

For me, it was because the DNS record was pointing at staging instead of production.

Run curl -v https://yourdomainnamehere.com and check whether the domain name matches the certificate subject.
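The DNS side of this can also be checked directly. Below is a minimal sketch, assuming you already know the external IP your production ingress should have. The function names are made up for illustration, and the resolver is passed in as a parameter only so the comparison can be tried without a network.

```python
import socket
from typing import Callable, List


def resolve_ipv4(hostname: str) -> List[str]:
    """Return the IPv4 addresses the local resolver reports for hostname."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return addresses


def points_at(hostname: str, expected_ip: str,
              resolver: Callable[[str], List[str]] = resolve_ipv4) -> bool:
    """True if hostname resolves to expected_ip (hypothetical helper)."""
    return expected_ip in resolver(hostname)
```

Calling points_at("yourdomainnamehere.com", "<your ingress IP>") returns False when the record still points at another environment, which is the situation described above. Keep in mind that the local resolver may serve a cached answer.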

Upvotes: 0

Steven Yong

Reputation: 5446

I spent a few hours troubleshooting this.

I had the same issue. I am using the nginx ingress controller, and the ingress is managed by ArgoCD with an automatic sync policy. I had put this annotation in my ingress manifest:

acme.cert-manager.io/http01-edit-in-place: "true"

This caused the challenge route to be replaced right after it was added to the ingress, presumably because the automatic sync reverted the ingress to match the manifest.

So when the challenge path was requested, the request went to my application pod instead of the ACME HTTP solver pod. My pod does not have that route, so it returned a 404, hence the error Waiting for HTTP-01 challenge propagation: wrong status code '404', expected '200'

Checking the ingress controller logs helped: it was only when I saw that the checks were going to my service pod that I realised what was happening.

An odd case, but I hope it still helps someone out there.

Upvotes: 0

DanF

Reputation: 364

I believe the issue is a routing issue rather than a network issue.

When you query

curl http://api.lumiwealth.com/.well-known/acme-challenge/vhoLg-lNAgXAwEJlknfBbRlYuKuHBakgeG_d40c09Zk

This request does work: it crosses the public internet and reaches your cluster. Once there, it tries to fetch the challenge file inside the cluster, and that is the step that returns the 404.

Would you kindly check the output of

kubectl get challenges -A

to make sure that there is only one set of challenges? If there are more, you may want to delete everything and start over.

So all you have to do is modify your ingress routes to capture the path

/.well-known/acme-challenge/*

This has to be routed to the ACME solver pod/service within your cluster.
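To illustrate why the missing route produces a 404 from your own application, here is a simplified sketch of prefix-based routing in which the longest matching path wins. This is a toy model, not the actual logic of ingress-gce or any other controller, and the backend names api-service and acme-solver are placeholders.

```python
from typing import List, Tuple

# (path prefix, backend name) pairs; the backend names are hypothetical.
Rule = Tuple[str, str]


def prefix_matches(prefix: str, path: str) -> bool:
    """Match whole path elements, so /foo matches /foo/bar but not /foobar."""
    if prefix == "/":
        return True
    p = prefix.rstrip("/")
    return path == p or path.startswith(p + "/")


def route(rules: List[Rule], path: str) -> str:
    """Return the backend of the longest matching prefix."""
    matches = [(prefix, backend) for prefix, backend in rules
               if prefix_matches(prefix, path)]
    if not matches:
        return "default-backend"
    return max(matches, key=lambda m: len(m[0]))[1]


challenge = "/.well-known/acme-challenge/some-token"
only_app = [("/", "api-service")]
with_solver = only_app + [("/.well-known/acme-challenge", "acme-solver")]

print(route(only_app, challenge))     # api-service
print(route(with_solver, challenge))  # acme-solver
```

With only the catch-all rule, the challenge request lands on the application, which has no such route and answers 404. Once a more specific rule exists, the same request goes to the solver.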

The basic troubleshooting steps for HTTP-01 (Docs):

  • Check that you can access the URL from the public internet.
  • Check that the ACME solver pod is up and running.
  • Use kubectl describe ingress to check the status of the HTTP01 solver ingress (unless you use acme.cert-manager.io/http01-edit-in-place, in which case check the same ingress as your domain).
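The first check can be scripted. The sketch below fetches the challenge URL and classifies the response. It relies on the HTTP-01 response body being the key authorization, which starts with the token followed by a dot. The function names and the wording of the messages are my own, not anything cert-manager prints.

```python
import urllib.error
import urllib.request
from typing import Tuple


def fetch(url: str, timeout: float = 10.0) -> Tuple[int, str]:
    """GET url and return (status, body), also for 4xx/5xx responses.

    Connection-level failures (DNS, refused, timeout) are not caught here.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", "replace")


def diagnose(status: int, body: str, token: str) -> str:
    """Classify a response to /.well-known/acme-challenge/<token>."""
    if status == 200 and body.strip().startswith(token + "."):
        return "ok: the solver answered"
    if status == 200:
        return "200, but the body is not the expected key authorization"
    if status == 404:
        return "404: something answered, but not the solver (check routing)"
    return f"unexpected status {status}"
```

For example, diagnose(*fetch(url), token) with the URL and token from the question reports the 404 case, because the "Cannot GET" page comes from a server that is reachable but does not serve the challenge.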

Upvotes: 1
