Reputation: 7216
I just set up cert-manager on Kubernetes in GCP, but when I check my logs I get this error:
cert-manager/challenges "msg"="propagation check failed" "error"="wrong status code '404', expected '200'" "dnsName"="api.lumiwealth.com" "resource_kind"="Challenge" "resource_name"="test-certificate-h4m8c-1804713970-576085961" "resource_namespace"="backend" "resource_version"="v1" "type"="HTTP-01"
From what I can tell the issue is that the ingress that gets created does not have access to the external internet. I confirmed this by running this in Terminal:
curl http://api.lumiwealth.com/.well-known/acme-challenge/vhoLg-lNAgXAwEJlknfBbRlYuKuHBakgeG_d40c09Zk
Which returns:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Error</title>
</head>
<body>
<pre>Cannot GET /.well-known/acme-challenge/vhoLg-lNAgXAwEJlknfBbRlYuKuHBakgeG_d40c09Zk</pre>
</body>
</html>
Here are my YAML files:
Issuer:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: "[email protected]"
    privateKeySecretRef:
      name: letsencrypt-prod
    server: "https://acme-v02.api.letsencrypt.org/directory"
    solvers:
    - http01:
        ingress:
          class: ingress-gce
Test certificate:
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: test-certificate
  namespace: backend
spec:
  secretName: certificate-test
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
  - api.lumiwealth.com
When I kubectl apply the certificate, it creates an ingress in GCP, but that ingress doesn't seem to have network access. I'm also not sure how it could have gotten the IP address from my DNS.
Any ideas what I'm missing?
Upvotes: 5
Views: 11410
Reputation: 6298
In my case the problem arose because my IP address was already bound to a different Nginx controller in a different GKE cluster. As a result, the ingress-nginx-controller could never attach to the IP, so the cert-manager challenge always hit a 404 in the other Nginx environment. After I released the IP, it worked. I'm using a Standard cluster.
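To spot this kind of mismatch, it can help to compare the IP your hostname resolves to with the external IP the controller's Service actually got. This is only a rough sketch: the function name is made up, and the namespace and Service name in the usage line are the ingress-nginx defaults, which may differ in your install.

```shell
# Compare the IP a hostname resolves to with the external IP of a
# LoadBalancer Service. Returns success only when both match.
# Usage: compare_dns_to_lb <hostname> <namespace> <service>
compare_dns_to_lb() {
    host="$1"; ns="$2"; svc="$3"
    # Last line of "dig +short" is the address when a CNAME is involved.
    dns_ip="$(dig +short "$host" | tail -n 1)"
    # Empty while the load balancer is still pending or cannot bind the IP.
    lb_ip="$(kubectl get service "$svc" -n "$ns" \
        -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
    echo "DNS resolves to: ${dns_ip:-<none>}"
    echo "Service IP:      ${lb_ip:-<pending>}"
    [ -n "$dns_ip" ] && [ "$dns_ip" = "$lb_ip" ]
}
```

For example: compare_dns_to_lb yourdomainnamehere.com ingress-nginx ingress-nginx-controller. If the two addresses differ, or the Service IP stays pending, the challenge requests are probably landing somewhere other than this cluster.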
Upvotes: 1
Reputation: 4533
For me it was because the DNS record was pointing at staging instead of production.
Run curl -v https://yourdomainnamehere.com
and check whether the domain name matches the certificate subject.
Upvotes: 0
Reputation: 5446
I spent a few hours troubleshooting this.
I had the same issue. I am using the nginx ingress controller, and the ingress is managed by ArgoCD with an automatic Sync Policy. I had put this line in my ingress manifest:
acme.cert-manager.io/http01-edit-in-place: "true"
This caused the challenge route to be replaced right after it was added to the ingress, presumably because ArgoCD's automatic sync reverted the in-place edit back to the state in my manifest.
So when Let's Encrypt hit that path, the request went to my application pod instead of the ACME HTTP solver pod. My pod doesn't have that route, so it returned a 404, hence the error Waiting for HTTP-01 challenge propagation: wrong status code '404', expected '200'
Checking the ingress controller logs helps, but I only understood the problem once I noticed the checks were arriving at my service pod.
It's an odd case, but I hope it still helps someone out there.
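If you suspect the same thing, you can check whether the challenge path is still present on the ingress while the challenge is pending. A minimal sketch, assuming a networking.k8s.io/v1 Ingress; the function names and the ingress name my-ingress are hypothetical:

```shell
# Print each path of an Ingress together with the Service it routes to.
# Usage: ingress_routes <namespace> <ingress-name>
ingress_routes() {
    ns="$1"; name="$2"
    kubectl get ingress "$name" -n "$ns" \
        -o jsonpath='{range .spec.rules[*].http.paths[*]}{.path}{" -> "}{.backend.service.name}{"\n"}{end}'
}

# Succeed only if some path still carries the ACME challenge prefix.
# Usage: has_challenge_route <namespace> <ingress-name>
has_challenge_route() {
    ingress_routes "$1" "$2" | grep -qF '/.well-known/acme-challenge/'
}
```

Running has_challenge_route backend my-ingress a few times in a row while the challenge is pending should show whether the route appears and then disappears again after a sync.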
Upvotes: 0
Reputation: 364
I believe the issue is a routing issue rather than a network issue.
When you query
curl http://api.lumiwealth.com/.well-known/acme-challenge/vhoLg-lNAgXAwEJlknfBbRlYuKuHBakgeG_d40c09Zk
This does indeed work: the request crosses the broader internet and reaches your cluster. What the query then does is try to access the challenge file within the cluster, and that is the part that fails.
Would you kindly check the output of
kubectl get challenges -A
to make sure that there is only one set of challenges? If there are more, you may want to delete everything and start over.
So all you have to do is modify your ingress routes to capture the route
.well-known/acme-challenge/*
This has to be routed to the ACME solver pod/service within your cluster.
See the basic troubleshooting steps for HTTP-01 in the cert-manager docs.
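To repeat the propagation check by hand, something along these lines prints only the status code for a challenge URL. It is a small sketch, not what cert-manager itself runs, and the function name is made up for illustration:

```shell
# Print the HTTP status code returned for an HTTP-01 challenge URL.
# Usage: challenge_status <hostname> <token>
challenge_status() {
    host="$1"; token="$2"
    # -s: quiet, -o /dev/null: discard the body, -w: print only the status code
    curl -s -o /dev/null -w '%{http_code}' \
        "http://${host}/.well-known/acme-challenge/${token}"
}
```

The token should be readable from the Challenge resource, for example with kubectl get challenge <name> -n backend -o jsonpath='{.spec.token}'. A 200 means the request reached the solver; a 404 means some other backend answered that path.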
Upvotes: 1