PhotonTamer

Reputation: 1137

Nginx Ingress Controller - Failed Calling Webhook

I set up a k8s cluster using kubeadm (v1.18) on an Ubuntu virtual machine. Now I need to add an Ingress Controller. I decided on nginx (but I'm open to other solutions). I installed it according to the docs, section "bare-metal":

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-0.31.1/deploy/static/provider/baremetal/deploy.yaml

The installation seems fine to me:

kubectl get all -n ingress-nginx
NAME                                            READY   STATUS      RESTARTS   AGE
pod/ingress-nginx-admission-create-b8smg        0/1     Completed   0          8m21s
pod/ingress-nginx-admission-patch-6nbjb         0/1     Completed   1          8m21s
pod/ingress-nginx-controller-78f6c57f64-m89n8   1/1     Running     0          8m31s

NAME                                         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
service/ingress-nginx-controller             NodePort    10.107.152.204   <none>        80:32367/TCP,443:31480/TCP   8m31s
service/ingress-nginx-controller-admission   ClusterIP   10.110.191.169   <none>        443/TCP                      8m31s

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ingress-nginx-controller   1/1     1            1           8m31s

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/ingress-nginx-controller-78f6c57f64   1         1         1       8m31s

NAME                                       COMPLETIONS   DURATION   AGE
job.batch/ingress-nginx-admission-create   1/1           2s         8m31s
job.batch/ingress-nginx-admission-patch    1/1           3s         8m31s

However, when trying to apply a custom Ingress, I get the following error:

Error from server (InternalError): error when creating "yaml/xxx/xxx-ingress.yaml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://ingress-nginx-controller-admission.ingress-nginx.svc:443/extensions/v1beta1/ingresses?timeout=30s: Temporary Redirect

Any idea what could be wrong?

I suspected DNS, but other NodePort services are working as expected and DNS works within the cluster.

The only thing I can see is that I don't have a default-http-backend which is mentioned in the docs here. However, this seems normal in my case, according to this thread.

Last but not least, I also tried the installation with manifests (after removing the ingress-nginx namespace from the previous installation) and the installation via Helm chart, with the same result.

I'm pretty much a beginner on k8s and this is my playground-cluster. So I'm open to alternative solutions as well, as long as I don't need to set up the whole cluster from scratch.

Update: With "applying custom Ingress", I mean: kubectl apply -f <myIngress.yaml>

Content of myIngress.yaml

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - http:
      paths:
      - path: /someroute/fittingmyneeds
        pathType: Prefix
        backend:
          serviceName: some-service
          servicePort: 5000

Upvotes: 86

Views: 166726

Answers (18)

J K

Reputation: 1677

In my case I'd mixed the installations up. I resolved the issue by executing the following steps:

$ kubectl get validatingwebhookconfigurations 

I iterated through the list of configurations returned by the above step and deleted each one using

$ kubectl delete validatingwebhookconfigurations [configuration-name]
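If several ingress-nginx leftovers exist, a one-liner like the following removes just the ingress-nginx ones. This is only a sketch - review the list first, since deleting unrelated webhook configurations can break other components:

$ kubectl get validatingwebhookconfigurations -o name | grep ingress-nginx | xargs kubectl delete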

Upvotes: 67

Soheil

Reputation: 1

Just use v1 instead of v1beta1 in deploy.yaml
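For example, after downloading deploy.yaml, a hedged way to make that change before applying it (this only bumps the admissionregistration group; the webhook's apiVersions rule may also need a v1 entry, as described in Oleg Konstantinov's answer):

sed -i 's#admissionregistration.k8s.io/v1beta1#admissionregistration.k8s.io/v1#' deploy.yaml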

Upvotes: 0

user3258557

Reputation: 31

In my case it was iptables not allowing 8443 through the node firewall.

After looking it over and over, I realized that the nginx admission service was targeting the external IP addresses of my nodes, translating the request from 443 via the service to 8443 on the external node IP. I dumped the live iptables state (so I didn't break any existing rules), added the required 8443 ACCEPT line, and restored the config.

Don't forget, if you have a main iptables file that loads on startup, to add the rule there too:

# iptables-save > ~/iptables-current
# nano iptables-current  # add to the existing initial chain

  -A INPUT -p tcp -m conntrack --ctstate NEW -m tcp --dport 8443 -j ACCEPT -m comment --comment "Ingress controller webhooks  (HTTPS)."

# iptables-restore < ~/iptables-current
# nano /etc/sysconfig/iptables     # add to the existing initial chain

  -A INPUT -p tcp -m conntrack --ctstate NEW -m tcp --dport 8443 -j ACCEPT -m comment --comment "Ingress controller webhooks (HTTPS)."

Another possible solution is to whitelist your nodes' IPs (or their range), so that you do not have to expose the port outside the worker node pool while still allowing communication within the k8s network.

Same procedure as above, only using something like this:

-A INPUT -p tcp -d 10.10.10.160/29 -j ACCEPT

Upvotes: 0

Pradeep Gangawane

Reputation: 1

I solved this by removing the ValidatingWebhookConfiguration that was created by a previous ingress-nginx deployment. Only one such ValidatingWebhookConfiguration should exist, otherwise they will conflict with each other. Please check below:

Error from server (InternalError): error when creating "ingress.yaml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": failed to call webhook: Post "https://sr-nginx-ingress-nginx-controller-admission.ingress-nginx.svc:443/networking/v1/ingresses?timeout=10s": service "sr-nginx-ingress-nginx-controller-admission" not found

pradip@linux:~$ kubectl get -A ValidatingWebhookConfiguration
NAME                               WEBHOOKS   AGE
aks-node-validating-webhook        1          11d
ingress-nginx-admission            1          3h47m
sr-nginx-ingress-nginx-admission   1          11d

pradip@linux:~$ kubectl delete ValidatingWebhookConfiguration sr-nginx-ingress-nginx-admission
validatingwebhookconfiguration.admissionregistration.k8s.io "sr-nginx-ingress-nginx-admission" deleted

pradip@linux:~$ kubectl apply -f ingress.yaml
ingress.networking.k8s.io/ingress-reddit-app created

Upvotes: 0

Mauricio

Reputation: 3079

In my case I didn't need to delete the ValidatingWebhookConfiguration. The issue was that I was using a private cluster on GCP, version 1.17.14-gke.1600. For GKE private clusters, the validating webhook is reached via port 8443, and in my case it was failing because that port was not allowed through the firewall. The current workaround for this, as recommended by Google itself (but very poorly documented), is to add a firewall rule on GCP that allows inbound (Ingress) TCP requests on port 8443 between the master and the nodes in the cluster, so that the validating webhook API can be reached on that port.

As to how to create the rule, this is how I did it:

  1. Went to Firewall Rules and added a new one.
  2. At the field Network I selected the VPC from which my cluster is.
  3. Direction of traffic I set as Ingress
  4. Action on match to Allow
  5. Targets to Specified target tags
  6. The Target tags can be found on the master node details in a property called Network tags. To find it, I opened a new window, went to my cluster node pools, found the master node pool. Then entered one of the nodes to look for the Virtual Machine details. There I found Network Tags. Copied its value and went back to the Firewall Rule form.
  7. Pasted the copied network tag to the tag field
  8. At Protocols and ports, checked the Specified protocols and ports
  9. Then checked TCP and placed 8443
  10. Saved the rule and applied the manifest again.

NOTE: Most threads out there will say it's the port 9443. It may work. But I first attempted 8443 since it was reported to work on this thread. It worked for me so I didn't even try 9443.
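For reference, roughly the same rule can be created from the command line. This is only a sketch - the rule name, network, target tag and master CIDR below are placeholders that must be replaced with your cluster's values:

gcloud compute firewall-rules create allow-master-to-webhook-8443 \
    --network=YOUR_VPC \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:8443 \
    --source-ranges=MASTER_IPV4_CIDR \
    --target-tags=NODE_NETWORK_TAG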

Upvotes: 27

110100100

Reputation: 199

To add a Terraform example for GCP, extending @Mauricio's answer:

resource "google_container_cluster" "primary" {
...
}


resource "google_compute_firewall" "validate_nginx" {
  project = local.project
  name    = "validate-nginx"
  network = google_compute_network.vpc.name
  allow {
    protocol = "tcp"
    ports    = ["8443"]
  }
  direction = "INGRESS"
  source_ranges = [google_container_cluster.primary.private_cluster_config[0].master_ipv4_cidr_block]
}

Upvotes: 1

xirehat

Reputation: 1629

This is a solution for those using a GKE cluster.

I tested two ways to fix this issue.

  • Terraform
  • GCP Console

Terraform

resource "google_compute_firewall" "validate-nginx" {
  project = "${YOUR_PROJECT_ID}"
  name    = "access-master-to-validatenginx"
  network = "${YOUR_NETWORK}"
  
  allow {
    protocol = "tcp"
    ports    = ["8443"]
  }

  target_tags   = ["${NODE_NETWORK_TAG}"]
  source_ranges = ["${CONTROL_PLANE_ADDRESS_RANGE}"]
}

GCP Console

(screenshot of the equivalent firewall rule created in the GCP Console)

Upvotes: 0

Adiii

Reputation: 59896

In my case, it was the AWS EKS module, which now comes with a hardened security group. But nginx-ingress requires the cluster to communicate with the ingress controller, so I had to whitelist the port below in the node security group:

  node_security_group_additional_rules = {
    cluster_to_node = {
      description      = "Cluster to ingress-nginx webhook"
      protocol         = "-1"
      from_port        = 8443
      to_port          = 8443
      type             = "ingress"
      source_cluster_security_group = true
    }
  }

input_node_security_group_additional_rules

Upvotes: 4

philo

Reputation: 141

If using Terraform and Helm, disable the validating webhook:

resource "helm_release" "nginx_ingress" {

...

  set {
    name  = "controller.admissionWebhooks.enabled"
    value = "false"
  }

...

}

Upvotes: 3

Friedrich Brunzema

Reputation: 303

I had this error. Basically, I have a script that installs the nginx controller with Helm and then immediately installs an application that uses Ingress, also with Helm. That app install failed, but only the Ingress part.

The solution was to wait 60 seconds after installing the nginx controller, to give the admission webhook time to come up and be ready.
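Instead of a fixed sleep, something like the following should also work (a sketch, assuming the default names from the ingress-nginx manifests); it blocks until the controller deployment that serves the admission webhook reports Available:

kubectl wait --namespace ingress-nginx \
    --for=condition=Available deployment/ingress-nginx-controller \
    --timeout=120s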

Upvotes: 1

Gilad Sharaby

Reputation: 998

This might be because of a previous nginx-ingress-controller configuration.
You can try running the following command:

kubectl delete -A ValidatingWebhookConfiguration ingress-nginx-admission

Upvotes: 23

systemBuilder

Reputation: 37

I was bringing up a cluster with a known-good configuration; another had been created just last week in essentially the same way. My error message was a little more specific about what failed in the webhook:

Error: Failed to create Ingress 'auth-system/alertmanager-oauth2-proxy'
because: Internal error occurred: failed calling webhook
"validate.nginx.ingress.kubernetes.io": Post
"https://nginx-nginx-ingress-controller-controller-admission.ingress-nginx.svc:443/networking/v1beta1/ingresses?timeout=10s":
x509: certificate signed by unknown authority

It turns out that one of my many configs had a typo in the DNS names passed to the nginx installation. So nginx thought it had one domain name, but it got a certificate for a slightly different DNS name, which caused the validating webhook to fail.

The solution was not to delete the hook, but to fix the underlying DNS config in nginx so that it matched its X.509 certificate domain.
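A quick way to check for this kind of mismatch (a sketch, assuming the default secret name ingress-nginx-admission with a cert key; a Helm install may use different names) is to dump the webhook certificate and compare its subject and SANs with the admission service DNS name:

kubectl -n ingress-nginx get secret ingress-nginx-admission -o jsonpath='{.data.cert}' \
    | base64 -d | openssl x509 -noout -text | grep -A1 'Subject Alternative Name'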

Upvotes: 0

MichaelMoser

Reputation: 3500

What worked for me was to increase the timeout while waiting for the ingress controller to come up.
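For example, if the controller is installed with Helm, one hedged way to do this is to let Helm itself wait longer for the release (including the webhook) to become ready; the release and chart names below are illustrative:

helm install ingress-nginx ingress-nginx/ingress-nginx \
    --namespace ingress-nginx --create-namespace \
    --wait --timeout 10m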

Upvotes: 0

petermicuch

Reputation: 136

I am not sure if this helps this late, but could it be that your cluster is behind a proxy? In that case you have to have no_proxy configured correctly. Specifically, it has to include .svc,.cluster.local, otherwise validation webhook requests such as https://ingress-nginx-controller-admission.ingress-nginx.svc:443/extensions/v1beta1/ingresses?timeout=30s will be routed via the proxy server (note the .svc in the URL).

I had exactly this issue and adding .svc to the no_proxy variable helped. You can try this out quickly by modifying the /etc/kubernetes/manifests/kube-apiserver.yaml file, which will in turn automatically recreate your Kubernetes API server pod.

This is not the case just for ingress validation, but also for other things that might refer to a URL in your cluster ending with .svc or .namespace.svc.cluster.local (e.g. see this bug).
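By the way, a quick way to verify whether the API server pod actually carries proxy variables (a sketch, assuming a kubeadm setup where the static pod has the component=kube-apiserver label):

kubectl -n kube-system get pod -l component=kube-apiserver \
    -o jsonpath='{.items[0].spec.containers[0].env}'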

Upvotes: 6

h q

Reputation: 1500

On a bare-metal cluster, I disabled the admission webhooks during the Helm 3 install:

kubectl create ns ingress-nginx

helm install [RELEASE_NAME] ingress-nginx/ingress-nginx -n ingress-nginx --set controller.admissionWebhooks.enabled=false

Upvotes: 7

Oleg Konstantinov

Reputation: 147

I've solved this issue. The problem is that you use Kubernetes version 1.18, but the ValidatingWebhookConfiguration in the current ingress-nginx manifests uses an older API; see the doc: https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#prerequisites

Ensure that the Kubernetes cluster is at least as new as v1.16 (to use admissionregistration.k8s.io/v1), or v1.9 (to use admissionregistration.k8s.io/v1beta1).

And in the current yaml:

# Source: ingress-nginx/templates/admission-webhooks/validating-webhook.yaml
# before changing this value, check the required kubernetes version
# https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#prerequisites
apiVersion: admissionregistration.k8s.io/v1beta1

and in the rules:

apiVersions:
  - v1beta1

So you need to change it to v1:

apiVersion: admissionregistration.k8s.io/v1

and add the rule - v1:

apiVersions:
  - v1beta1
  - v1

After you change it and redeploy, your custom Ingress will deploy successfully.

Upvotes: 12

Patrick Gardella

Reputation: 4441

Another option you have is to remove the Validating Webhook entirely:

kubectl delete -A ValidatingWebhookConfiguration ingress-nginx-admission

I found I had to do that on another issue, but the workaround/solution works here as well.

This isn't the best answer; the best answer is to figure out why this doesn't work. But at some point, you live with workarounds.

I'm installing on Docker for Mac, so I used the cloud rather than baremetal version:

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.34.1/deploy/static/provider/cloud/deploy.yaml

Upvotes: 104

PhotonTamer

Reputation: 1137

Finally, I managed to run ingress-nginx properly by changing the installation method. I still don't understand why the previous installation didn't work, but I'll nevertheless share the solution, along with some more insights into the original problem.

Solution

Uninstall ingress-nginx: delete the ingress-nginx namespace. This does not remove the validating webhook configuration, so delete that one manually. Then install MetalLB and install ingress-nginx again. This time I used the version from the Helm stable repo. Now everything works as expected. Thanks to Long on the Kubernetes Slack channel!
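A rough sketch of those steps as commands (release and chart names are only illustrative, and the Helm stable repo has since been deprecated, so use whatever chart source is current for you):

kubectl delete namespace ingress-nginx
kubectl delete validatingwebhookconfiguration ingress-nginx-admission
# install MetalLB according to its documentation, then reinstall ingress-nginx, e.g.:
kubectl create namespace ingress-nginx
helm repo add stable https://charts.helm.sh/stable   # archived location of the old stable repo
helm install nginx-ingress stable/nginx-ingress --namespace ingress-nginx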

Some more insights into the original problem

The yamls provided by the installation guide contain a ValidatingWebhookConfiguration:

apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  labels:
    helm.sh/chart: ingress-nginx-2.0.3
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.32.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: admission-webhook
  name: ingress-nginx-admission
  namespace: ingress-nginx
webhooks:
  - name: validate.nginx.ingress.kubernetes.io
    rules:
      - apiGroups:
          - extensions
          - networking.k8s.io
        apiVersions:
          - v1beta1
        operations:
          - CREATE
          - UPDATE
        resources:
          - ingresses
    failurePolicy: Fail
    clientConfig:
      service:
        namespace: ingress-nginx
        name: ingress-nginx-controller-admission
        path: /extensions/v1beta1/ingresses

Validation is performed whenever I create or update an Ingress (the content of my ingress.yaml doesn't matter). The validation fails because, when the service is called, the response is a Temporary Redirect. I don't know why. The corresponding service is:

apiVersion: v1
kind: Service
metadata:
  labels:
    helm.sh/chart: ingress-nginx-2.0.3
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.32.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: controller
  name: ingress-nginx-controller-admission
  namespace: ingress-nginx
spec:
  type: ClusterIP
  ports:
    - name: https-webhook
      port: 443
      targetPort: webhook
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/component: controller

The pod matching the selector comes from this deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    helm.sh/chart: ingress-nginx-2.0.3
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.32.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: controller
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
      app.kubernetes.io/instance: ingress-nginx
      app.kubernetes.io/component: controller
  revisionHistoryLimit: 10
  minReadySeconds: 0
  template:
    metadata:
      labels:
        app.kubernetes.io/name: ingress-nginx
        app.kubernetes.io/instance: ingress-nginx
        app.kubernetes.io/component: controller
    spec:
      dnsPolicy: ClusterFirst
      containers:
        - name: controller
          image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.32.0
          imagePullPolicy: IfNotPresent
          lifecycle:
            preStop:
              exec:
                command:
                  - /wait-shutdown
          args:
            - /nginx-ingress-controller
            - --election-id=ingress-controller-leader
            - --ingress-class=nginx
            - --configmap=ingress-nginx/ingress-nginx-controller
            - --validating-webhook=:8443
            - --validating-webhook-certificate=/usr/local/certificates/cert
            - --validating-webhook-key=/usr/local/certificates/key
          securityContext:
            capabilities:
              drop:
                - ALL
              add:
                - NET_BIND_SERVICE
            runAsUser: 101
            allowPrivilegeEscalation: true
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          livenessProbe:
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 1
            successThreshold: 1
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 1
            successThreshold: 1
            failureThreshold: 3
          ports:
            - name: http
              containerPort: 80
              protocol: TCP
            - name: https
              containerPort: 443
              protocol: TCP
            - name: webhook
              containerPort: 8443
              protocol: TCP
          volumeMounts:
            - name: webhook-cert
              mountPath: /usr/local/certificates/
              readOnly: true
          resources:
            requests:
              cpu: 100m
              memory: 90Mi
      serviceAccountName: ingress-nginx
      terminationGracePeriodSeconds: 300
      volumes:
        - name: webhook-cert
          secret:
            secretName: ingress-nginx-admission

Something in this validation chain goes wrong. It would be interesting to know what and why, but I can continue working with my MetalLB solution. Note that this setup does not contain a validating webhook at all.

Upvotes: 9
