Seidz_q

Reputation: 11

Why can't Prometheus blackbox exporter verify a TLS endpoint's self-signed certificate? Details below.

We have an OpenShift cluster in which the Prometheus operator monitoring stack is installed. We would like to probe the actuator/health endpoints of Spring Boot applications using blackbox exporter.

Here's what I've done so far:

I deployed blackbox exporter in the namespace we use for the Prometheus operator. The Service and ConfigMap are in place, an http_2xx module is defined in the ConfigMap, and the exporter is running. I have two namespaces (projects), each with one instance of the same application deployed. I created a Probe in one namespace and a ServiceMonitor in the other. The Probe uses a staticConfig target, while the ServiceMonitor selects its targets dynamically via labels.

My problem is that every probe attempt fails.

The serviceMonitor log says the following:

    level=info msg="Invalid HTTP response status code, wanted 2xx" status_code=400

I'm pretty sure this happens because these are https endpoints, but if I add a "scheme: https" line to the serviceMonitor config it just doesn't work.

The Probe says the following:

    level=error msg="Error for HTTP request" err="Get \"https://appIP:port/actuator/health\": tls: failed to verify certificate: x509: certificate signed by unknown authority"

So far I have only tried to make the Probe work; I have no clue what to do with the serviceMonitor.

I tried giving the Probe a service CA to work with, which did not work. I also gave it the cert and key used by the app; that did not work either and produced the same error.

Any idea what I should do? Configs below.

You'll notice the Probe config does not have a ca object right now, but with one it produced the same log.

I'd really appreciate if someone could help me sort this out, it's driving me crazy :D

(Note: setting "tlsConfig: insecureSkipVerify: true" does not skip the verification, which is weird.)

Blackbox exporter yaml:

data:
  blackbox.yaml: |
    modules:
      http_2xx:
        http:
          no_follow_redirects: true
          method: GET
          preferred_ip_protocol: ip4
          valid_http_versions:
          - HTTP/1.1
          - HTTP/2
          valid_status_codes: []
          tls_config:
            insecure_skip_verify: true
        prober: http
        timeout: 10s

serviceMonitor yaml:

spec:
  endpoints:
    - interval: 30s
      params:
        module:
          - http_2xx
      path: /probe
      relabelings:
        - action: replace
          sourceLabels:
            - __address__
          targetLabel: __param_target
        - action: replace
          replacement: 'exporter:port'
          targetLabel: __address__
        - action: replace
          sourceLabels:
            - __param_target
          targetLabel: instance
        - action: labelmap
          regex: __meta_kubernetes_service_label_(.+)
      scrapeTimeout: 10s
  jobLabel: jobLabel
  selector:
    matchLabels:
      app.kubernetes.io/component: component
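
The 400 is consistent with the exporter receiving a bare `ip:port` target: when `__param_target` carries no scheme, blackbox exporter probes over plain HTTP, which a TLS-only port typically rejects. The `scheme` field on a ServiceMonitor endpoint applies to Prometheus scraping the exporter, not to the probe the exporter makes. A minimal sketch of relabeling that puts the scheme and path into the target itself might look like the following. It assumes the discovered `__address__` is the application's HTTPS port and that `/actuator/health` is the path to check; `exporter:port` is the same placeholder as above.

```yaml
# Sketch only: a possible replacement for the relabelings in the ServiceMonitor above.
relabelings:
  # Build a full URL so the exporter probes over HTTPS instead of plain HTTP.
  - action: replace
    sourceLabels:
      - __address__
    regex: (.*)
    replacement: 'https://$1/actuator/health'
    targetLabel: __param_target
  # Send the scrape itself to the blackbox exporter (placeholder address).
  - action: replace
    replacement: 'exporter:port'
    targetLabel: __address__
  # Keep the probed URL as the instance label.
  - action: replace
    sourceLabels:
      - __param_target
    targetLabel: instance
```

With this in place the probe would still need a module whose TLS settings accept the application's certificate.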

Probe yaml:

spec:
  interval: 30s
  module: http_2xx
  prober:
    path: /probe
    url: 'exporter.namespace.svc:port'
  targets:
    staticConfig:
      static:
        - 'https://app.namespace.svc:port/actuator/health'
  tlsConfig:
    cert:
      secret:
        key: key
        name: secret-name
    keySecret:
      key: key
      name: secret-name

Manually invoking blackbox exporter says this:

Logs for the probe:
ts=2023-12-07T10:24:46.576847865Z caller=main.go:181 module=http_2xx target=https://app.namespace.svc:port level=info msg="Beginning probe" probe=http timeout_seconds=119.5
ts=2023-12-07T10:24:46.576945405Z caller=http.go:328 module=http_2xx target=https://app.namespace.svc:port level=info msg="Resolving target address" target=app.namespace.svc ip_protocol=ip4
ts=2023-12-07T10:24:46.615450737Z caller=http.go:328 module=http_2xx target=https://app.namespace.svc:port level=info msg="Resolved target address" target=app.namespace.svc ip=IP_of_service
ts=2023-12-07T10:24:46.615543908Z caller=client.go:252 module=http_2xx target=https://app.namespace.svc:port level=info msg="Making HTTP request" url=https://IPaddress:port host=app.namespace.svc:port
ts=2023-12-07T10:24:46.624148963Z caller=handler.go:120 module=http_2xx target=https://app.namespace.svc:port level=error msg="Error for HTTP request" err="Get \"https://IPaddress:port\": tls: failed to verify certificate: x509: certificate signed by unknown authority"
ts=2023-12-07T10:24:46.624187979Z caller=handler.go:120 module=http_2xx target=https://app.namespace.svc:port level=info msg="Response timings for roundtrip" roundtrip=0 start=2023-12-07T10:24:46.618548821Z dnsDone=2023-12-07T10:24:46.618548821Z connectDone=2023-12-07T10:24:46.619955324Z gotConn=0001-01-01T00:00:00Z responseStart=0001-01-01T00:00:00Z tlsStart=2023-12-07T10:24:46.619998796Z tlsDone=2023-12-07T10:24:46.624134551Z end=0001-01-01T00:00:00Z
ts=2023-12-07T10:24:46.62420857Z caller=main.go:181 module=http_2xx target=https://app.namespace.svc:port level=error msg="Probe failed" duration_seconds=0.047321017

Upvotes: 1

Views: 2491

Answers (1)

user23328741

Reputation: 1

I faced a similar issue:

    Get "https://IPaddress:port": tls: failed to verify certificate: x509: certificate signed by unknown authority

I solved it by:

  1. Adding the missing root and intermediate certificates to the /etc/pki/ca-trust/source/anchors/ directory
  2. Running update-ca-trust to rebuild the system trust bundle
  3. Pointing the module's tls_config at the resulting CA bundle in the blackbox.yml configuration file, as below
    https_2xx:
        prober: http
        timeout: 5s
        http:
          valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
          valid_status_codes: []  # Defaults to 2xx
          method: GET
          fail_if_ssl: false
          fail_if_not_ssl: true
          preferred_ip_protocol: "ip4" # defaults to "ip6"
          ip_protocol_fallback: false  # no fallback to "ip6"
          tls_config:
            ca_file: /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
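
When blackbox exporter runs as a pod, as in the question, steps 1 and 2 have no direct equivalent unless the image is rebuilt. One option is to mount the CA bundle into the container and point `ca_file` at it. The following is a sketch under assumptions: the mount path `/etc/blackbox/ca/ca.crt` is hypothetical, and the bundle must contain the CA that actually signed the application's certificate.

```yaml
modules:
  https_2xx:
    prober: http
    timeout: 5s
    http:
      method: GET
      fail_if_not_ssl: true
      preferred_ip_protocol: ip4
      tls_config:
        # Hypothetical path: wherever the CA bundle is mounted in the exporter pod.
        ca_file: /etc/blackbox/ca/ca.crt
```

The exporter has to reload its configuration to pick up the change, and the Probe or ServiceMonitor has to reference `https_2xx` as its module.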
    

Upvotes: 0
