HamoriZ

Reputation: 2438

Aggregate Kubernetes liveness probe responses

My application has a /health HTTP endpoint, configured as the Kubernetes liveness probe. The endpoint returns JSON containing the health indicators.

Kubernetes only cares about the returned HTTP status, but I would like to store the JSON responses in Prometheus for monitoring purposes.

Is it possible to capture the response when Kubernetes calls the endpoint? I would rather not add this feature to the application itself and would prefer to use an external component.

What is the recommended way of doing it?

Upvotes: 1

Views: 1341

Answers (1)

Max Lobur

Reputation: 6040

To answer what you asked:

  1. Run a sidecar container that calls localhost:port/health every N seconds and stores the most recent reply. Set N equal to the Prometheus scrape interval for accurate results.

  2. The sidecar then exposes the most recent reply as metrics on a /metrics endpoint, on a separate port of the pod. You could use https://github.com/prometheus/client_python to implement the sidecar. The Prometheus exporter sidecar is a widely used pattern, so searching for it will turn up examples.

  3. Point Prometheus at the Service's /metrics endpoint, which is now served by the sidecar on a separate port. You will need a separate port in the Service object that targets the sidecar port. The scrape interval can be adjusted at this stage to match N, or you can adjust N instead. For scrape_config details, refer to: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#service

    If you need automatic Prometheus target discovery, for example because you have a number of deployments like this and that number varies, refer to my recent answer: https://stackoverflow.com/a/64269434/923620
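
A minimal sketch of such a sidecar, covering steps 1 and 2. To stay dependency-free it uses only the Python standard library and writes the Prometheus text exposition format by hand instead of using client_python. Everything specific here is an assumption for illustration: the app port 8080, the sidecar port 9102, the 15-second interval, the metric name app_health_indicator, and above all the shape of the /health payload, which is assumed to be a flat object such as {"db": "UP", "disk": "DOWN"}.

```python
import json
import threading
import time
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

HEALTH_URL = "http://localhost:8080/health"  # assumed app port
POLL_SECONDS = 15                            # assumed; keep equal to the scrape interval
METRICS_PORT = 9102                          # assumed sidecar port

_latest = {}               # indicator name -> 1.0 (UP) or 0.0 (anything else)
_lock = threading.Lock()


def parse_health(body):
    """Turn an assumed flat payload like {"db": "UP"} into gauge values."""
    doc = json.loads(body)
    return {name: 1.0 if status == "UP" else 0.0 for name, status in doc.items()}


def render_metrics(values):
    """Render the gauges in the Prometheus text exposition format."""
    lines = ["# TYPE app_health_indicator gauge"]
    for name, value in sorted(values.items()):
        # Indicator names are assumed to need no label escaping.
        lines.append(f'app_health_indicator{{indicator="{name}"}} {value}')
    return "\n".join(lines) + "\n"


def fetch_health(url):
    """Return the response body, including the body of a 4xx/5xx reply."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # An unhealthy app may answer 503 and still send the JSON body.
        return err.read().decode("utf-8")


def poll_forever():
    """Step 1: poll /health every POLL_SECONDS and keep the latest reply."""
    while True:
        try:
            values = parse_health(fetch_health(HEALTH_URL))
            with _lock:
                _latest.clear()
                _latest.update(values)
        except (OSError, ValueError, AttributeError):
            pass  # app unreachable or unexpected payload: keep the last values
        time.sleep(POLL_SECONDS)


class MetricsHandler(BaseHTTPRequestHandler):
    """Step 2: serve the most recent reply on /metrics."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        with _lock:
            body = render_metrics(_latest).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run_sidecar():
    """Start the poller in the background and serve /metrics until killed."""
    threading.Thread(target=poll_forever, daemon=True).start()
    HTTPServer(("", METRICS_PORT), MetricsHandler).serve_forever()
```

Calling run_sidecar() from the container entrypoint starts both parts. Encoding each indicator as a 0/1 gauge with an indicator label is only one possible mapping. If your payload is nested or carries numeric details, parse_health is the place to adapt.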

A simpler alternative:

  1. In your app, make sure you log everything you return from /health.
  2. Implement centralized logging (log aggregation): https://kubernetes.io/docs/concepts/cluster-administration/logging/
  3. Use a log processor, e.g. ELK, to query and analyze the results.
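
For step 1 of this alternative, a rough sketch of what the logging could look like, assuming the app is written in Python (the question does not say). The helper names health_log_line and log_health and the field names event, status_code and health are hypothetical. The idea is just to write one JSON object per line to stdout, which is where Kubernetes log collectors usually read from.

```python
import json
import logging
import sys

logger = logging.getLogger("health")


def configure_logging(stream=sys.stdout):
    """Send bare JSON lines to stdout so a log collector can pick them up."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False


def health_log_line(payload, status_code):
    """Build one JSON log line from what /health is about to return."""
    record = {"event": "health_check", "status_code": status_code, "health": payload}
    return json.dumps(record, sort_keys=True)


def log_health(payload, status_code):
    """Call this from the /health handler just before sending the response."""
    logger.info(health_log_line(payload, status_code))
```

With one JSON object per line, the log processor can index the health fields directly instead of parsing free text.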

Upvotes: 3
