Reputation: 43
I'm trying to expose the built-in metrics of Flink to Prometheus, but somehow Prometheus doesn't recognize the targets - both the JMX as well as the PrometheusReporter.
The scraping defined in prometheus.yml looks like this:
scrape_configs:
  - job_name: node
    static_configs:
      - targets: ['localhost:9100']
  - job_name: 'kafka-server'
    static_configs:
      - targets: ['localhost:7071']
  - job_name: 'flink-jmx'
    static_configs:
      - targets: ['localhost:8789']
  - job_name: 'flink-prom'
    static_configs:
      - targets: ['localhost:9249']
And my flink-conf.yml has the following lines:
#metrics.reporters: jmx, prom
metrics.reporters: jmx, prometheus
#metrics.reporter.jmx.factory.class: org.apache.flink.metrics.jmx.JMXReporterFactory
metrics.reporter.jmx.class: org.apache.flink.metrics.jmx.JMXReporter
metrics.reporter.jmx.port: 8789
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9249
However, both Flink targets are down when I run a WordCount job in either of these ways:
java -jar target/flink-word-count.jar --input src/main/resources/loremipsum.txt
flink run target/flink-word-count.jar --input src/main/resources/loremipsum.txt
According to the Flink docs, I don't need any additional dependencies for JMX, and I only need a copy of the provided flink-metrics-prometheus-1.10.0.jar in flink/lib/ for the Prometheus reporter.
What am I doing wrong? What is missing?
Upvotes: 0
Views: 3873
Reputation: 43439
That particular job is going to run to completion pretty quickly, I believe. Once you get the setup working there may be no interesting metrics because the job doesn't run long enough for anything to show up.
When you run with a mini-cluster (as java -jar ...), the flink-conf.yaml file isn't loaded (unless you've done something rather special in your job to get it loaded). Note also that this file normally has a .yaml extension; I'm not sure if it works if .yml is used instead.
You can check the job manager and task manager logs to make sure that the reporters are being loaded.
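If you'd rather not read the logs by hand, here is a rough Python sketch of that check. It only filters for lines that mention "reporter" (case-insensitive); it does not assume any particular Flink log message. The helper names (reporter_lines, scan_log_dir) and the idea of passing the log directory on the command line are my own assumptions, not anything Flink provides.

```python
import sys
from pathlib import Path
from typing import Dict, List


def reporter_lines(log_text: str) -> List[str]:
    """Return the lines of a log that mention a reporter (case-insensitive)."""
    return [line for line in log_text.splitlines() if "reporter" in line.lower()]


def scan_log_dir(log_dir: str) -> Dict[str, List[str]]:
    """Map each *.log file in log_dir to its reporter-related lines."""
    return {
        path.name: reporter_lines(path.read_text(errors="replace"))
        for path in sorted(Path(log_dir).glob("*.log"))
    }


if __name__ == "__main__":
    # Hypothetical usage: python check_reporters.py /path/to/flink/log
    for log_dir in sys.argv[1:]:
        for name, lines in scan_log_dir(log_dir).items():
            print(f"{name}: {len(lines)} reporter line(s)")
            for line in lines:
                print("  " + line)
```

If a job manager or task manager log shows no reporter lines at all, that suggests the reporter configuration was never picked up by that process.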
FWIW, the last time I did this I used this setup, so that I could scrape from multiple processes:
# flink-conf.yaml
metrics.reporters: prom
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9250-9260
# prometheus.yml
scrape_configs:
  - job_name: 'flink'
    static_configs:
      - targets: ['localhost:9250', 'localhost:9251']
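With a port range, each Flink process on the host takes a free port from the range, so it isn't always obvious which ports to list as targets. Below is a small Python sketch that probes every port in the range and reports the ones answering on /metrics (the default path Prometheus scrapes). The function names and the one-second timeout are assumptions of mine, and it presumes the reporters are reachable on localhost.

```python
import http.client
import urllib.error
import urllib.request
from typing import List


def expand_port_range(spec: str) -> List[int]:
    """Expand a spec like '9250-9260' (or a single '9249') into a list of ports."""
    first, _, last = spec.partition("-")
    start = int(first)
    end = int(last) if last else start
    return list(range(start, end + 1))


def is_serving_metrics(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if http://host:port/metrics answers with HTTP 200."""
    url = f"http://{host}:{port}/metrics"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def live_targets(host: str, spec: str) -> List[str]:
    """List 'host:port' entries in the range that currently serve metrics."""
    return [
        f"{host}:{port}"
        for port in expand_port_range(spec)
        if is_serving_metrics(host, port)
    ]


if __name__ == "__main__":
    # Prints the targets you could paste into prometheus.yml.
    print(live_targets("localhost", "9250-9260"))
```

An empty list while the cluster is running would point back at the reporter not being loaded rather than at the Prometheus side.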
Upvotes: 2