Reputation: 7468
I have the following code in my Flink job:
@Override
public void open(Configuration config) {
    this.counter = getRuntimeContext()
        .getMetricGroup()
        .counter("myCounter");
}

@Override
public Tuple2<String, String> map(String s) throws Exception {
    this.counter.inc();
    Thread.sleep(5000);
    return new Tuple2<String, String>(s, s.toUpperCase());
}
In prometheus.yml inside the Prometheus distribution, I have the following:
  - job_name: 'flink-prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9999']
    metrics_path: /
And in flink-conf.yaml inside the Flink distribution:
metrics.reporters: prom
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.host: 127.0.0.1
metrics.reporter.prom.port: 9999
In the Prometheus dashboard, I can see localhost:9999 listed as a target, along with various metrics. But there is no entry for the counter I added in the code. I searched for the strings "myCounter" and "flink-prometheus" and got zero results.
What else do I need to do for my metrics to show up?
Upvotes: 2
Views: 1760
Reputation: 43439
The main difference I see between the example in https://github.com/mbode/flink-prometheus-example and your own config is that the example is scraping the job manager as well as the task manager(s):
scrape_configs:
  - job_name: 'flink'
    static_configs:
      - targets: ['job-cluster:9249', 'taskmanager1:9249', 'taskmanager2:9249']
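In my own example (see Flink Timing Explorer), I found it necessary to do this as well. A counter registered inside an operator is reported by the task manager that runs that operator, so it only shows up if that task manager's reporter endpoint is scraped. When the job manager and a task manager share a host, they cannot both listen on a single fixed port, which is why I configure a port range. Here's what worked for me: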
flink-conf.yaml
metrics.reporters: prom
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9250-9260
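To illustrate why a port range helps when several Flink processes share one host, here is a minimal sketch of "take the first free port in the range". This is not Flink's actual reporter code; the class `PortRangeDemo` and the method `bindFirstFree` are hypothetical names used only for this illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;

final class PortRangeDemo {

    // Tries each port in [from, to] in order and returns a socket bound to
    // the first one that is still free. Throws if every port is taken.
    static ServerSocket bindFirstFree(int from, int to) throws IOException {
        for (int port = from; port <= to; port++) {
            try {
                return new ServerSocket(port, 50, InetAddress.getLoopbackAddress());
            } catch (IOException portInUse) {
                // Port is already bound by another process; try the next one.
            }
        }
        throw new IOException("no free port in range " + from + "-" + to);
    }

    public static void main(String[] args) throws IOException {
        // Two "processes" on the same host asking for the same range
        // end up on two different ports.
        try (ServerSocket first = bindFirstFree(9250, 9260);
             ServerSocket second = bindFirstFree(9250, 9260)) {
            System.out.println("first process  -> " + first.getLocalPort());
            System.out.println("second process -> " + second.getLocalPort());
        }
    }
}
```

With a single fixed port, the second call would have nowhere to go, which matches the situation where only one process's metrics are reachable. Each port that ends up in use needs its own entry in the Prometheus targets list.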
prometheus.yaml
global:
  scrape_interval: 5s
  evaluation_interval: 5s
scrape_configs:
  - job_name: 'flink'
    static_configs:
      - targets: ['host.docker.internal:9250', 'host.docker.internal:9251']
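To find out which endpoint actually exposes the counter, it can help to fetch each reporter's metrics page directly and filter it, before looking in Prometheus. The following is a rough sketch using only the Java 11 standard library. The class `MetricFinder` and its methods are hypothetical helpers, and the URLs in the usage note are assumptions based on the port range above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.stream.Collectors;

final class MetricFinder {

    // Returns the sample lines (skipping "# HELP" / "# TYPE" comment lines)
    // whose metric name contains the given text.
    static List<String> find(String scrapeBody, String needle) {
        return scrapeBody.lines()
                .map(String::trim)
                .filter(line -> !line.isEmpty() && !line.startsWith("#"))
                .filter(line -> metricName(line).contains(needle))
                .collect(Collectors.toList());
    }

    // The metric name is everything before the first '{' (labels) or space (value).
    static String metricName(String line) {
        int end = line.length();
        int brace = line.indexOf('{');
        int space = line.indexOf(' ');
        if (brace >= 0) end = Math.min(end, brace);
        if (space >= 0) end = Math.min(end, space);
        return line.substring(0, end);
    }

    // Fetches the plain-text metrics page served by one reporter endpoint.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        // Pass one or more endpoint URLs as arguments.
        for (String url : args) {
            List<String> hits = find(fetch(url), "myCounter");
            System.out.println(url + " -> " + (hits.isEmpty() ? "no myCounter" : hits));
        }
    }
}
```

For example, running it with `http://localhost:9250/ http://localhost:9251/` as arguments should show which of the two processes reports the counter. If only the job manager's endpoint is being scraped, the counter will most likely be missing there, since the map function runs on a task manager.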
Upvotes: 1