user3611168

Reputation: 385

Kafka monitoring in cluster environment

I have a Kafka cluster (3 machines, each running 1 ZooKeeper node and 1 broker). I am using kafka_exporter to monitor the consumer lag metric, and it works fine in the normal case. But when I kill 1 broker, Prometheus cannot get metrics from http://machine1:9308/metric (the kafka_exporter metrics endpoint): the request takes a long time (about 1.5 minutes) to return data, so it times out. If I then restart kafka_exporter, I see errors like this:

Cannot get leader of topic __consumer_offsets partition 20: kafka server: In the middle of a leadership election, there is currently no leader for this partition and hence it is unavailable for writes

When I run the command kafka-topics.bat --describe --zookeeper machine1:2181,machine2:2181,machine3:2181 --topic __consumer_offsets, the result is:

Topic:__consumer_offsets        PartitionCount:50       ReplicationFactor:1     Configs:compression.type=producer,cleanup.policy=compact,segment.bytes=104857600
Topic: __consumer_offsets       Partition: 0    Leader: -1      Replicas: 1     Isr: 1
Topic: __consumer_offsets       Partition: 1    Leader: 2       Replicas: 2     Isr: 2

Topic: __consumer_offsets       Partition: 49   Leader: 2       Replicas: 2     Isr: 2

Is this a configuration error, and how can I get the consumer lag in this case? Is "Leader: -1" an error? If I shut down machine 1 permanently, will everything still work?

Upvotes: 2

Answers (1)

b1tchacked

Reputation: 498

A leader of -1 means that the partition currently has no leader: no live broker in the cluster holds a copy of that partition's data.

The problem in your case is that the replication factor of the __consumer_offsets topic is 1, which means that each partition's data is hosted on exactly one broker. If you lose a broker, every partition hosted on it becomes unavailable, so part of the topic can no longer be read or written. That is why kafka_exporter fails to read from this topic.
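To make the effect concrete, here is a minimal Python sketch that models which partitions lose their leader when a broker goes down. It is only an illustration: the round-robin layout, the broker IDs 1 to 3 and the helper name leaderless_partitions are assumptions, not something read from your cluster.

```python
def leaderless_partitions(assignment, dead_brokers):
    """Return the partitions whose every replica sits on a dead broker.

    assignment maps partition number -> list of broker IDs holding a replica.
    """
    return sorted(
        partition
        for partition, replicas in assignment.items()
        if all(broker in dead_brokers for broker in replicas)
    )


# Assumed layout: 50 partitions spread round-robin over brokers 1, 2, 3.
brokers = [1, 2, 3]

# Replication factor 1: each partition lives on exactly one broker.
rf1 = {p: [brokers[p % 3]] for p in range(50)}

# Replication factor 3: each partition has a replica on every broker.
rf3 = {p: [brokers[(p + i) % 3] for i in range(3)] for p in range(50)}

# Simulate losing broker 1 under both layouts.
print(len(leaderless_partitions(rf1, {1})))  # partitions with no leader at RF 1
print(len(leaderless_partitions(rf3, {1})))  # partitions with no leader at RF 3
```

With replication factor 1, every partition that lived on the dead broker is left without a leader. With replication factor 3, a surviving replica can take over for each of them.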

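The fix, if you want to keep exporting consumer offsets after a broker loss, is to reconfigure the __consumer_offsets topic to have a replication factor greater than 1. For a topic that already exists, this means reassigning its partitions to more replicas; the broker setting offsets.topic.replication.factor only applies when the topic is first created.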

Advised config: replication factor 3, min.insync.replicas 2.
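One way to raise the replication factor of an existing topic is a partition reassignment. The sketch below generates the JSON plan that the kafka-reassign-partitions tool accepts. It assumes your broker IDs are 1, 2 and 3 and that the topic has 50 partitions, as in your describe output; build_reassignment is a hypothetical helper name, so adjust the values to your cluster.

```python
import json


def build_reassignment(topic, partition_count, broker_ids):
    """Build a reassignment plan that puts every partition on all given brokers.

    The replica order is rotated per partition so that the preferred leader
    (the first replica in each list) is spread evenly across the brokers.
    """
    n = len(broker_ids)
    partitions = []
    for p in range(partition_count):
        replicas = [broker_ids[(p + i) % n] for i in range(n)]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


# Assumed values: broker IDs 1, 2, 3 and 50 partitions.
plan = build_reassignment("__consumer_offsets", 50, [1, 2, 3])

# Print the plan; redirect the output to a file to use it with the tool.
print(json.dumps(plan, indent=2))
```

Save the output to a file and pass it to kafka-reassign-partitions with --reassignment-json-file and --execute, using the same --zookeeper connection string as in your describe command (newer Kafka versions take --bootstrap-server instead). Run it while all three brokers are up, since the new replicas have to copy the data from the current ones.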

Upvotes: 1
