eddyP23

Reputation: 6863

Kafka - why new topic partition leader is not elected?

I have a Kafka cluster of 3 nodes. When Node #3 dies, my _schemas topic stops functioning properly and I see this:

kafka-topics --zookeeper localhost:2181 --topic _schemas --describe
Topic:_schemas  PartitionCount:1        ReplicationFactor:2     Configs:cleanup.policy=compact
    Topic: _schemas Partition: 0    Leader: -1      Replicas: 3,2   Isr: 2

So it seems that Node #3 is dead, and that is what Leader: -1 refers to. But why doesn't Kafka just continue working as usual, assigning Node #2 as the new leader and replicating the data to #1 so that we have 2 in-sync replicas?

The error I see in the Kafka logs:

kafka.common.NotAssignedReplicaException:
Leader 3 failed to record follower 2's position -1 since the replica is not 
recognized to be one of the assigned replicas 3 for partition <loop over many partitions>

Upvotes: 12

Views: 17438

Answers (2)

linehrr

Reputation: 1748

I solved this problem by restarting the controller broker. Every Kafka cluster has one broker elected as the controller, which coordinates leader election. In our case the controller was stuck. To find out which broker is the controller, open zkCli.sh against the ZooKeeper ensemble your Kafka cluster uses and run get /controller; you will see the brokerId there. I fixed this easily by restarting the controller broker, good luck.
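A sketch of that lookup, assuming ZooKeeper runs on localhost:2181 as in the question (the brokerid and timestamp shown are illustrative, and older ZooKeeper versions print extra stat lines after the JSON):

./zkCli.sh -server localhost:2181
[zk: localhost:2181(CONNECTED) 0] get /controller
{"version":1,"brokerid":2,"timestamp":"1506423376977"}

Here brokerid 2 is the current controller, so that is the broker to restart.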

Upvotes: 4

Gal Shaboodi

Reputation: 764

If you have a cluster of 3 Kafka brokers and only 1 partition for your topic, then there is only one leader, and you are producing data and working only against that one broker.

If you want Kafka to:

continue working as usual, assigning Node #2 as the new leader

You should create your topic with 3 partitions; each broker will then be the leader of a different partition, and if one of the brokers fails you will still be able to write to the other partitions.
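For instance, a minimal sketch of such a create command, assuming the same ZooKeeper connection string as above (the topic name my-topic is hypothetical, and the replication factor of 1 matches the example layout below):

./kafka-topics.sh --zookeeper localhost:2181 --create --topic my-topic --partitions 3 --replication-factor 1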

See an example of running ./kafka-topics.sh --zookeeper localhost:2181 --topic _schemas --describe:

Topic:_schemas  PartitionCount:3        ReplicationFactor:1     Configs:retention.ms=14400000
    Topic: _schemas Partition: 0    Leader: 2       Replicas: 2     Isr: 2
    Topic: _schemas Partition: 1    Leader: 0       Replicas: 0     Isr: 0
    Topic: _schemas Partition: 2    Leader: 1       Replicas: 1     Isr: 1

In this example you can see that _schemas has 3 partitions, meaning all 3 brokers are leaders of that topic, each for a different partition: broker 2 is the leader of partition 0, broker 0 the leader of partition 1, and broker 1 the leader of partition 2. With a replication factor of 1 each partition has no followers, so it lives on its leader only.

Upvotes: -1
