Reputation: 123
I have the following Kafka Setup
Number of producer : 1
Number of topics : 1
Number of partitions : 2
Number of consumers : 3 (with same group id)
Number of Kafka clusters : none (single Kafka server)
Zookeeper.session.timeout : 1000
Consumer Type : High Level Consumer
The producer sends messages without any specific partitioning logic (it uses the default partitioner). Consumer 1 consumes messages continuously. When I abruptly kill consumer 1, I expect consumer 2 or consumer 3 to take over consuming the messages.
In some cases a rebalance occurs and consumer 2 starts consuming messages, which is perfectly fine. In other cases, however, neither consumer 2 nor consumer 3 consumes anything. I then have to manually kill all the consumers and start all three again. Only after this restart does consumer 1 start consuming again.
In short, the rebalance succeeds in some cases and fails in others. Is there any configuration that I am missing?
Upvotes: 1
Views: 1798
Reputation: 2938
Kafka uses Zookeeper to coordinate high level consumers.
From http://kafka.apache.org/documentation.html :
Partition Owner registry
Each broker partition is consumed by a single consumer within a given consumer group. The consumer must establish its ownership of a given partition before any consumption can begin. To establish its ownership, a consumer writes its own id in an ephemeral node under the particular broker partition it is claiming.
/consumers/[group_id]/owners/[topic]/[broker_id-partition_id] --> consumer_node_id (ephemeral node)
There is a known quirk with ephemeral nodes: they can linger for up to 30 seconds after a ZK client suddenly goes down. See http://developers.blog.box.com/2012/04/10/a-gotcha-when-using-zookeeper-ephemeral-nodes/
So you may be running into this if you expect consumers 2 and 3 to start reading messages immediately after consumer 1 is terminated.
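To make the timing argument concrete, here is a rough sketch (not Kafka code) of the arithmetic involved. It assumes the high-level consumer settings `rebalance.max.retries` and `rebalance.backoff.ms` with values of 4 and 2000 ms, which I believe are the 0.8 defaults but which you should check against your version, and it simplifies by treating each retry as waiting exactly one backoff period. The function names are made up for illustration.

```python
def rebalance_retry_window_ms(max_retries: int, backoff_ms: int) -> int:
    """Approximate total time a consumer keeps retrying a rebalance.

    Simplification: every retry is assumed to wait exactly one backoff period.
    """
    return max_retries * backoff_ms


def outlasts_stale_owner(max_retries: int, backoff_ms: int, linger_ms: int) -> bool:
    """True if the retry window is longer than the time a dead consumer's
    ephemeral owner node may linger in ZooKeeper."""
    return rebalance_retry_window_ms(max_retries, backoff_ms) > linger_ms


# Assumed values: 4 retries x 2000 ms backoff, node lingering up to 30 s.
window = rebalance_retry_window_ms(4, 2000)
print(window, outlasts_stale_owner(4, 2000, 30_000))
```

If the window is shorter than the time the stale owner node survives, the remaining consumers can exhaust their retries before the partition becomes claimable, which would match the "sometimes it works, sometimes it does not" behaviour described in the question.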
You can also check that /consumers/[group_id]/owners/[topic]/[broker_id-partition_id] contains the correct consumer id after rebalancing.
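One way to do that check from a script is sketched below. It assumes the third-party `kazoo` ZooKeeper client, a ZooKeeper server on `localhost:2181`, and the path layout quoted above (the exact layout of the last path segment may differ between Kafka versions). The helper names `owner_path`, `read_owner` and `print_owners`, as well as the group and topic names in the usage note, are hypothetical.

```python
def owner_path(group_id, topic, broker_id, partition_id):
    """Build the owner znode path using the layout quoted from the Kafka docs."""
    return f"/consumers/{group_id}/owners/{topic}/{broker_id}-{partition_id}"


def read_owner(zk, path):
    """Return the consumer id stored at `path`, or None if nobody owns it.

    `zk` is any client exposing kazoo-style exists()/get() methods.
    Note: the node can vanish between the two calls; a real tool should
    also handle that case.
    """
    if zk.exists(path) is None:
        return None
    data, _stat = zk.get(path)
    return data.decode("utf-8")


def print_owners(hosts, group_id, topic, broker_id, partitions):
    """Print the current owner of each partition (needs kazoo and a live ZK)."""
    from kazoo.client import KazooClient  # imported lazily; assumed installed

    zk = KazooClient(hosts=hosts)
    zk.start()
    try:
        for partition in partitions:
            path = owner_path(group_id, topic, broker_id, partition)
            print(path, "->", read_owner(zk, path))
    finally:
        zk.stop()
```

For the setup in the question, something like `print_owners("localhost:2181", "my-group", "my-topic", 0, [0, 1])` run shortly after killing consumer 1 should show whether its id is still registered as the owner of a partition.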
Upvotes: 2