Reputation: 3670
Earlier, Kafka used to store consumer offsets in ZooKeeper, but since Kafka 0.9 (with the new consumer API) it stores consumer offsets in an internal topic.
As stated in this post -
Kafka brokers use an internal topic named __consumer_offsets that keeps track of what messages a given consumer group last successfully processed. As we know, each message in a Kafka topic has a partition ID and an offset ID attached to it.
But a topic is not like a DB table, which can be queried for data based on some input. So I am wondering how this is efficient at all, and how exactly Kafka retrieves the offsets for a particular partition for a particular consumer group.
Upvotes: 0
Views: 546
Reputation: 191725
Kafka Streams or an in-memory hashtable can make a compacted topic behave very much like a key-value store.
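To illustrate the idea, here is a minimal sketch of how replaying a compacted topic from the beginning yields a hashtable. It is plain Python with no Kafka client involved; the `materialize` function and the sample records are made up for the example, and a value of `None` stands in for a tombstone (a delete marker).

```python
from typing import Dict, Iterable, Optional, Tuple

# A record is (key, value); value None plays the role of a tombstone.
Record = Tuple[str, Optional[str]]


def materialize(log: Iterable[Record]) -> Dict[str, str]:
    """Replay a keyed log in order; the last value per key wins."""
    table: Dict[str, str] = {}
    for key, value in log:
        if value is None:
            table.pop(key, None)  # tombstone: forget the key
        else:
            table[key] = value    # newer record overwrites the older one
    return table


log = [("a", "1"), ("b", "2"), ("a", "3"), ("b", None)]
table = materialize(log)
print(table)
```

Once the table is built, a lookup is an ordinary dictionary access, and new records appended to the topic are applied to the table the same way.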
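The consumer offsets topic is a compacted topic. Each record is keyed by group name, topic, and partition, and the group name alone decides which partition of __consumer_offsets the record is written to. When you give a group.id in the client, the broker acting as Group Coordinator for that group (the leader of that __consumer_offsets partition) keeps the partition's contents in an in-memory cache, so it can easily look up the group by name and return the currently committed offsets for the group's partitions. Then the consumer looks up the offsets for its assigned partitions from the returned map.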
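A rough sketch of that lookup, in plain Python rather than the broker's actual code. It rests on two assumptions from my understanding of the broker: the partition of __consumer_offsets for a group is chosen as the absolute value of the Java string hash of group.id modulo the partition count (50 by default), and each commit is stored under a (group, topic, partition) key. The `OffsetCache` class and its method names are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def java_string_hash(s: str) -> int:
    """Mimic Java's String.hashCode() (assumes characters in the BMP)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - (1 << 32) if h >= (1 << 31) else h  # to signed 32-bit


def coordinator_partition(group_id: str, num_partitions: int = 50) -> int:
    """Assumed rule: abs(hash(group.id)) % partition count."""
    h = java_string_hash(group_id)
    h = 0 if h == -(1 << 31) else abs(h)  # Integer.MIN_VALUE has no abs
    return h % num_partitions


class OffsetCache:
    """Hypothetical in-memory view of one __consumer_offsets partition."""

    def __init__(self) -> None:
        self._offsets: Dict[Tuple[str, str, int], int] = {}

    def apply(self, group: str, topic: str, partition: int, offset: int) -> None:
        # Replaying or appending a commit record: last write wins.
        self._offsets[(group, topic, partition)] = offset

    def fetch(self, group: str, assigned: Iterable[Tuple[str, int]]) -> Dict[Tuple[str, int], int]:
        # Return committed offsets only for the partitions asked about.
        return {
            (topic, partition): self._offsets[(group, topic, partition)]
            for topic, partition in assigned
            if (group, topic, partition) in self._offsets
        }


cache = OffsetCache()
cache.apply("billing", "orders", 0, 41)
cache.apply("billing", "orders", 1, 17)
cache.apply("billing", "orders", 0, 42)  # newer commit replaces 41
print(coordinator_partition("billing"), cache.fetch("billing", [("orders", 0)]))
```

The point is that nothing ever "queries" the topic: the group name maps deterministically to one partition, and the broker leading that partition answers from a hashtable it built by reading the log.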
It's not a question of "better". Removing the dependency on ZooKeeper was always the goal, and running Kafka without ZooKeeper (KRaft mode) has been production ready since Kafka 3.3.1.
Upvotes: 1