charany1

Reputation: 911

Kafka: How does Kafka store and retrieve offsets for each consumer group?

I'm not looking for an API to accomplish this, but rather for the internal implementation details.

I know that recent versions of Kafka store offsets for consumer groups in a special Kafka topic, __consumer_offsets.

My questions are:

What exactly is the data structure residing in this topic?

When a consumer group dies and comes back up, how does Kafka look up the offset in each topic-partition up to which that consumer group had last consumed?

As far as I understand, Kafka topics are not suited for looking up data, for example with a query like:

Select *offset* from __consumer_offsets where consumer-group-name=*consumer-group* and topic=*topic-1*

Basically, I want to know the internal details of __consumer_offsets, or of anything else used for consumer offset management.

I read this wiki page, https://cwiki.apache.org/confluence/display/KAFKA/Offset+Management, but couldn't understand the part about the in-memory data structure.

Upvotes: 3

Views: 871

Answers (1)

hoodakaushal

Reputation: 1293

Every consumer group is assigned a particular partition of the __consumer_offsets topic based on the hash of its group id.
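As a rough sketch of that mapping: the broker code is Java/Scala, and to my knowledge it takes the non-negative Java `hashCode` of the group id modulo the partition count of the offsets topic (`offsets.topic.num.partitions`, 50 by default). The function names below are made up for illustration, and the exact hashing is an assumption rather than a quote of the Kafka source.

```python
def java_string_hash(s: str) -> int:
    """Emulate Java's String.hashCode(): a signed 32-bit polynomial hash
    over the UTF-16 code units of the string."""
    data = s.encode("utf-16-be")  # explicit endianness, so no BOM is emitted
    h = 0
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF  # wrap around like a Java int
    # Reinterpret the unsigned 32-bit value as a signed int.
    return h - (1 << 32) if h >= (1 << 31) else h


def partition_for(group_id: str, num_partitions: int = 50) -> int:
    """Hypothetical helper: pick the __consumer_offsets partition for a group.

    Assumes the broker uses abs(hashCode) % num_partitions, with
    Integer.MIN_VALUE mapped to 0 because it has no positive counterpart.
    """
    h = java_string_hash(group_id)
    non_negative = 0 if h == -(1 << 31) else abs(h)
    return non_negative % num_partitions
```

Because the result depends only on the group id, every commit and every fetch for a given group lands on the same partition, and therefore on the same leader broker.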

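Then, offsets are simply written as messages to the __consumer_offsets topic. Each message is keyed by consumer group, topic, and partition, and its value carries the committed offset.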

To keep this topic from growing too large, it is log-compacted: older offset messages for a given consumer group and topic-partition are periodically deleted, so only the most recent commit is retained.

For reads, the Kafka broker that leads a given __consumer_offsets partition loads its data into memory (at startup, or whenever it takes over leadership of that partition), so that a request for an offset doesn't cause disk I/O. Since only the latest offset per group and topic-partition is needed, in normal operation this doesn't amount to much data to keep in memory.
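To make the in-memory part concrete, here is a minimal sketch of replaying an offsets partition into a lookup table. It assumes each record is keyed by (group, topic, partition) and that a record with no value acts as a delete marker. The real broker uses a binary format and also keeps metadata and timestamps, so the types and names here are purely illustrative.

```python
from typing import Dict, Iterable, Optional, Tuple

OffsetKey = Tuple[str, str, int]          # (group, topic, partition)
Record = Tuple[OffsetKey, Optional[int]]  # a value of None marks a deletion


def load_offsets(log: Iterable[Record]) -> Dict[OffsetKey, int]:
    """Replay the log in order; the last record for each key wins."""
    cache: Dict[OffsetKey, int] = {}
    for key, offset in log:
        if offset is None:
            cache.pop(key, None)   # delete marker: forget this key
        else:
            cache[key] = offset    # a newer commit overwrites the older one
    return cache


# Hypothetical contents of one offsets partition, oldest record first.
log = [
    (("group-a", "topic-1", 0), 10),
    (("group-a", "topic-1", 1), 7),
    (("group-a", "topic-1", 0), 25),    # supersedes the earlier commit of 10
    (("group-b", "topic-1", 0), 3),
    (("group-b", "topic-1", 0), None),  # group-b's offset was removed
]
cache = load_offsets(log)
```

The "Select offset where ..." query from the question then becomes a plain dictionary lookup such as `cache[("group-a", "topic-1", 0)]`, which is why the topic itself never has to be searched at fetch time.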

Upvotes: 3
