Aarkan

Reputation: 4109

Managing current offset in kafka

I could not find any information in the documentation about how Kafka manages the current offset for a consumer. I guess the consumer by default keeps the last offset read in memory and commits it either on an explicit call to commitSync or commitAsync, or according to the enable.auto.commit policy. Is this correct, or am I missing something? If someone can point to documentation or a reference on this aspect of offset management, it would be highly appreciated.

Thanks in advance.

Upvotes: 1

Views: 792

Answers (1)

Giorgos Myrianthous

Reputation: 39950

A consumer group is a set of consumers that are coordinated to consume messages from one or more topics. One of your Kafka brokers acts as the group coordinator, which is responsible for coordinating all consumers belonging to that group.

Depending on the configuration of enable.auto.commit and the way you're handling offset management in your code, the offsets will be committed and stored in an internal topic called __consumer_offsets.


If enable.auto.commit is set to true, the consumer's offsets are periodically committed in the background. On the other hand, commitSync() and commitAsync() are blocking and non-blocking calls, respectively, that allow you to commit offsets manually. If you use commitSync() or commitAsync(), it is recommended to set enable.auto.commit to false.
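As a rough illustration of the manual-commit setup described above, here is a minimal sketch of the consumer settings built with plain java.util.Properties. The broker address, the group id, and the class name ManualCommitConfig are placeholders chosen for this example, not values from the question.

```java
import java.util.Properties;

// Hypothetical helper: builds consumer settings for manual offset commits.
class ManualCommitConfig {
    static Properties build() {
        Properties props = new Properties();
        // Placeholder broker address and group id; replace with your own.
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "example-group");
        // Turn off background commits so offsets move only on commitSync()/commitAsync().
        props.setProperty("enable.auto.commit", "false");
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // Print the settings; iteration order of Properties is not guaranteed.
        build().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

With the Kafka client library on the classpath, these properties would typically be passed to the KafkaConsumer constructor, and commitSync() would be called after each batch returned by poll() has been processed.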

In the rare case where enable.auto.commit is set to true and you also call commitSync() or commitAsync(), offsets will be committed in both situations:

  • Every time you call commitSync() or commitAsync()
  • Every N ms where N is a configurable parameter (auto.commit.interval.ms)

When enable.auto.commit is set to true, the consumer commits the offsets returned by the previous poll roughly every auto.commit.interval.ms. However, this happens only as part of poll(): on each call, the consumer checks whether the auto-commit interval has elapsed and, if so, commits the offsets it returned in the last poll. Note that max.poll.interval.ms is a separate setting; it bounds the maximum time allowed between successive poll() calls before the consumer is considered failed.
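To make the interplay between the in-memory position and the committed offset concrete, here is a toy model in plain Java. It is a simplified sketch, not Kafka's actual client code: ToyConsumer, its fields, and the injected clock are all invented for illustration, and a single counter stands in for one partition.

```java
import java.util.function.LongSupplier;

// Toy model of a consumer's offset bookkeeping; NOT Kafka's real implementation.
class ToyConsumer {
    private final LongSupplier clock;        // injected time source, in ms
    private final long autoCommitIntervalMs; // stands in for auto.commit.interval.ms
    private final boolean autoCommit;        // stands in for enable.auto.commit
    private long position = 0;               // next offset to fetch, kept in memory
    private long committed = -1;             // last committed offset, -1 means none yet
    private long nextCommitAt;               // earliest time of the next auto-commit

    ToyConsumer(boolean autoCommit, long autoCommitIntervalMs, LongSupplier clock) {
        this.autoCommit = autoCommit;
        this.autoCommitIntervalMs = autoCommitIntervalMs;
        this.clock = clock;
        this.nextCommitAt = clock.getAsLong() + autoCommitIntervalMs;
    }

    // Returns the first offset of the new batch and advances the in-memory position.
    long poll(int batchSize) {
        // Auto-commit piggybacks on poll(): if the interval has elapsed,
        // commit the position reached by the previous poll before fetching more.
        if (autoCommit && clock.getAsLong() >= nextCommitAt) {
            commitSync();
            nextCommitAt = clock.getAsLong() + autoCommitIntervalMs;
        }
        long first = position;
        position += batchSize;
        return first;
    }

    // Manual commit: record the current position as the committed offset.
    void commitSync() { committed = position; }

    long position() { return position; }
    long committed() { return committed; }

    public static void main(String[] args) {
        long[] now = {0};
        ToyConsumer c = new ToyConsumer(true, 5000, () -> now[0]);
        c.poll(10);      // interval not elapsed, nothing committed
        now[0] = 6000;   // time passes, but no commit happens without a poll
        c.poll(10);      // interval elapsed, offsets from the previous poll are committed
        System.out.println("position=" + c.position() + " committed=" + c.committed());
    }
}
```

In this model, time passing alone never moves the committed offset; only a later poll() (or an explicit commitSync()) does, which mirrors the behaviour described above.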

For more details, you can refer to Confluent's documentation for Offset Management.

Upvotes: 2
