user3877158
user3877158

Reputation: 11

Apache Kafka Cleanup while consuming messages

Playing around with Apache Kafka and its retention mechanism I'm thinking about following situation:

As you can see the consumer lost the offsets 6-10.

Question, is such a situation possible at all? With other words, will the cleaner execute while there is an active consumer? If yes, is the consumer able to somehow recognize that gap?

Upvotes: 0

Views: 799

Answers (2)

Vanitha Kumar
Vanitha Kumar

Reputation: 196

  • Question, is such a situation possible at all? will the cleaner execute while there is an active consumer Yes, if the messages have crossed TTL (Time to live) period before they are consumed, this situation is possible.
  • Is the consumer able to somehow recognize that gap? In case where you suspect your configuration (high consumer lag, low TTL) might lead to this, the consumer should track offsets. kafka-consumer-groups.sh command gives you the information position of all consumers in a consumer group as well as how far behind the end of the log they are.

Upvotes: 0

Mickael Maison
Mickael Maison

Reputation: 26950

Yes such a scenario can happen. The exact steps will be a bit different:

  • Consumer fetches message 1-5
  • Messages 1-10 are deleted
  • Consumer tries to fetch message 6 but this offset is out of range
  • Consumer uses its offset reset policy auto.offset.reset to find a new valid offset.
    • If set to latest, the consumer moves to the end of the partition
    • If set to earliest the consumer moves to offset 11
    • If none or unset, the consumer throws an exception

To avoid such scenarios, you should monitor the lead of your consumer group. It's similar to the lag, but the lead indicates how far from the start of the partition the consumer is. Being near the start has the risk of messages being deleted before they are consumed.

If consumers are near the limits, you can dynamically add more consumers or increase the topic retention size/time if needed.

Setting auto.offset.reset to none will throw an exception if this happens, the other values only log it.

Upvotes: 1

Related Questions