LetsPlayYahtzee

Reputation: 7581

Kafka reset tool Consumer offset not resetting to zero

I am trying to understand some fundamental Kafka concepts so that I can properly monitor the progress of my KafkaStreams based application.

Specifically for debugging purposes I need to be able to have my application re-consume a whole topic. For that I used the reset tool.

After running the script, I looked at some of the input topics in Kafka Manager and saw that the Consumer Offset had decreased and the Lag had increased (which makes sense). However, the Consumer Offset does not go to zero. I am trying to interpret this, but I haven't found a concrete explanation of what Consumer Offset and Logsize refer to in Kafka Manager.

To make this fit what I see, I assume that Logsize is the total number of messages written to the topic since its beginning, not necessarily the number of messages currently in the topic, since some may have been discarded once their age exceeded the retention period. Am I right?

If not, what explains the fact that, after running the reset tool, for some input topics the Consumer Offset is equal to the Logsize (and not zero) and the Lag is zero?

Upvotes: 1

Views: 2976

Answers (1)

Matthias J. Sax

Reputation: 62350

I am not familiar with yahoo-kafka-manager; however, you can also use bin/kafka-consumer-groups.sh (a tool shipped with Kafka itself). There, LOG-END-OFFSET means what you describe. From the name alone, it's unclear to me whether Logsize is the same as "log end offset" or the difference between the highest and lowest offset in a partition.

After executing the script looking into the Kafka Manager for some inputs topics I see that the Consumer Offset has decreased and the Lag has increased.

This makes sense: since "lag" is the difference between "log end offset" and "committed offset", the lag should increase after you reset your application. However, I am not sure why the committed consumer group offset is not zero. Can you verify what you observe using bin/kafka-consumer-groups.sh? Maybe yahoo-kafka-manager reports something different.
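To make the lag arithmetic concrete, here is a minimal Python sketch. The function name and the offsets (1000 and 400) are made up for illustration and are not taken from your cluster; it only restates the definition above.

```python
def consumer_lag(log_end_offset: int, committed_offset: int) -> int:
    """Lag as defined above: log end offset minus committed offset."""
    return log_end_offset - committed_offset


# Hypothetical partition whose log end offset is 1000.
lag_before_reset = consumer_lag(log_end_offset=1000, committed_offset=1000)

# After a reset the committed offset moves back (here to 400), so the lag grows.
lag_after_reset = consumer_lag(log_end_offset=1000, committed_offset=400)

print(lag_before_reset, lag_after_reset)
```

With these numbers the lag goes from 0 to 600, which is the "offset decreased, lag increased" pattern you describe.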

Update: the tool does not set the offset to zero but to the "beginning of the log". (The docs are not correct.)
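Why "beginning of log" need not be zero can be illustrated with a toy model of a single partition. This is only a sketch under simplifying assumptions (no compaction, one partition); the class and its numbers are hypothetical and not part of any Kafka API.

```python
class ToyPartitionLog:
    """Toy model of one partition: offsets only grow, old records can be dropped."""

    def __init__(self) -> None:
        self.log_start_offset = 0  # oldest offset still retained
        self.log_end_offset = 0    # offset the next appended record would get

    def append(self, count: int) -> None:
        # Producing records advances the end offset only.
        self.log_end_offset += count

    def apply_retention(self, new_start: int) -> None:
        # Retention drops records from the head; the end offset is unaffected.
        clamped = min(new_start, self.log_end_offset)
        self.log_start_offset = max(self.log_start_offset, clamped)

    def retained_records(self) -> int:
        return self.log_end_offset - self.log_start_offset


log = ToyPartitionLog()
log.append(1000)          # 1000 records written over the topic's lifetime
log.apply_retention(400)  # the first 400 have aged out

# A reset to "beginning of log" lands on the oldest retained offset, not on 0.
reset_offset = log.log_start_offset
print(reset_offset, log.log_end_offset, log.retained_records())
```

In this model the end offset stays at 1000 although only 600 records remain, which matches your reading that the end offset counts everything ever written rather than what is currently in the topic.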

Also note that the auto.offset.reset strategy might kick in after you reset your application and restart it (a [committed] offset of zero might not be valid if the log got truncated). Could this explain the behavior you observe?
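As a rough illustration of that last point, the following Python sketch models how a consumer might pick its starting position when the committed offset is missing or falls outside the retained range. It is a simplified model, not the actual client code: the function name and the example offsets are assumptions, and only the policy values "earliest", "latest" and "none" come from Kafka's auto.offset.reset setting.

```python
from typing import Optional


def resolve_start_position(
    committed: Optional[int],
    log_start: int,
    log_end: int,
    auto_offset_reset: str,
) -> int:
    """Simplified model: use the committed offset if valid, else apply the policy."""
    if committed is not None and log_start <= committed <= log_end:
        return committed
    if auto_offset_reset == "earliest":
        return log_start
    if auto_offset_reset == "latest":
        return log_end
    # Models the "none" policy: no automatic fallback, the caller must handle it.
    raise ValueError("no valid committed offset and no reset policy")


# Hypothetical partition that retains offsets 400..1000; committed offset 0 is stale.
with_earliest = resolve_start_position(0, 400, 1000, "earliest")
with_latest = resolve_start_position(0, 400, 1000, "latest")
print(with_earliest, with_latest)
```

Under these assumptions, a policy of "latest" would put the consumer at the log end, so the offset would equal the log end offset and the lag would be zero. Whether that is what happens in your setup depends on the policy your application actually uses.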

This blog post might also help to understand further details: https://www.confluent.io/blog/data-reprocessing-with-kafka-streams-resetting-a-streams-application/

Upvotes: 2
