Nandish Kotadia

Reputation: 461

Kafka Streams reprocessing old messages on rebalancing

I have a Kafka Streams application which reads data from a few topics, joins the data and writes it to another topic.

This is the configuration of my Kafka cluster:

5 Kafka brokers
Kafka topics - 15 partitions and replication factor 3. 

My Kafka Streams applications are running on the same machines as my Kafka brokers.

A few million records are consumed and produced per hour. Whenever I take a broker down, the application goes into a rebalancing state, and after rebalancing several times it starts consuming very old messages.

Note: While the Kafka Streams application was running fine, its consumer lag was almost 0. After rebalancing, the lag jumped from 0 to 10 million.

Could this be caused by offsets.retention.minutes?

This is the log and offset retention policy configuration of my Kafka broker:

log retention policy: 3 days
offsets.retention.minutes: 1 day
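
To make the concern concrete, here is a minimal sketch that compares the two retention windows listed above. It rests on an assumption that is not confirmed in this question: that a consumer group whose committed offsets have been discarded restarts from the earliest retained record. Whether offsets actually expire while the application is running depends on the broker version and on commit activity. The class and method names are hypothetical and exist only for illustration.

```java
import java.time.Duration;

public class RetentionCheck {
    // Values taken from the configuration listed in the question.
    static final Duration LOG_RETENTION = Duration.ofDays(3);
    static final Duration OFFSET_RETENTION = Duration.ofDays(1);

    /**
     * Returns true when committed offsets can be discarded while the
     * records they point to are still retained in the log.
     */
    static boolean offsetsExpireBeforeLogs(Duration offsetRetention, Duration logRetention) {
        return offsetRetention.compareTo(logRetention) < 0;
    }

    /**
     * Upper bound on how much retained data could be reread, ASSUMING the
     * consumer restarts from the earliest offset once its committed
     * offsets are gone. This is the log retention window itself.
     */
    static Duration maxReplayWindow(Duration logRetention) {
        return logRetention;
    }

    public static void main(String[] args) {
        System.out.println("Offsets can expire before logs: "
                + offsetsExpireBeforeLogs(OFFSET_RETENTION, LOG_RETENTION));
        System.out.println("Worst-case replay window (hours): "
                + maxReplayWindow(LOG_RETENTION).toHours());
    }
}
```

With the values above, the offset retention is shorter than the log retention, so under the stated assumption up to three days of records would still be available for reprocessing.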

I read at the link below that this could be the cause:

Offset Retention Minutes reference

Any help with this would be appreciated.

