Rednam Nagendra

Reputation: 363

Kafka Stream program is reprocessing the already processed events

I forwarded a few events to Kafka and started my Kafka Streams program. My program processed the events and completed. After some time I stopped my Kafka Streams application and then started it again. I observed that my Kafka Streams program reprocessed the events it had already processed.

As per my understanding, Kafka Streams internally maintains the offsets for its input topics per application id. But here it is reprocessing the already processed events.

How can I verify up to which offset Kafka Streams has processed? How does Kafka Streams persist these bookmarks? On what basis, and from which offset, will Kafka Streams start reading events from Kafka?

If Kafka Streams throws an exception, does it reprocess already processed events?

Please help me understand this behavior.

Upvotes: 1

Views: 982

Answers (1)

Matthias J. Sax

Reputation: 62285

Kafka Streams internally uses a KafkaConsumer and all running instances form a consumer group using application.id as group.id. Offsets are committed to the Kafka cluster in regular intervals (configurable). Thus, on restart with the same application.id Kafka Streams should pick up the latest committed offset and continue processing from there.
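The offset-commit behavior described above is driven by two configuration values: `application.id` (which becomes the consumer `group.id`) and `commit.interval.ms` (how often processed offsets are committed; the default is 30000 ms, or 100 ms when exactly-once processing is enabled). A minimal sketch of such a configuration, using plain string keys and hypothetical application/broker names:

```java
import java.util.Properties;

public class StreamsOffsetConfig {

    // Builds a minimal Kafka Streams configuration. The application name
    // and broker address below are hypothetical placeholders.
    public static Properties buildConfig() {
        Properties props = new Properties();

        // application.id doubles as the consumer group.id; all instances
        // started with this id share one consumer group and its offsets.
        props.put("application.id", "my-streams-app");

        // Assumed broker address for illustration.
        props.put("bootstrap.servers", "localhost:9092");

        // How often Kafka Streams commits processed offsets back to the
        // cluster. On restart, processing resumes from the last commit,
        // so at most this interval's worth of events may be reprocessed.
        props.put("commit.interval.ms", "10000");

        return props;
    }

    public static void main(String[] args) {
        Properties p = buildConfig();
        System.out.println("group.id = " + p.getProperty("application.id"));
        System.out.println("commit interval = "
                + p.getProperty("commit.interval.ms") + " ms");
    }
}
```

Note that because commits happen at an interval (and at-least-once is the default guarantee), a crash or non-clean shutdown between commits can legitimately cause some events since the last commit to be reprocessed; only a changed or missing `application.id` would cause reprocessing from the beginning (subject to `auto.offset.reset`).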

You can check committed offsets for the group, as for any other consumer group, using the bin/kafka-consumer-groups.sh tool.
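For example, assuming a broker at `localhost:9092` and an `application.id` of `my-streams-app` (both placeholders), the invocation would look like this; it requires a running cluster, so no output is shown:

```shell
# Describe the consumer group whose name equals the Streams application.id.
# CURRENT-OFFSET in the output is the last committed offset per partition;
# LOG-END-OFFSET and LAG show how far behind the group is.
bin/kafka-consumer-groups.sh \
  --bootstrap-server localhost:9092 \
  --describe \
  --group my-streams-app
```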

Upvotes: 1
