Shabirmean

Reputation: 2571

Reset to custom offset in Kafka partition

I am researching Kafka for a specific use case I am working on. I have a continuous stream of data that I want to process and publish to intermediary stages.

At each of these stages (initial and intermediary), Samza tasks would do the processing and republishing. One of my requirements is to be able to re-trigger the whole processing pipeline from a specific point in time whenever I want.

I know that Kafka maintains an offset for each message in its logs (incoming data). However, does Kafka provide any functionality with which I can map partition offsets to some custom identifier (say, a timestamp) and use this to re-trigger the whole pipeline from that point onward?

I have read in multiple places that I can replay the Kafka commit log by resetting it to the beginning, or by going back by some N offsets. But is there a way for me to map these offsets to my own identifier, such as a timestamp, and use it as a mechanism to tell from which offset to replay?

Best
Shabir

Upvotes: 0

Views: 1713

Answers (1)

Natalia

Reputation: 4532

You can use the command-line tool kafka-consumer-groups to reset the offsets of a consumer group based on a timestamp (the --to-datetime option of --reset-offsets). See the documentation for more: https://kafka.apache.org/documentation/#basic_ops_consumer_group

The same can, of course, be achieved in code.
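To make the idea of a timestamp-to-offset mapping concrete, here is a minimal sketch in plain Java of the lookup a timestamp-based reset performs: given a target time, find the earliest offset whose timestamp is at or after it. The class TimestampOffsetIndex and its methods are hypothetical names used only for illustration; they are not part of Kafka, and the sketch models a single partition in memory. In the Java client, the comparable lookup is done against the broker with KafkaConsumer.offsetsForTimes, after which the consumer can be repositioned with seek.

```java
import java.util.OptionalLong;
import java.util.TreeMap;

// Hypothetical in-memory model of one partition's timestamp index.
// Not a Kafka class; it only illustrates the lookup rule.
final class TimestampOffsetIndex {
    // Record timestamp (epoch millis) -> earliest offset seen with that timestamp.
    private final TreeMap<Long, Long> earliestOffsetByTimestamp = new TreeMap<>();

    // Register a record's timestamp and offset as it is appended.
    void record(long timestampMs, long offset) {
        earliestOffsetByTimestamp.merge(timestampMs, offset, Long::min);
    }

    // Earliest offset whose timestamp is >= targetMs, or empty if there is none.
    // Scans all later timestamps so it stays correct even when timestamps
    // are not strictly increasing with the offset.
    OptionalLong offsetForTime(long targetMs) {
        return earliestOffsetByTimestamp
                .tailMap(targetMs, true)
                .values()
                .stream()
                .mapToLong(Long::longValue)
                .min();
    }
}
```

With such an index, replaying "from 12:00 onward" would mean calling offsetForTime with the epoch-millisecond value for 12:00 and restarting consumption at the returned offset. An empty result means no record is that recent, so there is nothing to replay.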

Upvotes: 2
