Reputation: 67
I want to process messages present in a Kafka topic using Kafka Streams.
The last step of the processing is to write the result to a database table. To avoid database contention issues (the program will run 24/7 and process millions of messages), I will be using batching for the JDBC calls.
But this opens up the possibility of losing messages: say I read 500 messages from the topic and Streams marks the offsets, and then the program fails. The messages still sitting in the JDBC batch are lost, yet their offsets are already marked as processed.
I want to manually mark the offset of the last message only once the database insert/update is complete, but that is not possible according to the following question: How to commit manually with Kafka Stream?
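Here is a simplified sketch of what I have in mind (broker address, connection string, topic, table, and column names are just placeholders):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.Properties;
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;

public class BatchingStreamsApp {
    public static void main(String[] args) throws Exception {
        // JDBC setup (connection string, table and column names are placeholders)
        Connection conn = DriverManager.getConnection("jdbc:postgresql://localhost/mydb", "user", "pass");
        conn.setAutoCommit(false);
        PreparedStatement ps = conn.prepareStatement("INSERT INTO results(k, v) VALUES (?, ?)");
        AtomicInteger buffered = new AtomicInteger();

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic", Consumed.with(Serdes.String(), Serdes.String()))
               .foreach((key, value) -> {
                   try {
                       ps.setString(1, key);
                       ps.setString(2, value);
                       ps.addBatch();
                       if (buffered.incrementAndGet() >= 500) { // flush every 500 messages
                           ps.executeBatch();
                           conn.commit();
                           buffered.set(0);
                       }
                   } catch (Exception e) {
                       throw new RuntimeException(e);
                   }
               });
        // Problem: Streams commits offsets on its own schedule, so a crash before
        // executeBatch() loses the buffered rows even though their offsets are committed.

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "batching-streams-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        new KafkaStreams(builder.build(), props).start();
    }
}
```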
Can someone please suggest a possible solution?
Upvotes: 6
Views: 4020
Reputation: 3842
Kafka Streams doesn't support manual commits, and it doesn't support batch processing either. For your use case, there are a few possibilities:
Use a normal consumer, implement the batch processing yourself, and control the offset commits manually (see the sketch after this list).
Use Spark Structured Streaming with Kafka, as described here: Kafka Spark Structured Stream
Try Spring Kafka.
In this kind of scenario, it is also worth considering the Kafka JDBC connector: Kafka JDBC Connector
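For the first option, here is a minimal sketch (broker address, connection string, topic and table names are placeholders): a plain consumer with auto-commit disabled, where the offsets are committed only after the JDBC batch and the database transaction have succeeded.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class BatchingConsumer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "jdbc-batch-writer");
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // commit manually
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
             Connection conn = DriverManager.getConnection("jdbc:postgresql://localhost/mydb", "user", "pass")) {
            consumer.subscribe(Collections.singletonList("input-topic"));
            conn.setAutoCommit(false);
            PreparedStatement ps = conn.prepareStatement("INSERT INTO results(k, v) VALUES (?, ?)");

            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                if (records.isEmpty()) continue;
                for (ConsumerRecord<String, String> record : records) {
                    ps.setString(1, record.key());
                    ps.setString(2, record.value());
                    ps.addBatch();
                }
                ps.executeBatch();     // write the whole batch ...
                conn.commit();         // ... and make it durable
                consumer.commitSync(); // only then mark the offsets as processed
            }
        }
    }
}
```

If the program crashes before `commitSync()`, the uncommitted messages are simply re-read on restart, so nothing is lost (you may get duplicates instead, which the insert/update logic needs to tolerate).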
Upvotes: 3
Reputation: 15087
As alluded to in @sun007's answer, I'd rather change your approach slightly: use Kafka Streams only for the processing and write its results to an output topic, then use Kafka Connect (e.g., the JDBC sink connector) to ingest that topic into the database.
This decoupling of processing (Kafka Streams) and ingestion (Kafka Connect) is typically a much better design. For example, you no longer couple the processing step with the availability of the database: why should your KStreams application stop if the DB is down? That's an operational concern that shouldn't matter to the processing logic, where you certainly don't want to deal with timeouts, retries, and so on. (Even if you used a tool other than Kafka Streams for processing, this decoupling would still be a preferable setup.)
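A minimal sketch of the processing-only side, with illustrative topic names and a trivial transformation standing in for your real logic:

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class ProcessingOnlyApp {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic", Consumed.with(Serdes.String(), Serdes.String()))
               .mapValues(value -> value.toUpperCase())  // your actual processing goes here
               .to("processed-topic", Produced.with(Serdes.String(), Serdes.String()));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "processing-only-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        new KafkaStreams(builder.build(), props).start();
    }
}
```

A JDBC sink connector subscribed to `processed-topic` then handles the database writes, including its own batching, retries, and offset tracking, completely independently of the Streams application.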
Upvotes: 4