Reputation: 11
I have several instances of the same service subscribed to a Kafka topic. A producer publishes one message to the topic, and I want that message to be consumed by all instances. My requirements:

- When an instance starts, it should read from the end of the topic/partitions. I don't want instances to receive messages published before the service started (though it wouldn't be a big problem if a few old messages were processed).
- Instances must not lose messages if they are disconnected from Kafka for some time, or if Kafka is down, which means I need to commit offsets periodically. Processing a message twice is not a big problem.
Is the following the best way to achieve the described behavior: generating a new Kafka group id (from a new GUID or a timestamp) for each instance every time the instance starts?
What are the disadvantages of this approach?
Upvotes: 0
Views: 403
Reputation: 1561
It is enough to do two things. First, each instance of the service should have its own `group.id`. That guarantees that each of them will read all published messages, and will continue receiving messages after reconnecting. This id is per instance; there is no need to regenerate it on every start. Second, each instance should set the property `auto.offset.reset=latest`, which is also the default. This guarantees that the consumer will not read messages that were published before the first start of the instance.
Of course, your instances need to commit offsets after processing the messages.
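A minimal sketch of such a configuration, assuming the standard Kafka consumer property names (`group.id`, `auto.offset.reset`, `enable.auto.commit`) and a hypothetical service name `my-service`; the hostname is used here as one example of a stable per-instance id:

```python
import socket

def build_consumer_config(bootstrap_servers: str) -> dict:
    """Build a consumer config where each instance forms its own group."""
    # A stable per-instance id (here: the hostname) is enough; it does
    # not need to be regenerated from a GUID/timestamp on every restart.
    instance_id = socket.gethostname()
    return {
        "bootstrap.servers": bootstrap_servers,
        # A unique group per instance means every instance receives
        # every published message.
        "group.id": f"my-service-{instance_id}",
        # Used only when the group has no committed offset yet: start
        # from the end of each partition on the very first start.
        # Afterwards, committed offsets take precedence.
        "auto.offset.reset": "latest",
        # Commit offsets periodically so that a restarted or reconnected
        # instance resumes where it left off (at-least-once delivery,
        # so occasional duplicates are possible).
        "enable.auto.commit": True,
        "auto.commit.interval.ms": 5000,
    }

config = build_consumer_config("localhost:9092")
```

Pass this dictionary to your Kafka client when constructing the consumer; because the group id is stable across restarts, the instance will pick up from its last committed offset after downtime rather than skipping to the end again.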
Upvotes: 2