Reputation: 3898
I wish to avoid sending duplicate messages to a Kafka topic.
What is the ideal way to achieve this?
Using the Java client for Apache Kafka, is there any way to verify whether a message already exists before invoking KafkaProducer.send?
I am referring to this doc
Upvotes: 1
Views: 3800
Reputation: 62350
Currently (Kafka 0.10.1), there is no way to get exactly-once guarantees on writes with Kafka. Whatever workaround you try, there will always be a gap, and you can end up with either lost messages or duplicates.
However, Kafka will add an idempotent producer (planned for 0.10.2) that will allow you to avoid duplicate writes. The target date for the 0.10.2 release is the beginning of 2017.
Upvotes: 2
Reputation: 38245
That's pretty much out of scope for Kafka. You need to do that using a different storage system that provides proper indexing for random access. Depending on your needs, that can be a (distributed) cache, a key-value store, or something similar.
You'll probably want to do that on the consumer-side rather than producer, as different consumers may use different strategies for de-duplication (and some consumers may simply tolerate duplicates).
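To make the consumer-side idea concrete, here is a minimal sketch of an in-memory "seen IDs" cache. It assumes every message carries a unique ID (for example, the record key), which the question does not state. The class name `SeenMessageCache` and the method `firstTime` are hypothetical names for illustration and not part of the Kafka client. It uses only the JDK, holds a bounded number of IDs, and forgets everything on restart, so a real deployment would likely swap in the distributed cache or key-value store mentioned above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper: remembers the most recently seen message IDs
 * so that a consumer can skip repeats. Not part of the Kafka API.
 */
public class SeenMessageCache {
    private final Map<String, Boolean> seen;

    public SeenMessageCache(final int capacity) {
        // Access-ordered LinkedHashMap: once the size exceeds capacity,
        // the least recently used ID is evicted.
        this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;
            }
        };
    }

    /**
     * Returns true the first time an ID is offered, and false if the ID
     * is still in the cache (i.e. the message is a duplicate).
     */
    public synchronized boolean firstTime(String messageId) {
        return seen.put(messageId, Boolean.TRUE) == null;
    }
}
```

In a consumer loop you would call `firstTime(id)` for each record and process the record only when it returns true. Because the cache is bounded, a duplicate that arrives after its ID has been evicted will slip through, so the capacity should be sized to the window in which duplicates are expected.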
Upvotes: 0
Reputation: 7091
It is impractical to check whether the same message has already been delivered every time you send a new one. Think of it another way: you can invoke the KafkaProducer.send method with a callback that notifies you of success or failure.
Upvotes: 0