Reputation: 31
We have a Spring Boot application with multiple pods deployed on Kubernetes. It consumes events from a topic in a cluster C1, transforms the messages, and pushes the transformed data to a topic in a different cluster using the KafkaTemplate class.
Is it possible to deduplicate events using the exactly-once semantics config, given that there are 2 clusters involved?
If not, what options are available so that no events get duplicated even if a pod restarts or pods are added/deleted dynamically?
Upvotes: 0
Views: 49
Reputation: 192013
You will need to use an intermediate KV store, such as Redis, Mongo, Postgres, Elasticsearch, etc. Kafka itself will never know there are duplicates (yes, even compacted topics can contain duplicate keys), even within one cluster. Kafka's exactly-once semantics only cover a consume-transform-produce loop within a single cluster's transaction, so they cannot span two clusters.
You'd insert/query every event ID against your database to know whether it's been seen before or not.
Look up 2PC (two-phase commit) patterns for more ideas around this concept.
Depending on your use case, you could also use a framework like Temporal to handle such distributed transactions.
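The insert/query pattern above can be sketched as an idempotent forwarder. This is a minimal illustration, not production code: the `ConcurrentHashMap` stands in for the shared KV store (in practice you'd use an atomic operation like Redis `SET NX` or a database unique constraint so the check works across pods), and the `Runnable` stands in for the `kafkaTemplate.send(...)` call to the target cluster.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of an idempotent consumer: atomically record each event ID
// before forwarding, and skip events whose ID was already recorded.
public class DedupForwarder {
    // Stand-in for a shared store (Redis SET NX, DB unique constraint, etc.)
    private final Map<String, Boolean> seen = new ConcurrentHashMap<>();

    /** Returns true if the event was forwarded, false if it was a duplicate. */
    public boolean process(String eventId, Runnable forward) {
        // putIfAbsent is atomic: only the first caller for a given ID gets null back.
        if (seen.putIfAbsent(eventId, Boolean.TRUE) != null) {
            return false; // already seen: drop the duplicate
        }
        forward.run(); // e.g. kafkaTemplate.send(...) to the second cluster
        return true;
    }

    public static void main(String[] args) {
        DedupForwarder f = new DedupForwarder();
        System.out.println(f.process("evt-1", () -> {})); // true  (first time)
        System.out.println(f.process("evt-1", () -> {})); // false (duplicate)
    }
}
```

Note the remaining gap: if a pod crashes between `forward.run()` and the consumer offset commit, the event is redelivered but already marked seen, so recording the ID and producing the message should ideally happen in one transaction (hence the 2PC pointer below).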
Upvotes: 0