Ravi Singh Shekhawat

Reputation: 31

Avoid duplicate messages for topics across different clusters

We have a Spring Boot application with multiple pods deployed on Kubernetes. It consumes events from a topic in cluster C1, transforms the messages, and pushes the transformed data to a topic in a different cluster using the KafkaTemplate class.

Is it possible to deduplicate events using Kafka's exactly-once semantics (EOS) configuration, given that two clusters are involved?

If not, what options are available so that no events get duplicated even if a pod restarts or pods are added/removed dynamically?

Upvotes: 0

Views: 49

Answers (1)

OneCricketeer

Reputation: 192013

You will need to use an intermediate KV store, such as Redis, Mongo, Postgres, Elasticsearch, etc. Kafka itself will never know there are duplicates (yes, even compacted topics can contain duplicate keys), even within one cluster.

You'd query your database for every event to know whether it's been seen before, and insert it if not.
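A minimal sketch of that check, assuming the consumer can derive a stable event ID from each message. The `DedupStore` here is an in-memory stand-in for the external KV store (e.g. Redis `SETNX`); the real store must be shared by all pods, or each pod would only deduplicate its own traffic:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for a shared KV store such as Redis; in production the
// putIfAbsent would be an atomic SETNX (ideally with a TTL) against
// a store visible to every pod.
class DedupStore {
    private final Map<String, Boolean> seen = new ConcurrentHashMap<>();

    /** Returns true only for the first caller to mark this event ID. */
    boolean markIfNew(String eventId) {
        return seen.putIfAbsent(eventId, Boolean.TRUE) == null;
    }
}

public class DedupExample {
    public static void main(String[] args) {
        DedupStore store = new DedupStore();
        // "evt-1" is redelivered, e.g. after a pod restart before commit
        String[] incoming = {"evt-1", "evt-2", "evt-1"};
        for (String id : incoming) {
            if (store.markIfNew(id)) {
                System.out.println("processing " + id);
            } else {
                System.out.println("skipping duplicate " + id);
            }
        }
    }
}
```

Checking the store before producing to the second cluster turns an at-least-once pipeline into effectively-once processing, at the cost of one store round-trip per event.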

Look up 2PC (two-phase commit) patterns for more ideas around this concept.

Depending on your use case, you could also use a framework like Temporal to handle such distributed transactions.

Upvotes: 0
