TestUser

Reputation: 11

How to ensure that, in one Kafka topic, the same key goes to the same partition for multiple tables

I have a requirement to produce data from multiple MongoDB tables and push it to the same Kafka topic using the mongo-kafka connector. I also have to ensure that records with the same table key column value always go to the same partition, to preserve message ordering. For example:

tables --> customer, address

table key columns --> CustomerID (for table customer), AddressID (for table address)

For CustomerID = 12345, it will always go to partition 1

For AddressID = 54321, it will always go to partition 2

For a single table, the second requirement is easy to achieve using chained transformations. However, for the multiple-tables-to-one-topic case I am finding it difficult, since each of these tables has a different key column name.
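For reference, the "chained transformations" approach for a single table presumably looks something like the following connector config fragment, which copies the key column into the record key with ValueToKey and then unwraps it with ExtractField$Key (the field name `CustomerID` is taken from the example above; the transform aliases `copyId`/`extractId` are arbitrary):

```json
{
  "transforms": "copyId,extractId",
  "transforms.copyId.type": "org.apache.kafka.connect.transforms.ValueToKey",
  "transforms.copyId.fields": "CustomerID",
  "transforms.extractId.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
  "transforms.extractId.field": "CustomerID"
}
```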

Is there any way to fulfil both requirements using the Kafka connector?

Upvotes: 1

Views: 744

Answers (1)

OneCricketeer

Reputation: 191894

If you use the ExtractField$Key transform and the IntegerConverter for the key, all records with the same ID will hash to the same partition.
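As a sketch, that combination might look like this in a connector config (the field name `CustomerID` comes from the question; since transforms are configured per connector, each table's connector could name its own key column here):

```json
{
  "transforms": "extractId",
  "transforms.extractId.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
  "transforms.extractId.field": "CustomerID",
  "key.converter": "org.apache.kafka.connect.converters.IntegerConverter"
}
```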

If you have two key columns for one table, or end up with keys like {"CustomerID": 12345}, then you have a composite/object key, meaning the whole serialized key is hashed to compute the partition, not the ID by itself.

You cannot route records to a specific partition based on a field inside the record without setting producer.override.partitioner.class in the connector config. In other words, you need to implement a partitioner that deserializes your data, parses out the value, and then computes and returns the appropriate partition.
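As a minimal sketch of the routing logic such a partitioner would need: a real implementation would implement `org.apache.kafka.clients.producer.Partitioner` and be registered via `producer.override.partitioner.class`, but the self-contained class below (regex parsing instead of a real deserializer, and the field names `CustomerID`/`AddressID` from the question) shows the core idea of extracting whichever table key column is present and hashing only that value:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of the per-table routing logic a custom partitioner could use.
// A real implementation would implement org.apache.kafka.clients.producer.Partitioner
// (partition(), configure(), close()) and deserialize the key bytes properly;
// this standalone class only demonstrates the extract-then-hash step.
public class TableAwarePartitioner {

    // Matches e.g. "CustomerID": 12345 or "AddressID": 54321 in a JSON-encoded record.
    private static final Pattern ID_FIELD =
            Pattern.compile("\"(CustomerID|AddressID)\"\\s*:\\s*(\\d+)");

    // Pull out whichever known key column is present and hash only its value,
    // so the same ID always lands on the same partition regardless of any
    // other fields in the record.
    public static int computePartition(String recordJson, int numPartitions) {
        Matcher m = ID_FIELD.matcher(recordJson);
        if (!m.find()) {
            throw new IllegalArgumentException("No known key column in: " + recordJson);
        }
        long id = Long.parseLong(m.group(2));
        return (int) Math.floorMod(id, (long) numPartitions);
    }
}
```

Note that deterministic routing like this only holds while the partition count of the topic stays fixed; adding partitions later would remap existing IDs.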

Upvotes: 0
