Reputation: 11
I have a requirement to produce data from multiple MongoDB tables and push it to the same Kafka topic using the mongo-kafka connector. I also have to ensure that data with the same table key column value always goes to the same partition, to preserve message ordering. For example:
tables --> customer, address
table key columns --> CustomerID (for table customer), AddressID (for table address)
For CustomerID = 12345, it will always go to partition 1
For AddressID = 54321, it will always go to partition 2
For a single table, the second requirement is easy to achieve using chained transformations (see the sketch below). However, for multiple tables -> 1 topic, I am finding it difficult to achieve, since each of these tables has a different key column name.
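For reference, a sketch of the single-table chain I am using (transform aliases and field names are illustrative):

```
"transforms": "copyId,extractId",
"transforms.copyId.type": "org.apache.kafka.connect.transforms.ValueToKey",
"transforms.copyId.fields": "CustomerID",
"transforms.extractId.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
"transforms.extractId.field": "CustomerID"
```

This promotes CustomerID into the record key, so the default partitioner hashes it consistently.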
Is there any way available to fulfil both requirements using the Kafka connector?
Upvotes: 1
Views: 744
Reputation: 191894
If you use the ExtractField$Key transform and the IntegerConverter, all matching IDs should go to the same partition.
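For example, a minimal sketch of that setup (the field name is whatever your key column happens to be):

```
"transforms": "extractKey",
"transforms.extractKey.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
"transforms.extractKey.field": "CustomerID",
"key.converter": "org.apache.kafka.connect.converters.IntegerConverter"
```

The key then becomes a bare integer rather than a struct, so the producer's default partitioner sends the same ID to the same partition every time.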
If you have two key columns in one table, or end up with keys like {"CustomerID": 12345}, then you have a composite/object key, meaning the whole key will be hashed to compute the partition, not the ID itself.
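To see why, this is essentially the computation the producer's default partitioner performs; it hashes all of the serialized key bytes (a sketch, assuming kafka-clients on the classpath and a 6-partition topic):

```java
import java.nio.charset.StandardCharsets;
import org.apache.kafka.common.utils.Utils;

public class KeyHashDemo {
    public static void main(String[] args) {
        int numPartitions = 6; // assumed partition count for the topic

        // An object key vs. the bare ID: different bytes, different hashes
        byte[] objectKey = "{\"CustomerID\": 12345}".getBytes(StandardCharsets.UTF_8);
        byte[] plainKey = "12345".getBytes(StandardCharsets.UTF_8);

        // Same formula the default partitioner uses: murmur2 over ALL key bytes
        System.out.println(Utils.toPositive(Utils.murmur2(objectKey)) % numPartitions);
        System.out.println(Utils.toPositive(Utils.murmur2(plainKey)) % numPartitions);
    }
}
```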
You cannot choose the partition based on specific fields within a record without setting producer.override.partitioner.class in the connector config. In other words, you need to implement a partitioner that will deserialize your data, parse out the ID value, then compute and return the respective partition.
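A minimal sketch of such a partitioner, assuming the keys reach the producer as JSON objects carrying either a CustomerID or an AddressID field (the package, class, and field names are hypothetical):

```java
package com.example;

import java.nio.charset.StandardCharsets;
import java.util.Map;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.utils.Utils;

public class TableKeyPartitioner implements Partitioner {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        int numPartitions = cluster.partitionsForTopic(topic).size();
        if (keyBytes == null) {
            return 0; // no key: pin keyless records to one partition
        }
        try {
            JsonNode node = MAPPER.readTree(keyBytes);
            // Use whichever table key column this record carries
            JsonNode id = node.has("CustomerID") ? node.get("CustomerID") : node.get("AddressID");
            if (id != null) {
                // Hash only the ID, so the same ID always lands on the same partition
                byte[] idBytes = id.asText().getBytes(StandardCharsets.UTF_8);
                return Utils.toPositive(Utils.murmur2(idBytes)) % numPartitions;
            }
        } catch (Exception e) {
            // Not JSON, or neither field present: fall back to hashing the whole key
        }
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }

    @Override
    public void configure(Map<String, ?> configs) {}

    @Override
    public void close() {}
}
```

You would then set producer.override.partitioner.class=com.example.TableKeyPartitioner in the connector config, and the Connect worker needs connector.client.config.override.policy=All for producer overrides to be accepted.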
Upvotes: 0