Reputation: 159
Hi, I am new to Debezium.
I am planning a real-time database integration (MySQL and MongoDB) using Debezium. From each source database I need to sync data to a destination database (MySQL and MongoDB). I only need X tables from MySQL and Y collections from MongoDB.
Within those X tables and Y collections, I only need a specific subset of the data, selected by a condition for each table and collection.
The condition is not straightforward. For example, in MySQL I need to join the table I want to capture changes from with one specific other table, and capture only the matching records.
That is my requirement, and I have some questions about it:
According to the Debezium documentation, a topic is created for each table, and order is guaranteed when the CDC data for a topic goes to a single partition of that topic. With multiple tables, each table's CDC data goes to that table's own topic and partition. How can I preserve the order of events across multiple tables? I need the exact order in which the operations appear in the MySQL binlog, because I have to apply the changes to the destination database in the same order as they happened on the source. Is that order guaranteed when working with multiple tables?
If I want the data filtered by a MySQL or MongoDB join condition with another table or collection, how can I achieve this with Debezium?
These are my two main questions. Please help me with this.
Thanks
Mike
Upvotes: 2
Views: 1397
Reputation: 159
It has been 15 days since I posted this question and I have not received an answer, so I tested it myself and found the answer to the ordering question. I tested with the Debezium MySQL connector.
I tested with a transaction containing 7 operations:
1. tableA insert
2. tableA update
3. tableB insert
4. tableA update
5. tableB insert
6. tableA update
7. tableC insert
I started with the default connector configuration and observed the changes 5 times; the events did not arrive in the order in which I performed them. I then changed the configuration to set the default number of topic partitions to 1. After that, the events within each topic were ordered, but the overall order across topics was still not preserved. Finally, by using a single topic with a single partition, I got the transaction order guaranteed.
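To illustrate the observation, here is a minimal sketch in Python that only simulates the routing (no Kafka or Debezium involved). The sequence numbers and the two helper functions are made up for illustration. It shows that one topic per table keeps the order inside each table but loses the interleaving between tables, while a single topic with a single partition keeps the full order.

```python
from collections import defaultdict

# The seven operations of the test transaction, in the order they were executed.
operations = [
    (1, "tableA", "insert"),
    (2, "tableA", "update"),
    (3, "tableB", "insert"),
    (4, "tableA", "update"),
    (5, "tableB", "insert"),
    (6, "tableA", "update"),
    (7, "tableC", "insert"),
]


def route_per_table(ops):
    """One topic per table, one partition each: order is kept only inside a topic."""
    topics = defaultdict(list)
    for seq, table, _op in ops:
        topics[table].append(seq)
    return dict(topics)


def route_single_topic(ops):
    """Everything in one topic with one partition: a single ordered log."""
    return [seq for seq, _table, _op in ops]


per_table = route_per_table(operations)
single = route_single_topic(operations)

print(per_table)  # {'tableA': [1, 2, 4, 6], 'tableB': [3, 5], 'tableC': [7]}
print(single)     # [1, 2, 3, 4, 5, 6, 7]
```

With separate topics, a consumer reading tableB cannot tell from the topic alone that operation 3 happened before operation 4 on tableA. That is the ordering gap seen in the test.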
This is for a single partition:
"topic.creation.default.replication.factor": 1,
"topic.creation.default.partitions": 1,
This is for a single topic:
"transforms":"dropPrefix",
"transforms.dropPrefix.type":"org.apache.kafka.connect.transforms.RegexRouter",
"transforms.dropPrefix.regex": "(.*)([^0-9])(.*)",
"transforms.dropPrefix.replacement": "yourtopic",
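To check which topic names this regex would reroute, here is a small sketch in Python. It assumes that RegexRouter renames a topic only when the whole name matches the pattern, which `re.fullmatch` mimics here. The topic names are hypothetical examples, not taken from my setup.

```python
import re

# Same pattern and replacement as in the connector configuration.
PATTERN = re.compile(r"(.*)([^0-9])(.*)")
REPLACEMENT = "yourtopic"


def route(topic):
    """Return the rerouted topic name, or the original name if the regex does not match."""
    if PATTERN.fullmatch(topic):
        return REPLACEMENT
    return topic


# Hypothetical topic names for illustration only.
for name in ["dbserver1.inventory.tableA", "dbserver1.inventory.tableB", "12345"]:
    print(name, "->", route(name))
```

The pattern needs at least one non-digit character, so any topic name that is not purely numeric matches and ends up in the single topic.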
Note: I tested with Kafka version 2.7.0; I was facing issues with 2.6.0.
I have also created a library in Go that can be useful for extracting the needed data:
https://github.com/ahamedmulaffer/cdc-formatter
Hope it helps someone.
Thanks
Mike
Upvotes: 3