Reputation: 33
Debezium's FAQ says:
Most connectors will record all events for a single database table to a single topic. Additionally, all events within a topic are totally-ordered, meaning that the order of all of those events will be maintained.
How are events for a database organized?
However, AFAIK Apache Kafka only guarantees ordering within a single partition. So if I expect the events in a topic to be ordered, I would have to configure that topic with only one partition, which sacrifices Kafka's throughput, or else rely on some other mechanism. But I couldn't find any explanation of this in Debezium's documentation.
My question is, how does Debezium implement the ordering guarantees within one topic? Or which module of the source code should I study to find out the detailed implementation of this feature?
Upvotes: 3
Views: 836
Reputation: 770
To quote the answer here:
... Kafka Connect’s producer will use the default partitioning logic that computes the partition using a consistent hash of the message key, which in Debezium’s case is a struct containing the affected row’s primary/unique key...
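So if the concern is that events for the same row/document might be read out of order, that is ruled out: the key is built from the row's PK, so every event for that row is hashed to the same partition, and Kafka preserves order within a partition. Ordering across different rows is only guaranteed if the topic has a single partition.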
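To make the per-key behaviour concrete, here is a minimal sketch of key-based partitioning. It is not Debezium or Kafka code: `partition_for` and `route` are hypothetical helpers, the JSON encoding of the key is an assumption, and `zlib.crc32` stands in for the murmur2 hash that Kafka's default partitioner applies to the serialized key bytes. The only point is that a deterministic hash of the key sends every event for a given row to one partition.

```python
import json
import zlib
from collections import defaultdict


def partition_for(key: dict, num_partitions: int) -> int:
    """Pick a partition from a deterministic hash of the serialized key.

    crc32 is a stand-in here; Kafka's default partitioner uses murmur2
    on the serialized key bytes.
    """
    key_bytes = json.dumps(key, sort_keys=True).encode("utf-8")
    return zlib.crc32(key_bytes) % num_partitions


def route(events, num_partitions: int):
    """Append each (key, op) event to its partition, in arrival order."""
    partitions = defaultdict(list)
    for key, op in events:
        partitions[partition_for(key, num_partitions)].append((key, op))
    return partitions


# Interleaved change events for two rows: c = create, u = update, d = delete.
events = [
    ({"id": 1}, "c"),
    ({"id": 2}, "c"),
    ({"id": 1}, "u"),
    ({"id": 2}, "u"),
    ({"id": 1}, "d"),
]

partitions = route(events, num_partitions=3)

# Partitions that contain at least one event for row id=1.
partitions_with_row_1 = [
    p for p, evs in partitions.items() if any(k == {"id": 1} for k, _ in evs)
]

# Operations for row id=1, read back from the partition it was routed to.
ops_for_row_1 = [
    op for k, op in partitions[partition_for({"id": 1}, 3)] if k == {"id": 1}
]

print(ops_for_row_1)  # ['c', 'u', 'd']
```

Events for row 1 and row 2 may end up in different partitions, so a consumer can see them interleaved differently from the source database, but each row's own history is always replayed in order.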
Upvotes: 1