Reputation: 304
In our application we need to publish events from a certain PostgreSQL table into Kafka, so we decided to use Debezium, but we ran into the following problem: during the initial snapshot, messages show up in Kafka in an unexpected (from our point of view) order. The order of events is crucial to our application; in fact, they must be ordered by the integer primary key of the table. AFAIK the initial snapshot is just a SELECT from the table without an ORDER BY. So is there a way or workaround to make the Debezium PostgreSQL connector extract events in a certain order?
Thanks in advance!
Upvotes: 4
Views: 2662
Reputation: 19010
Check out the snapshot.select.statement.overrides property in the connector docs. It lets you customize the SELECT statement used for specific tables, so you can append the required ORDER BY clause.
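As a minimal sketch of how that could look: the property takes a comma-separated list of fully-qualified table names, and each listed table then gets its own property, named with the table name as a suffix, holding the SELECT to run. The table public.events and the key column id are hypothetical, and the helper function below is only for illustration; it is not part of Debezium. Please check the exact property names against the docs for your connector version.

```python
import json


def snapshot_order_overrides(tables):
    """Build snapshot SELECT overrides that order each table by one column.

    `tables` maps a fully-qualified table name ("schema.table") to the
    column the snapshot should be ordered by.
    """
    overrides = {
        # Comma-separated list of tables that get a custom snapshot SELECT.
        "snapshot.select.statement.overrides": ",".join(tables),
    }
    for table, order_column in tables.items():
        # One property per table, suffixed with the fully-qualified name.
        overrides[f"snapshot.select.statement.overrides.{table}"] = (
            f"SELECT * FROM {table} ORDER BY {order_column}"
        )
    return overrides


# Hypothetical table and key column, for illustration only.
config = snapshot_order_overrides({"public.events": "id"})
print(json.dumps(config, indent=2))
```

The resulting key/value pairs would be merged into the rest of the connector configuration before registering it with Kafka Connect.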
Upvotes: 7
Reputation: 247445
There is no guarantee that the order of the inserts in the transaction log matches the order of the automatically generated primary keys. With high concurrency, a different order would be quite normal.
If the transactions are short, the order should not be too mixed up.
Anyway, there is nothing you can do about this on the PostgreSQL side.
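To illustrate the point, here is a toy simulation (not real PostgreSQL code) of two concurrent transactions. It assumes the key comes from a sequence that is read when the INSERT runs, and that changes reach the consumer in commit order; the transaction names are made up.

```python
from itertools import count

# Stands in for the sequence behind the generated primary key.
sequence = count(1)

# Each transaction draws its key at the moment its INSERT runs...
tx_a = {"name": "A", "id": next(sequence)}
tx_b = {"name": "B", "id": next(sequence)}

# ...but B happens to commit before A.
commit_order = [tx_b, tx_a]

# Keys in the order a consumer of the log would see them.
streamed_ids = [tx["id"] for tx in commit_order]
print(streamed_ids)  # [2, 1]
```

The longer a transaction stays open after drawing its key, the further out of order its row can end up, which is why short transactions keep the mix-up small.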
Upvotes: 1