Reputation: 21406
I am using Confluent 3.3.0 and the jdbc-source-connector to push rows from my Oracle table into Kafka as messages. This works fine.
I would like to check if "upsert" is possible.
For example, I have a student table with three columns: id (number), name (varchar2), and last_modified (timestamp). Whenever I insert a new row, it gets pushed to Kafka (using the timestamp + auto-incrementing fields). But when I update a row, the corresponding message in Kafka should be updated.
The id of my table should become the key of the corresponding Kafka message. My primary key (id) will remain constant as a reference, and the last_modified field will be updated every time the row is updated.
Is this possible? Or, alternatively, could the existing record in Kafka be deleted and the new one inserted?
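For context, the connector configuration I have in mind is roughly the following (the connection details, topic prefix, and table name are placeholders, and the transforms at the end are only my guess at how the id column could be set as the message key):

    name=oracle-student-source
    connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
    # placeholder connection details
    connection.url=jdbc:oracle:thin:@//dbhost:1521/ORCL
    connection.user=scott
    connection.password=tiger
    table.whitelist=STUDENT
    mode=timestamp+incrementing
    timestamp.column.name=LAST_MODIFIED
    incrementing.column.name=ID
    topic.prefix=oracle-
    # set the Kafka message key from the id column using Single Message Transforms
    transforms=createKey,extractId
    transforms.createKey.type=org.apache.kafka.connect.transforms.ValueToKey
    transforms.createKey.fields=ID
    transforms.extractId.type=org.apache.kafka.connect.transforms.ExtractField$Key
    transforms.extractId.field=ID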
Upvotes: 0
Views: 4582
Reputation: 191748
But when I update the row, the corresponding message in Kafka should be updated
This isn't possible, as Kafka is, by design, append-only and immutable.
The best you can do is either query all rows by some last_modified column, or hook in a CDC solution such as Oracle GoldenGate or the (still alpha) Debezium connector, which would capture the single UPDATE event on the database and append a brand-new record to the Kafka topic.
If you want to de-dupe your database records in Kafka (i.e. find the message with the max last_modified within a window of time), you can use Kafka Streams or KSQL to perform that type of post-process filtering.
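As a rough sketch (the topic name, column types, and JSON value format below are assumptions about what your connector produces), in KSQL you can register the topic as a TABLE keyed by id, so that queries against it see only the latest value per key:

    -- one row per id, holding the most recently appended value
    CREATE TABLE student_latest (id INT, name VARCHAR, last_modified BIGINT)
      WITH (KAFKA_TOPIC='oracle-STUDENT', VALUE_FORMAT='JSON', KEY='id');

    SELECT id, name, last_modified FROM student_latest;

This relies on the message key already being set to id; if it is not, you would first re-key the data (for example with PARTITION BY on a KSQL stream, or through the connector's transforms).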
If you are using compacted Kafka topics and insert your database key (id) as the Kafka message key, then after compaction the latest appended message for each key persists, and the previous messages with the same key are dropped, not updated.
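A minimal example of creating such a compacted topic (the topic name, partition count, and ZooKeeper address are placeholders; with Confluent 3.3 the topic tooling still talks to ZooKeeper):

    kafka-topics --zookeeper localhost:2181 --create \
      --topic oracle-STUDENT \
      --partitions 1 --replication-factor 1 \
      --config cleanup.policy=compact

Note that compaction runs periodically in the background, so consumers may still see multiple records per key until the log cleaner has run.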
Upvotes: 2