Alfred

Reputation: 21406

Is it possible to "upsert" a message in Kafka using Kafka Connect?

I am using Confluent 3.3.0, with the jdbc-source-connector to insert messages into Kafka from my Oracle table. This works fine.
I would like to check if "upsert" is possible.

Say I have a student table with 3 columns: id (number), name (varchar2), and last_modified (timestamp). Whenever I insert a new row, it is pushed to Kafka (using the timestamp + auto-increment fields). But when I update the row, the corresponding message in Kafka should be updated.

The id of my table should become the key of the corresponding Kafka message. My primary key (id) will remain constant as a reference.
The timestamp field is updated every time the row is updated.
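For reference, my source connector config looks roughly like this (the connection details are placeholders, and the transforms lines are a sketch of how the id column could become the message key):

```properties
name=oracle-student-source
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
# Placeholder connection string; real host/credentials elided
connection.url=jdbc:oracle:thin:@//dbhost:1521/ORCL
# New rows are detected via the incrementing id, updates via the timestamp
mode=timestamp+incrementing
incrementing.column.name=ID
timestamp.column.name=LAST_MODIFIED
table.whitelist=STUDENT
topic.prefix=oracle-
# Promote the ID column to the Kafka message key
transforms=createKey,extractId
transforms.createKey.type=org.apache.kafka.connect.transforms.ValueToKey
transforms.createKey.fields=ID
transforms.extractId.type=org.apache.kafka.connect.transforms.ExtractField$Key
transforms.extractId.field=ID
```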

Is this possible? Or, failing that, can I delete the existing record in Kafka and insert the new one?

Upvotes: 0

Views: 4582

Answers (1)

OneCricketeer

Reputation: 191748

But when I update the row, the corresponding message in Kafka should be updated

This isn't possible, as Kafka is, by design, append-only and immutable.

The best you can do is either query all rows by some last_modified column, or hook in a CDC solution such as Oracle GoldenGate or the (currently alpha) Debezium project, which would capture the single UPDATE event on the database and append a brand-new record to the Kafka topic.

If you want to de-dupe your database records in Kafka (find the message with the max last_modified within a window of time), you can use Kafka Streams or KSQL to perform that type of post-process filtering.
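For example, here is a minimal Kafka Streams sketch that keeps only the record with the highest last_modified per key. The topic names and the JSON handling are assumptions, and it uses the newer StreamsBuilder API (on Confluent 3.3 / Kafka 0.11 the class is KStreamBuilder, but the idea is the same):

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KTable;

import java.util.Properties;

public class LatestStudentRecord {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "students-latest");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Assumes the topic is keyed by the student id and the value is JSON
        // containing a last_modified field. Keep whichever record has the
        // larger last_modified; the KTable then holds the latest row per id.
        KTable<String, String> latest = builder
                .<String, String>stream("oracle-STUDENT")   // hypothetical topic name
                .groupByKey()
                .reduce((oldValue, newValue) ->
                        lastModified(newValue) >= lastModified(oldValue) ? newValue : oldValue);

        latest.toStream().to("students-latest");

        new KafkaStreams(builder.build(), props).start();
    }

    // Placeholder JSON parsing: in real code use Jackson or a JSON serde.
    private static long lastModified(String json) {
        return Long.parseLong(json.replaceAll(".*\"last_modified\":(\\d+).*", "$1"));
    }
}
```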

If you use compacted Kafka topics and set your database key as the Kafka message key, then after compaction the latest appended message for each key will persist, and previous messages with the same key will be dropped, not updated.
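Compaction has to be enabled on the topic, e.g. (topic name and ZooKeeper address are just examples):

```sh
kafka-topics --create \
  --zookeeper localhost:2181 \
  --topic oracle-STUDENT \
  --partitions 1 \
  --replication-factor 1 \
  --config cleanup.policy=compact
```

Also note that compaction runs periodically on closed log segments, so consumers can still see several messages per key for a while; you get eventual, not immediate, latest-value-per-key semantics.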

Upvotes: 2
