Udaya Prakash
Udaya Prakash

Reputation: 541

SQL Server CDC - Snapshot from specific date for Debezium Kafka

We are enabling CDC on specific tables in our MSSQL. We are connecting to a pipeline of migrating data through MSSQL->CDC->DEBEZIUM->KAFKA_CONNECT

There is a table that has more than a million rows, but we need only a few thousand rows from the table to be included in the Snapshot Created when enabling CDC. The reason why I don't want to handle it in our Kafka-Consumer is because, while I need just 1% of the data to be written to Mongo, rest 99% is gonna hit the consumer without any use.


Questions:

  1. Is it possible to create snapshot of specific rows/views while enabling CDC. I need rows which have column_value(modified date)>a specific date?
  2. Is this too much of micro-optimisation and I shall let everything come and hit the pipeline and be rejected by the consumer instead?

Upvotes: 2

Views: 1091

Answers (1)

Giorgos Myrianthous
Giorgos Myrianthous

Reputation: 39880

You can use Kafka Connect Single Message Transform (SMT). More precisely, you need the Filter SMT:

The filter.condition is a predicate specifying JSON path that is applied to each record processed, and when this predicate successfully matches the record is either included (when filter.type=include) or excluded (when filter.type=exclude)


In your case, you can include rows that satisfy your desired condition:

transforms=filter-records
transforms.filterExample1.type=io.confluent.connect.transforms.Filter$Value
transforms.filterExample1.filter.condition=$.value[?(@.modified_date > "1/1/2020")]
transforms.filterExample1.filter.type=include
transforms.filterExample1.missing.or.null.behavior=fail

Alternatively, you can decide which rows to exclude:

transforms=filter-records
transforms.filterExample1.type=io.confluent.connect.transforms.Filter$Value
transforms.filterExample1.filter.condition=$.value[?(@.modified_date <= "1/1/2020")]
transforms.filterExample1.filter.type=exclude
transforms.filterExample1.missing.or.null.behavior=fail

Upvotes: 3

Related Questions