deGee
deGee

Reputation: 801

Incremental CDC from Postgresql for a subset of rows

I have a postgresql DB which has tables containing multi-tenant data. Example:

Table Customer

CustId Cust Name
First CustA
Second CustB

Table CustomerShoppingDet

Item Name CustId. Buying Date
First CustA 02-13-2023
Second CustA 03-14-2023
First. CustB 04-14-2023

These are all on AWS RDS scattered across regions. Lets assume, CustA probably exists only in us-east region. Now I have a use case wherein I need to move CustA from us-east (say - source) to eu-central region RDS (say - destination). So, I need to migrate all CustA related data.

The data per customer is quite large and spans across multiple tables (~18GB per customer). We are thinking of replicating a snapshot as Phase I and using CDC I will sync all delta updates from source to destination until the D-Day when the switch happens from source to destination. We will disable CustA on source for a brief time before enabling it at the destination.

This is primarily for 2 reasons:

Any suggestion on a suitable CDC replication mechanism that I can use here. I have gone through Debezium and AWS DMS offering but I am not completely sure if they will allow filtering on only CustA records.

I can even build my own export, transform and load pipeline if that can offer me greater flexibility.

Upvotes: 0

Views: 139

Answers (1)

Laurenz Albe
Laurenz Albe

Reputation: 247380

You can use logical decoding in PostgreSQL v15 or better, where CREATE PUBLICATION supports a WHERE condition. You need to define a publication for each tenant. Then the subscriber will only receive changes for that tenant.

Upvotes: 0

Related Questions