Reputation: 1
Problem:
I have set up a Docker-Compose environment where I have one primary database and multiple replica databases, which are scaled using Docker-Compose's load balancing feature. I'm using Debezium-Kafka CDC to synchronize changes from the primary database to the replica databases. However, I've encountered an issue where the sink connector in Debezium only connects to one replica database at a time when there are changes.
Background:
Primary Database: This is my source database from which I want to capture changes using Debezium.
Replica Databases: I have multiple replica databases that are scaled using Docker-Compose. These replicas are essentially clones of the primary database.
Debezium-Kafka CDC: I'm using Debezium with Kafka to capture and publish database changes.
Issue:
When there are changes in the primary database, Debezium's sink connector only connects to one of the replica databases, which is causing a bottleneck in my synchronization process. Ideally, I'd like the sink connector to distribute changes to all replica databases in parallel.
Question:
How can I configure my Docker-Compose environment and Debezium setup to ensure that the sink connector connects to all instances of the replica database in parallel when there are changes in the primary database?
Docker-Compose Configuration:
Here's a simplified version of my Docker-Compose configuration:
version: '3'
services:
  primary-db:
    image: my-primary-db-image
    ports:
      - "5432:5432"
    # ...
  replica-db:
    image: my-replica-db-image
    # ...
  debezium:
    image: debezium/connect
    # ...
Debezium Connector Configuration:
My Debezium connector configuration might look something like this:
{
  "name": "my-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "primary-db",
    "database.port": "5432",
    "database.user": "user",
    "database.password": "password",
    "database.dbname": "mydb",
    # ...
  }
}
What I've Tried:
I've experimented with different configurations, but I'm unable to get Debezium to connect to all replica databases concurrently. It seems like Docker-Compose's load balancing might be causing the issue, but I'm not sure how to work around this.
I'd appreciate any insights, suggestions, or examples of how to configure Docker-Compose and Debezium to achieve parallel connections to all replica databases when changes occur in the primary database. Thank you in advance for your assistance!
Upvotes: 0
Views: 545
Reputation: 303
You can achieve that by creating a connector for each replica:
{
  "name": "connector-1",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "replica-db-1",
    "database.port": "5432",
    "database.user": "user",
    "database.password": "password",
    "database.dbname": "mydb",
    # ...
  }
}
And for the second replica:
{
  "name": "connector-2",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "replica-db-2",
    "database.port": "5432",
    "database.user": "user",
    "database.password": "password",
    "database.dbname": "mydb",
    # ...
  }
}
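With more than a couple of replicas, these configs can be generated instead of written by hand. The following is a minimal Python sketch, not a definitive setup. Note that it swaps in Debezium's JDBC sink connector class (`io.debezium.connector.jdbc.JdbcSinkConnector`), because `PostgresConnector` is a source connector that reads changes rather than writing them; this assumes the JDBC sink is available in your Connect image. The replica hostnames, the topic name `mydb.public.my_table`, and the Connect URL `http://debezium:8083` are assumptions for illustration only.

```python
import json
import urllib.request


def build_sink_config(replica_host, topics, user="user", password="password", dbname="mydb"):
    """Build one sink connector config that writes to a single replica."""
    return {
        # A unique name per replica: Kafka Connect derives the sink's
        # consumer group from the connector name.
        "name": f"sink-{replica_host}",
        "config": {
            "connector.class": "io.debezium.connector.jdbc.JdbcSinkConnector",
            "topics": topics,
            "connection.url": f"jdbc:postgresql://{replica_host}:5432/{dbname}",
            "connection.username": user,
            "connection.password": password,
            # Upsert keyed on the record key, so re-applying an event is harmless.
            "insert.mode": "upsert",
            "primary.key.mode": "record_key",
        },
    }


def register(connect_url, connector):
    """POST a connector config to the Kafka Connect REST API."""
    req = urllib.request.Request(
        f"{connect_url}/connectors",
        data=json.dumps(connector).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


if __name__ == "__main__":
    # Hypothetical replica hostnames; use the names your containers actually get.
    replicas = ["replica-db-1", "replica-db-2", "replica-db-3"]
    for host in replicas:
        cfg = build_sink_config(host, "mydb.public.my_table")
        print(json.dumps(cfg, indent=2))
        # Uncomment to register against a running Connect worker (assumed URL):
        # register("http://debezium:8083", cfg)
```

Because each sink connector has its own name, it consumes in its own consumer group, so every replica receives the full stream of changes instead of sharing it with the others.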
Also make sure that whatever consumes the change events can deduplicate them if needed, since an event may be delivered and applied more than once.
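As a rough illustration of such deduplication, here is a minimal Python sketch. It assumes each change event carries the row's primary key and a monotonically increasing position, such as the Postgres LSN that Debezium's Postgres connector reports in the event's `source` block. The event shape used here (a dict with `key`, `lsn`, and `row`) and the `DedupApplier` class are simplifications made up for this example, not Debezium's actual envelope or API.

```python
class DedupApplier:
    """Applies change events to an in-memory table, skipping duplicates."""

    def __init__(self):
        self.last_lsn = {}  # primary key -> highest LSN applied so far
        self.table = {}     # primary key -> current row state

    def apply(self, event):
        """Apply the event unless it is a replay; return True if applied."""
        key, lsn = event["key"], event["lsn"]
        if lsn <= self.last_lsn.get(key, -1):
            return False  # already applied this change (or a newer one)
        self.table[key] = event["row"]
        self.last_lsn[key] = lsn
        return True


applier = DedupApplier()
events = [
    {"key": 1, "lsn": 100, "row": {"name": "a"}},
    {"key": 1, "lsn": 100, "row": {"name": "a"}},  # redelivered duplicate
    {"key": 1, "lsn": 105, "row": {"name": "b"}},
]
results = [applier.apply(e) for e in events]
print(results)  # [True, False, True]
```

In a real consumer the last applied position would need to be stored durably, ideally in the same transaction as the write to the replica.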
Upvotes: 0