user12951786

Reputation: 11

JDBC Source Connector tasks.max=5, but only one task is created

I have many Debezium/connect JDBC Source connectors (with tasks.max=5) for Oracle DB.

Even though tasks.max=5, this one connector creates only 1 task, while the other connectors create the configured tasks.max=5 tasks. There is no difference in the configurations, but this particular connector publishes large batches of records to the topic. Below is the connector config.

curl -iX POST -H "Accept:application/json" -H "Content-Type:application/json" https://xxxxxxxxxx/connectors -d '{"name": "my_source_connector",
 "config": {
  "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
  "tasks.max": "5",
  "connection.url": "jdbc:oracle:thin:@",
  "connection.user": "USER_ID",
  "connection.password": "MyPassowrd001",
  "mode": "timestamp+incrementing",
  "incrementing.column.name": "ID",
  "timestamp.column.name": "LAST_UPDATED_TS",
  "topic.prefix": "",
  "batch.max.rows": "50",
  "timestamp.delay.interval.ms": "240000",
  "table.whitelist": "TABLE001",
  "numeric.mapping": "best_fit"
 }}' -k

If you have any input to get this resolved, it'll be a great time saver for me.

Debezium base image is debezium/connect:1.7 and Kafka version is 2.8.1

I expect 5 tasks to be created, but only 1 is. I am trying to find out why all 5 are not created.
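To check how many tasks the framework actually assigned, the Connect REST status endpoint can be used, a quick sketch assuming the same endpoint and self-signed certificate flag as the curl above (the jq step is optional):

# Show connector state plus one entry per running task
curl -s -k https://xxxxxxxxxx/connectors/my_source_connector/status

# Count only the tasks (requires jq)
curl -s -k https://xxxxxxxxxx/connectors/my_source_connector/status | jq '.tasks | length'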

Upvotes: 0

Views: 864

Answers (1)

musmil

Reputation: 53

Database connectors source table data into Kafka topics 1:1. I don't know how you managed to create 5 tasks for your other Debezium connectors; are they source connectors or sink connectors? Is it possible that only 1 task is active on the other connectors as well?

https://debezium.io/documentation/reference/stable/connectors/oracle.html
https://docs.oracle.com/en/cloud/paas/event-hub-cloud/admin-guide/jdbc-source-connector.html

There it is clearly stated that tasks.max should be 1 and can't be anything else for the Debezium connector; that is why it is excluded from the connector configuration.

Also, for that reason, I cannot find the tasks.max option on the class's configuration page:

https://docs.confluent.io/kafka-connectors/jdbc/current/source-connector/source_config_options.html#connector

I have never come across a JDBC source connector that can ingest database data with more than 1 task; I didn't know that was supported until now. Even if it can, how would the tasks know which rows have already been consumed and sourced to the topic, and what happens on retries, etc.?

One trick I can leave you with, if you would like to try it:

Divide your table with queries, e.g. one query for A-G and another for H-Z, and create two connectors, one for each. This will give you two tasks sourcing the same table; a rough sketch follows below.
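Here is a minimal sketch of that idea, assuming the Confluent JDBC source connector's query mode and a split on the existing ID column via MOD (in query mode table.whitelist is dropped, the query is wrapped in a subselect so the connector can append its own incremental WHERE clause, and topic.prefix becomes the full topic name). The connector name, topic name, and the MOD split are placeholders; the second connector would mirror this one with MOD(ID, 2) = 1:

# connector name, topic name and the MOD(ID, 2) split are placeholders; adjust for your table
curl -iX POST -H "Accept:application/json" -H "Content-Type:application/json" https://xxxxxxxxxx/connectors -d '{"name": "my_source_connector_part_0",
 "config": {
  "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
  "tasks.max": "1",
  "connection.url": "jdbc:oracle:thin:@",
  "connection.user": "USER_ID",
  "connection.password": "MyPassowrd001",
  "mode": "timestamp+incrementing",
  "incrementing.column.name": "ID",
  "timestamp.column.name": "LAST_UPDATED_TS",
  "query": "SELECT * FROM (SELECT * FROM TABLE001 WHERE MOD(ID, 2) = 0)",
  "topic.prefix": "TABLE001_PART_0",
  "batch.max.rows": "50",
  "timestamp.delay.interval.ms": "240000",
  "numeric.mapping": "best_fit"
 }}' -k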

Upvotes: 0
