Renukaradhya

Reputation: 862

Inconsistent connector state: ConnectException: Task already exists in this worker

I am using Confluent Platform 3.2, running 3 Kafka Connect workers on 3 different EC2 machines.

I had a connector (Debezium MySQL source) which I deleted and then created again after a few minutes. The connector did not start successfully because of the error below: the connector itself reports RUNNING, but its task is in the FAILED state. I had to restart the workers to clear the issue.

Is this a caching issue, i.e. is the worker still holding on to the old task? How can I resolve it without restarting the workers? Any help is appreciated.

{
   "name": "debezium-connector",
   "connector": {
      "state": "RUNNING",
      "worker_id": "xx.xx.xx.xxx:8083"
   },
   "tasks": [
      {
         "state": "FAILED",
         "trace": "org.apache.kafka.connect.errors.ConnectException: Task already exists in this worker: debezium-connector-0\n\tat org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:308)\n\tat org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:834)\n\tat org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1500(DistributedHerder.java:101)\n\tat org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:848)\n\tat org.apache.kafka.connect.runtime.distributed.DistributedHerder$13.call(DistributedHerder.java:844)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)\n\tat java.lang.Thread.run(Thread.java:745)\n",
         "id": 0,
         "worker_id": "xx.xx.xx.xxx:8083"
      }
   ]
}

Upvotes: 4

Views: 2210

Answers (1)

eddyP23

Reputation: 6863

I had the same error and then found that one of the Kafka brokers had run out of disk space, so the Kafka cluster wasn't functioning properly. I don't know all the details, but Connect in distributed mode stores connector and task information (configs, offsets, statuses) in Kafka topics, so if Kafka is not responding properly, stale information about the old task could still be around.

Sharing in case that helps anyone else.

EDIT:

I also noticed that this happens to my Kafka nodes from time to time, leaving the whole cluster in an unusable state. Restarting the troubled node fixes the issue.
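For anyone who wants to try recovering without bouncing the workers, here is a minimal sketch that reads the connector status (the same JSON shape as in the question) and asks Connect to restart each FAILED task through the REST API's `POST /connectors/{name}/tasks/{id}/restart` endpoint. The worker address `http://localhost:8083` and the helper names are assumptions for illustration, and I have not confirmed that a task restart clears this particular "Task already exists" error, so treat it as something to try rather than a known fix.

```python
import json
import urllib.request

# Assumption: address of any one Connect worker's REST interface.
CONNECT_URL = "http://localhost:8083"


def failed_task_ids(status):
    """Return the ids of tasks reported as FAILED in a /status response."""
    return [t["id"] for t in status.get("tasks", []) if t.get("state") == "FAILED"]


def restart_path(connector, task_id):
    """Build the REST path that asks Connect to restart a single task."""
    return f"/connectors/{connector}/tasks/{task_id}/restart"


def restart_failed_tasks(connector, base_url=CONNECT_URL):
    """Fetch the connector status and POST a restart for every failed task."""
    with urllib.request.urlopen(f"{base_url}/connectors/{connector}/status") as resp:
        status = json.load(resp)
    restarted = []
    for task_id in failed_task_ids(status):
        req = urllib.request.Request(
            base_url + restart_path(connector, task_id), data=b"", method="POST"
        )
        urllib.request.urlopen(req).close()
        restarted.append(task_id)
    return restarted
```

Calling `restart_failed_tasks("debezium-connector")` against a live worker returns the list of task ids it tried to restart. If the task fails again with the same trace, check the health of the Kafka brokers (disk space in my case) before anything else.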

Upvotes: 1
