Sudarshan kumar

Reputation: 1585

Not able to start Kafka Connect for Elasticsearch in distributed mode

I am trying to start Kafka Connect in distributed mode, but I cannot get it to start. I have the same problem in standalone mode.

These are my Elasticsearch sink properties:

name=elasticsearch-sink
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
tasks.max=5
topics=fsp-audit
key.ignore=true
connection.url=https://****.amazonaws.com
type.name=kafka-connect
errors.tolerance = all
errors.deadletterqueue.topic.name = fsp-dlq-audit-event

This is my connect-distributed.properties:

bootstrap.servers=***:9092,***:9092,***:9092
group.id=connect-cluster
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=false
value.converter.schemas.enable=false
offset.storage.topic=connect-offsets
offset.storage.replication.factor=1
schema.enabled=false
config.storage.topic=connect-configs
config.storage.replication.factor=1
status.storage.topic=connect-status
status.storage.replication.factor=1
offset.flush.interval.ms=10000
plugin.path=/usr/local/confluent/share/java

I also created three topics up front:

connect-offsets
connect-configs
connect-status

I am running this on EC2 and using MSK as Kafka. I checked connectivity from my EC2 instance to MSK, and I am able to telnet to it.
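A telnet to one broker does not show whether every broker in bootstrap.servers is reachable. A minimal Python sketch of a per-broker TCP check could look like the following. The function names are made up for illustration, and a successful connect only shows that the port is open from this host, not that the Kafka protocol or security settings are correct.

```python
import socket


def parse_bootstrap_servers(value):
    """Split a bootstrap.servers string into (host, port) pairs."""
    pairs = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # rpartition keeps everything before the last ':' as the host.
        host, _, port = entry.rpartition(":")
        pairs.append((host, int(port)))
    return pairs


def check_brokers(bootstrap_servers, timeout=3.0):
    """Try a TCP connection to each broker.

    Returns a dict mapping "host:port" to None on success,
    or to the error message on failure.
    """
    results = {}
    for host, port in parse_bootstrap_servers(bootstrap_servers):
        key = f"{host}:{port}"
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[key] = None
        except OSError as exc:  # refused, timed out, DNS failure, ...
            results[key] = str(exc)
    return results
```

Calling check_brokers with the same string used for bootstrap.servers in connect-distributed.properties and printing the result shows which brokers, if any, cannot be reached from the EC2 instance.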

This is the error I get:

[2020-01-30 08:53:12,126] INFO [AdminClient clientId=adminclient-1] Metadata update failed (org.apache.kafka.clients.admin.internals.AdminMetadataManager:237)
org.apache.kafka.common.errors.TimeoutException: Timed out waiting to send the call.
[2020-01-30 08:53:12,145] INFO [AdminClient clientId=adminclient-1] Metadata update failed (org.apache.kafka.clients.admin.internals.AdminMetadataManager:237)
org.apache.kafka.common.errors.TimeoutException: Timed out waiting to send the call.
[2020-01-30 08:53:12,149] ERROR Stopping due to error (org.apache.kafka.connect.cli.ConnectDistributed:83)
org.apache.kafka.connect.errors.ConnectException: Failed to connect to and describe Kafka cluster. Check worker's broker connection and security properties.
        at org.apache.kafka.connect.util.ConnectUtils.lookupKafkaClusterId(ConnectUtils.java:64)
        at org.apache.kafka.connect.util.ConnectUtils.lookupKafkaClusterId(ConnectUtils.java:45)
        at org.apache.kafka.connect.cli.ConnectDistributed.startConnect(ConnectDistributed.java:94)
        at org.apache.kafka.connect.cli.ConnectDistributed.main(ConnectDistributed.java:77)
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment.
        at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
        at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
        at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
        at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
        at org.apache.kafka.connect.util.ConnectUtils.lookupKafkaClusterId(ConnectUtils.java:58)

Question: If I have to run Kafka Connect in distributed mode, do I have to use more than one EC2 instance/VM?

Upvotes: 0

Views: 1137

Answers (2)

Sudarshan kumar

Reputation: 1585

OK, after looking into this in more detail, I found that the issue was a network ACL (NACL) that was blocking the IP ranges of some subnets.

I first checked the Security group/Network ACL/Route table on the MSK side and found them to be fine. That suggested the issue could be on the EC2 side, so I checked the Security group/Route table of the instance and found them to be properly configured as well.

However, when I checked the Network ACL (acl-***) attached to the EC2 instance, I saw an Inbound rule allowing 0.0.0.0/0 for the ephemeral ports, which should allow the brokers to talk to the EC2 instance. The Outbound rules, though, allowed only the subnet range where b-2 is present; there was no explicit Outbound rule allowing the b-3 (10.**.**.0/24) or b-4 (10.**.**.0/24) subnet ranges.
After I added a new rule, I was able to ping and connect successfully.
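To illustrate the kind of gap described above, here is a rough Python sketch that lists which broker addresses are not covered by any allowed outbound CIDR range. All addresses and names in the usage note are hypothetical, since the real ones are masked. The check is also simplified: it treats the ACL as a plain list of allow ranges and ignores rule numbers, deny rules, protocols and ports, all of which a real network ACL takes into account.

```python
import ipaddress


def uncovered_brokers(broker_ips, allowed_cidrs):
    """Return the brokers whose IP is in none of the allowed CIDR ranges.

    broker_ips:    dict of broker name -> IP address string
    allowed_cidrs: list of CIDR strings from the outbound allow rules
    """
    networks = [ipaddress.ip_network(cidr) for cidr in allowed_cidrs]
    missing = {}
    for name, ip in broker_ips.items():
        address = ipaddress.ip_address(ip)
        if not any(address in network for network in networks):
            missing[name] = ip
    return missing
```

For example, with hypothetical brokers {"b-2": "10.0.1.10", "b-3": "10.0.2.10", "b-4": "10.0.3.10"} and a single allowed range "10.0.1.0/24", the function returns the b-3 and b-4 entries, which mirrors the situation where only the b-2 subnet had an outbound rule.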

Upvotes: 1

OneCricketeer

Reputation: 191728

If I have to run Kafka Connect in distributed mode, do I have to use more than one EC2/VM?

No, you can run a single "pseudo-distributed" instance. The main difference between standalone and distributed mode is how each handles storage of offsets and configurations.
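One practical consequence of that difference: a distributed worker keeps connector configurations in its config storage topic and accepts them over its REST API (POST /connectors) as JSON, rather than reading a connector .properties file at startup. A minimal Python sketch of turning a properties file like the sink configuration in the question into that JSON body might look like this. The function name is made up, and the parser is deliberately simplified: it handles only key=value lines and # comments, not the full Java properties syntax.

```python
import json


def properties_to_connector_payload(text):
    """Build the {"name": ..., "config": {...}} body from properties text."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    # The connector name goes at the top level of the request body.
    name = config.pop("name")
    return {"name": name, "config": config}


def to_json(text):
    """Serialize the payload for use as an HTTP request body."""
    return json.dumps(properties_to_connector_payload(text), indent=2)
```

The resulting JSON can then be sent to the worker's REST endpoint, which listens on port 8083 by default, with any HTTP client.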

Upvotes: 0
