Dewfy

Reputation: 23644

Connect to Kafka installed on HDInsight (Azure)

I need to connect from an external Java application to a Kafka cluster that runs as part of HDInsight on Azure. The cluster has 3 broker instances, 3 ZooKeepers, and one ZooKeeper client.

My question is how to specify the broker connection string. In the admin panel I can see 3 brokers, such as xxx-1.yyy.zzz.internal.cloudapp.net and xxx-2.yyy.zzz.internal.cloudapp.net, but these addresses aren't reachable from outside. If I try them, I get this exception:

KafkaException: Failed to construct kafka consumer

...

ConfigException: Invalid url in bootstrap.servers: xxx-1.yyy.zzz.internal.cloudapp.net

Upvotes: 5

Views: 2445

Answers (4)

Qihong

Reputation: 21

If you connect from an on-premises network, you need to set up a site-to-site VPN gateway, see Connect to Apache Kafka from an on-premises network for details.

If you connect from an individual machine, you need to set up a point-to-site VPN gateway, see Connect to Apache Kafka with a VPN client for details.
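Once the VPN is in place, the client still needs a well-formed `bootstrap.servers` value. The message quoted in the question shows a host name with no port, and as far as I know the Kafka client rejects entries that are not in `host:port` form, so it is worth building the string explicitly. Below is a minimal sketch using only the JDK. The class name `BootstrapServers`, the port 9092, the group id, and the 10.0.0.x addresses are assumptions for illustration; substitute the addresses and listener port your cluster actually exposes over the VPN.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;

final class BootstrapServers {
    // Assumption: brokers listen on 9092; verify the listener port on your cluster.
    static final int ASSUMED_BROKER_PORT = 9092;

    // Joins hosts into the comma-separated host:port list used for bootstrap.servers.
    static String build(List<String> hosts, int port) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one broker host is required");
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return hosts.stream()
                .map(String::trim)
                .map(host -> host + ":" + port)
                .collect(Collectors.joining(","));
    }

    // Builds consumer settings; the result can be handed to a Kafka consumer constructor.
    static Properties consumerProperties(List<String> hosts, int port, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", build(hosts, port));
        props.setProperty("group.id", groupId);
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder broker addresses, not taken from the question.
        List<String> hosts = Arrays.asList("10.0.0.4", "10.0.0.5", "10.0.0.6");
        Properties props = consumerProperties(hosts, ASSUMED_BROKER_PORT, "example-group");
        System.out.println(props.getProperty("bootstrap.servers"));
        // prints 10.0.0.4:9092,10.0.0.5:9092,10.0.0.6:9092
    }
}
```

Whether you list IP addresses or host names depends on whether name resolution for the cluster works through your VPN; if it does not, IP addresses are the safer choice.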

Upvotes: 1

[root@domain bin]# ./kafka-console-producer.sh --broker-list host.domain.net:6667 --topic topic1 --security-protocol SASL_PLAINTEXT 
Test
[2017-04-11 09:07:43,821] WARN Error while fetching metadata with correlation id 0 : {topic1=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient)
[2017-04-11 09:07:44,022] WARN Error while fetching metadata with correlation id 1 : {topic1=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient)
[2017-04-11 09:07:44,122] WARN Error while fetching metadata with correlation id 2 : {topic1=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient)
[2017-04-11 09:07:44,223] WARN Error while fetching metadata with correlation id 3 : {topic1=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient)
[2017-04-11 09:07:44,323] WARN Error while fetching metadata with correlation id 4 : {topic1=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient)
[2017-04-11 09:07:44,423] WARN Error while fetching metadata with correlation id 5 : {topic1=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient)
[2017-04-11 09:07:44,523] WARN Error while fetching metadata with correlation id 6 : {topic1=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient)
[2017-04-11 09:07:44,624] WARN Error while fetching metadata with correlation id 7 : {topic1=UNKNOWN_TOPIC_OR_PARTITION} (org.apache.kafka.clients.NetworkClient)

This error is thrown on a 3-broker Kafka cluster. Before running a Kafka producer or consumer, create the topic with a replication factor equal to the number of brokers.

Command

bin/kafka-topics.sh --create --topic test --zookeeper node1:2181,node2:2181,node3:2181 --partitions 1 --replication-factor 3

This command is for a 3-broker HDInsight cluster.

Upvotes: -3

Sigrist

Reputation: 1471

Check your Kafka configuration and set the property auto.create.topics.enable to true. Restart Kafka and try again.

Upvotes: -1

Jonas

Reputation: 139

The issue is that you're trying to resolve internal Azure hostnames, which are not resolvable from the internet. Also be aware that it is not possible to connect directly to a Kafka instance from the internet.

You need another layer/gateway in between, as you can see here in this diagram.

As far as I know, you can either use a direct connection or put another layer in between, such as Azure IoT Hub and a connector.

Which one you choose depends on your use case, but be aware that these services are not free, and depending on your data volume they might add a significant item to your bill.
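To confirm the name-resolution point from the start of this answer on your own machine, a quick check along these lines may help. It is only a sketch using the JDK's `InetAddress`; the class name `ResolveCheck` is made up for illustration, and a successful lookup says nothing about whether the broker port is reachable.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class ResolveCheck {
    // Returns true if the local resolver can map the host name to an address.
    static boolean canResolve(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the broker host names from the admin panel as arguments.
        for (String host : args) {
            String status = canResolve(host) ? "resolvable" : "NOT resolvable";
            System.out.println(host + " -> " + status);
        }
    }
}
```

Running it once from outside Azure and once over the VPN shows whether the internal names become resolvable after the gateway is set up.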

Upvotes: 4
