Reputation: 9771
I have a simple Spark Streaming app that works with Kafka (deployed on my machine, with the basic config that ships with the distribution). When I run my Spark Streaming app on a standalone cluster with the master and a worker on my machine, and therefore on the same machine as Kafka, everything is fine.
However, as soon as I add another node/worker, or if I only start the worker on my second machine (where Kafka is not running), nothing happens anymore. The Streaming tab disappears from the UI, but I don't see any error in the stderr of the driver or the worker.
With no error, I just don't know where to look. The application simply does not work.
If anyone has ever experienced something of the sort, would you please share some suggestions?
I use the proper machine IP address of my local network.
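For reference, the app is essentially a minimal Kafka consumer along these lines (a sketch assuming the spark-streaming-kafka direct stream API; the broker address and topic name are placeholders):

    import kafka.serializer.StringDecoder
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    object SimpleKafkaStream {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("SimpleKafkaStream")
        val ssc = new StreamingContext(conf, Seconds(5))

        // Placeholder: the Kafka machine's LAN address, not localhost.
        val kafkaParams = Map("metadata.broker.list" -> "192.168.1.10:9092")
        val topics = Set("test") // placeholder topic name

        val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
          ssc, kafkaParams, topics)

        // Print a few message values from each batch.
        stream.map(_._2).print()

        ssc.start()
        ssc.awaitTermination()
      }
    }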
Upvotes: 1
Views: 231
Reputation: 2130
A possible cause of this behaviour is a misconfiguration of the Kafka advertised host.
By default, a Kafka broker advertises itself using whatever java.net.InetAddress.getCanonicalHostName() returns. That address might not be reachable from the node running the Spark worker.
To fix the issue, set the advertised address on each Kafka broker to one that is reachable from all the nodes.
The relevant Kafka broker configuration options are:

- advertised.host.name
- advertised.listeners (with fallback on advertised.host.name)

For further details on these configuration parameters, refer to the Kafka documentation for version 0.9 or 0.10.
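For example, assuming the broker machine's LAN address is 192.168.1.10 (a placeholder), the relevant part of the broker's server.properties could look like this:

    # Accept connections on all interfaces.
    listeners=PLAINTEXT://0.0.0.0:9092

    # The address the broker hands out to clients; it must be reachable
    # from every Spark node. 192.168.1.10 is a placeholder for the broker
    # machine's LAN address.
    advertised.listeners=PLAINTEXT://192.168.1.10:9092

After restarting the broker, a quick sanity check is to run the console consumer that ships with Kafka from the Spark worker machine against that address; if it receives messages, the Spark worker should be able to reach the broker as well.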
Upvotes: 1