Reputation: 19310
I'm trying to read a Kafka topic via Spark Structured Streaming inside spark-shell, but it seems that I don't get any lines from Kafka.
Kafka alone works fine (tested with console-consumer and console-producer):
~/opt/bd/kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic testtopic --from-beginning
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
first
thrid
fifth
seventh
eight
bla
blal2
das ist
testmaschine
hallo
kleiner
blsllslsd
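Since the old ZooKeeper-based consumer is deprecated (as the warning above notes), the same sanity check can be run through the new consumer by pointing at the broker itself. This is a sketch assuming the broker listens on Kafka's default port 9092:

```shell
# New-consumer variant of the check above: connect to the broker (9092),
# not to ZooKeeper (2181). Assumes a broker on localhost:9092.
~/opt/bd/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic testtopic --from-beginning
```

If this prints the stored messages, the broker address and port are confirmed for use in Spark as well.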
This is the code I'm running in the spark-shell:
val ds1 = spark
.readStream
.format("kafka")
.option("kafka.bootstrap.servers", "localhost:2181")
.option("subscribe", "testtopic")
.option("startingOffsets" , "earliest")
.load()
ds1.writeStream.format("console").start
I'm expecting to get the messages that are already stored for this topic in Kafka, and that all of them will be printed in the Spark shell. But nothing is printed. Where is my mistake? I'm using Spark 2.0.2 and Kafka 0.10.2.
Upvotes: 1
Views: 834
Reputation: 6085
You need to change the port for the Kafka bootstrap servers: `kafka.bootstrap.servers` must point at the Kafka broker (default port 9092), not at ZooKeeper (2181). Like this:
val ds1 = spark
.readStream
.format("kafka")
.option("kafka.bootstrap.servers", "localhost:9092")
.option("subscribe", "testtopic")
.option("startingOffsets" , "earliest")
.load()
ds1.writeStream.format("console").start
Then you will be able to get values from readStream.
I hope it helps!
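For completeness, here is a minimal sketch of the full spark-shell session with the corrected port. It assumes the Kafka source package matching your Spark build is on the classpath (the `--packages` coordinate below is an example, adjust the version to your install), and it casts the binary `value` column to a string so the console sink shows readable text:

```scala
// Start the shell with the structured-streaming Kafka source available, e.g.:
//   spark-shell --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.0.2
// (coordinate is an assumption; match it to your Spark/Scala version)

val ds1 = spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092") // broker port, not ZooKeeper's 2181
  .option("subscribe", "testtopic")
  .option("startingOffsets", "earliest")
  .load()

// Kafka rows carry binary key/value columns; cast value to a string to read it.
val query = ds1
  .selectExpr("CAST(value AS STRING)")
  .writeStream
  .format("console")
  .start()

// Keep the query running; otherwise a non-interactive driver would exit
// before any micro-batch is printed.
query.awaitTermination()
```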
Upvotes: 3