cronoik

Reputation: 19310

readStream kafka doesn't get any values

I'm trying to read a Kafka topic via Spark Structured Streaming inside spark-shell, but I don't seem to get any lines from Kafka.

Kafka alone works fine (tested with console-consumer and console-producer):

~/opt/bd/kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic testtopic --from-beginning
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
first
thrid
fifth
seventh
eight
bla
blal2
das ist 
testmaschine
hallo
kleiner
blsllslsd

This is the code I'm running in the spark-shell:

val ds1 = spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:2181")
  .option("subscribe", "testtopic")
  .option("startingOffsets" , "earliest")
  .load()

ds1.writeStream.format("console").start

I expect the messages already stored for this topic in Kafka to be printed in the Spark shell, but nothing is printed. Where is my mistake? I'm using Spark 2.0.2 and Kafka 0.10.2.

Upvotes: 1

Views: 834

Answers (1)

himanshuIIITian

Reputation: 6085

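You need to change the port in kafka.bootstrap.servers. Port 2181 is the ZooKeeper port, but the Kafka source has to connect to a Kafka broker, which listens on port 9092 by default. Like this: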

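val ds1 = spark
  .readStream
  .format("kafka")
  // point at the Kafka broker (default port 9092), not at ZooKeeper (2181)
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "testtopic")
  .option("startingOffsets", "earliest")
  .load()

// print every micro-batch to the console
ds1.writeStream.format("console").start()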

With that change, readStream should receive the messages from the topic.
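One more thing to be aware of: the Kafka source delivers the key and value columns as binary, so the console sink shows raw bytes rather than text unless you cast them. Below is a minimal sketch for spark-shell, assuming the broker listens on the default localhost:9092 and the Kafka source package is already on the classpath. The names messages and query are only illustrative.

```scala
// Run inside spark-shell, where `spark` (the SparkSession) is already defined.
val messages = spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092") // broker address, not ZooKeeper
  .option("subscribe", "testtopic")
  .option("startingOffsets", "earliest")
  .load()
  // key and value arrive as binary; cast them to strings for display
  .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

// Start the query; each micro-batch is printed to the console.
val query = messages.writeStream.format("console").start()
```

In a standalone application you would also call query.awaitTermination() to keep the driver alive. In the shell, query.stop() ends the stream.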

I hope it helps!

Upvotes: 3
