Shivansh

Reputation: 3544

How to read records from a Kafka topic from the beginning in Spark Streaming?

I am trying to read records from a Kafka topic using Spark Streaming.

This is my code:

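import java.util.UUID

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object KafkaConsumer {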

  import ApplicationContext._

  def main(args: Array[String]) = {

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> s"${UUID.randomUUID().toString}",
      "auto.offset.reset" -> "earliest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    val topics = Array("pressure")
    val stream = KafkaUtils.createDirectStream[String, String](
      streamingContext,
      PreferConsistent,
      Subscribe[String, String](topics, kafkaParams)
    )
    stream.print()
    stream.map(record => (record.key, record.value)).count().print()
    streamingContext.start()
  }
}

It displays nothing when I run this.

To check whether data is actually present in the pressure topic, I ran the console consumer from the command line, and it does display records:

bin/kafka-console-consumer.sh \
  --bootstrap-server localhost:9092 \
  --topic pressure \
  --from-beginning

Output:

TimeStamp:07/13/16 15:20:45:226769,{'Pressure':'834'}
TimeStamp:07/13/16 15:20:45:266287,{'Pressure':'855'}
TimeStamp:07/13/16 15:20:45:305694,{'Pressure':'837'}

What's wrong?

Upvotes: 10

Views: 8606

Answers (2)

phantomastray

Reputation: 449

After calling streamingContext.start(), you also need to call streamingContext.awaitTermination() so that the driver keeps running while batches are processed.

Upvotes: -1

Yuval Itzchakov

Reputation: 149538

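You're missing streamingContext.awaitTermination(). start() returns immediately, so without a blocking call after it, main returns and the application can shut down before the first batch is processed.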
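
A minimal sketch of the fix, assuming the spark-streaming-kafka-0-10 integration. The question takes streamingContext from its own ApplicationContext object, which isn't shown, so the local master, app name, and 5-second batch interval below are stand-in assumptions. The records are also mapped to plain tuples before printing, since ConsumerRecord is not Serializable and printing it directly can fail.

```scala
import java.util.UUID

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object KafkaConsumer {

  def main(args: Array[String]): Unit = {
    // Assumption: stands in for the streamingContext from ApplicationContext.
    val conf = new SparkConf().setMaster("local[2]").setAppName("KafkaConsumer")
    val streamingContext = new StreamingContext(conf, Seconds(5))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      // A fresh group id has no committed offsets, so "earliest" applies.
      "group.id" -> UUID.randomUUID().toString,
      "auto.offset.reset" -> "earliest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      streamingContext,
      PreferConsistent,
      Subscribe[String, String](Array("pressure"), kafkaParams)
    )

    // Map to (key, value) tuples before any output operation.
    val pairs = stream.map(record => (record.key, record.value))
    pairs.print()
    pairs.count().print()

    streamingContext.start()
    // Block the main thread until the streaming job is stopped or fails.
    streamingContext.awaitTermination()
  }
}
```

With awaitTermination() in place, the job keeps running and prints each batch until it is stopped externally or streamingContext.stop() is called.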

Upvotes: 7
