Why does my Spark Streaming application not print the number of records from Kafka (using count operator)?

Question

I am working on a Spark application that needs to read data from Kafka. I created a Kafka topic and a producer is posting messages to it. I verified with the console consumer that the messages were posted successfully.

I wrote a short Spark application to read data from Kafka, but it is not getting any data. This is the code I used:

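import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.DStream
import org.apache.spark.streaming.kafka.KafkaUtils

def main(args: Array[String]): Unit = {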
   val Array(zkQuorum, group, topics, numThreads) = args
   val sparkConf = new SparkConf().setAppName("SparkConsumer").setMaster("local[2]")
   val ssc = new StreamingContext(sparkConf, Seconds(2))

   val topicMap = topics.split(",").map((_, numThreads.toInt)).toMap
   val lines = KafkaUtils.createStream(ssc, zkQuorum, group, topicMap).map(_._2)

   process(lines) // prints the number of records in Kafka topic

   ssc.start()
   ssc.awaitTermination()
 }

 private def process(lines: DStream[String]): Unit = {
   val z = lines.count()
   println("count of lines is " + z)
   // edit
   lines.foreachRDD(rdd => rdd.map(println)) // <-- Why does this **not** print?
 }

Any suggestions on how to resolve this issue?

******EDIT****

I have used

lines.foreachRDD(rdd => rdd.map(println))

in the actual code as well, but that does not work either. I also set the retention period as suggested in the post "Kafka spark directStream can not get data", but the problem still exists.


Answers (1)
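
Two things in the posted process method would explain the missing output on their own, independent of whether Kafka is delivering data. First, lines.count() returns a DStream[Long] (one count per batch), not a number, and the println around it runs once while the job is being set up, so it only prints the stream object. Second, rdd.map(println) is a lazy transformation and no action ever triggers it. A minimal sketch of a corrected process follows, assuming the same DStream API as in the question; the object name CountSketch and the limit of 10 records are illustrative choices, not part of the original code.

```scala
import org.apache.spark.streaming.dstream.DStream

// Hypothetical wrapper object, used here only to make the sketch self-contained.
object CountSketch {
  // Registers output operations on the stream; nothing runs until ssc.start().
  def process(lines: DStream[String]): Unit = {
    // count() yields a DStream[Long] with one element per batch.
    // print() is an output operation, so the count is shown for every batch.
    lines.count().print()

    // Alternative: rdd.count() is an action, so it runs a job on each batch
    // and the result is printed on the driver.
    lines.foreachRDD { rdd =>
      println("count of lines is " + rdd.count())
    }

    // To print the records themselves, use an action instead of map.
    // take(10) (arbitrary limit) bounds how much is brought to the driver.
    lines.foreachRDD { rdd =>
      rdd.take(10).foreach(println)
    }
  }
}
```

Calling CountSketch.process(lines) in place of process(lines), before ssc.start(), should print a count every 2 seconds. If the counts are always 0, the problem is on the Kafka side (ZooKeeper quorum, topic name, consumer group offsets) rather than in this method.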
