Pramod

Reputation: 113

Unable to read data from Kafka topics using Spark Streaming

I am trying to read data from a Kafka topic using Spark Streaming. I am able to produce messages to the Kafka topic, but while reading the data from the topic using Spark Streaming I get the error message below:

ERROR ReceiverTracker: Deregistered receiver for stream 0: Error starting receiver 0 - java.lang.ClassCastException: java.util.HashMap cannot be cast to java.lang.String

Below is the code:

from pyspark import SparkConf, SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils
import pprint

conf = SparkConf().setAppName("streaming test").setMaster("yarn-client")
sc = SparkContext(conf=conf)
ssc = StreamingContext(sc, 10)

topic = "newone"
broker = {"metadata.broker.list": "URL"}
direct = KafkaUtils.createStream(ssc, broker, "test", {topic: 1})

direct.pprint()
ssc.start()
ssc.awaitTermination()

Upvotes: 1

Views: 176

Answers (1)

Ram Ghadiyaram

Reputation: 29237

Output Operations on DStreams

print() - Prints the first ten elements of every batch of data in a DStream on the driver node running the streaming application. This is useful for development and debugging. In the Python API this is called pprint().

A java.util.HashMap is arriving in the message, i.e. the stream is an InputDStream[ConsumerRecord[K, V]], and you are trying to print it directly, hence the java.lang.ClassCastException.

You have to parse the message and then print it, like this:

 direct.transform(...).map(lambda ...)
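
As a rough sketch of that parsing step: in the Python Kafka API each element of the stream arrives as a (key, value) tuple, so the function passed to map can pull out the value before printing. The helper name extract_value is hypothetical, and the sketch assumes the values are already decoded strings (the default decoders are UTF-8).

```python
def extract_value(record):
    """Return the message value from a (key, value) pair.

    Hypothetical helper: assumes each stream element is a 2-tuple.
    """
    key, value = record
    return value


# Plain-Python check with made-up records; no Spark needed.
sample = [(None, "hello"), ("k1", "world")]
values = [extract_value(r) for r in sample]
```

With the stream from the question, this would be used as direct.map(extract_value).pprint().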

Examples are here: tests.py
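
It may also be worth checking the argument types, since the exception is raised while the receiver starts rather than while printing. In the Python API, KafkaUtils.createStream takes a ZooKeeper quorum string as its second argument, whereas a dict containing metadata.broker.list is what KafkaUtils.createDirectStream takes as kafkaParams. A minimal sketch, assuming the pre-3.0 pyspark.streaming.kafka module from the question; the helper names and the broker address are placeholders, not from the original post.

```python
def direct_stream_args(topic, brokers):
    """Build (topics, kafkaParams) in the shapes createDirectStream expects.

    Hypothetical helper: topics is a list of topic names, kafkaParams a
    flat dict mapping string keys to string values.
    """
    return [topic], {"metadata.broker.list": brokers}


def make_direct_stream(ssc, topic, brokers):
    # Imported lazily: this module only exists in Spark 1.x/2.x.
    from pyspark.streaming.kafka import KafkaUtils

    topics, kafka_params = direct_stream_args(topic, brokers)
    return KafkaUtils.createDirectStream(ssc, topics, kafka_params)


# "broker-host:9092" is a placeholder address, not a real broker.
topics, kafka_params = direct_stream_args("newone", "broker-host:9092")
```

If the receiver-based createStream is preferred instead, the second argument would be a plain string such as a ZooKeeper host:port, with any extra settings passed through its kafkaParams keyword.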

Upvotes: 0
