Mohan
Mohan

Reputation: 13

Unable to read data from spark streaming connecting Kinesis

I have written below code to connect to kinesis from spark streaming but there is no data been received.

val kinesisStream = KinesisUtils.createStream(ssc, appName, streamName, endpointUrl, regionName, InitialPositionInStream.LATEST, batchInterval , StorageLevel.MEMORY_AND_DISK_2)

kinesisStream.print() // nothing getting printed here 

val data = kinesisStream.flatMap(byteArray => new String(byteArray))

data.foreachRDD { rdd =>          
      println("data==" + rdd.collect().length) // no data here too
      rdd.collect()//.saveAsTextFile("file:///home/myHome/Code/sample/somedata.txt");          
    }

I tried to write into S3 and to file system, it writes file name by folder and in side that I see only _SUCCESS file which is of zero byte.

by the way, I can able to write to same kinesis stream and read data from java

what is the issue here.

Upvotes: 0

Views: 416

Answers (1)

Mohan
Mohan

Reputation: 13

I got the solution for this question.

Code could able to pull the data from kinesis. along with the data it also generated lot of zero byte part files. as its the streaming application data part files getting generated for the given interval hence if the data not available for that interval it generated zero bytes files.

Add a check to remove empty part files in DF code, so that DF can only write part files which has the data.

We started getting the data after this changes.

Upvotes: 0

Related Questions