Reputation: 13
I have written below code to connect to kinesis from spark streaming but there is no data been received.
val kinesisStream = KinesisUtils.createStream(ssc, appName, streamName, endpointUrl, regionName, InitialPositionInStream.LATEST, batchInterval , StorageLevel.MEMORY_AND_DISK_2)
kinesisStream.print() // nothing getting printed here
val data = kinesisStream.flatMap(byteArray => new String(byteArray))
data.foreachRDD { rdd =>
println("data==" + rdd.collect().length) // no data here too
rdd.collect()//.saveAsTextFile("file:///home/myHome/Code/sample/somedata.txt");
}
I tried to write into S3 and to file system, it writes file name by folder and in side that I see only _SUCCESS file which is of zero byte.
by the way, I can able to write to same kinesis stream and read data from java
what is the issue here.
Upvotes: 0
Views: 416
Reputation: 13
I got the solution for this question.
Code could able to pull the data from kinesis. along with the data it also generated lot of zero byte part files. as its the streaming application data part files getting generated for the given interval hence if the data not available for that interval it generated zero bytes files.
Add a check to remove empty part files in DF code, so that DF can only write part files which has the data.
We started getting the data after this changes.
Upvotes: 0