Reputation: 1803
code:
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
val ssc = new StreamingContext(sc, Seconds(1)) // batch interval is 1 second
val lines = ssc.socketTextStream(args(0), args(1).toInt, StorageLevel.MEMORY_ONLY_SER)
val wc = lines.window(Seconds(3)) // windowDuration is 3 seconds
In the code above, the batch interval is 1 second and the window duration is 3 seconds. As time elapses, each window will contain the data of 3 batch intervals. Can I get the data of each batch interval separately, so that I can process each one differently? I am looking for something like mapPartitions() or mapPartitionsWithIndex(), which let me process each partition of an RDD.
Does anyone know? Could you tell me? Thank you!
Upvotes: 0
Views: 399
Reputation: 754
The main point of window is to "combine" all RDDs within the window duration into a single RDD so that you can aggregate the data (internally it takes the union of the RDDs that fall within the window width). If you want to work with the RDD of each time interval individually, don't define a window and keep working with lines. Then you can call, for example, lines.foreachRDD(...), and it will run on each batch's RDD individually.
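A minimal sketch of that suggestion, assuming a local master and the same socket source as in the question. The object name PerBatchProcessing and the helper summarize are hypothetical and only serve the illustration; foreachRDD is used in its two-argument form, which also passes the batch time, and mapPartitionsWithIndex shows that per-partition processing is still available inside each batch.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext, Time}

object PerBatchProcessing {
  // Hypothetical helper (not part of Spark): one-line summary of a batch.
  def summarize(batchTimeMs: Long, count: Long): String =
    s"batch $batchTimeMs ms: $count records"

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with 2 threads (one for the receiver, one for processing).
    val conf = new SparkConf().setAppName("PerBatchProcessing").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1)) // 1-second batch interval

    // Same source as in the question: host and port come from the command line.
    val lines = ssc.socketTextStream(args(0), args(1).toInt, StorageLevel.MEMORY_ONLY_SER)

    // No window(): the function is called once per batch interval with that batch's RDD.
    lines.foreachRDD { (rdd: RDD[String], time: Time) =>
      println(summarize(time.milliseconds, rdd.count()))

      // Per-partition processing within the batch: count the records in each partition.
      val perPartition = rdd
        .mapPartitionsWithIndex((idx, records) => Iterator((idx, records.size)))
        .collect()
      perPartition.foreach { case (idx, n) => println(s"  partition $idx: $n records") }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

If the 3-second aggregate is also needed, lines.window(Seconds(3)) can be defined alongside this on the same lines stream; the per-batch logic above does not depend on it.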
Upvotes: 2