Guo
Guo

Reputation: 1803

Can I get data of each time interval in window in spark streaming?

code:

val ssc = new StreamingContext(sc,Seconds(1)) //interval time is 1 second
val lines = ssc.socketTextStream(args(0),args(1).toInt,StorageLevel.MEMORY_ONLY_SER)
val wc = lines.window(Seconds(3))  //the windowDuration is 3 seconds.

Look at the code above, the interval time is 1 second, and windowDuration is 3 seconds, with time elapsing, there will be data of 3 time interval in the window, can I get data of each time interval for different processing? Like the mapPartitions() or mapPartitionsWithIndex(), I can process each partition in RDD.

Does anyone know? Could you tell me? Thank you!

Upvotes: 0

Views: 399

Answers (1)

Alex Larikov
Alex Larikov

Reputation: 754

The main point of window is to "combine" all RDDs within window duration into single RDD so you can aggregate data (internally it does union of RDDs within the window width). If you want to work with each RDD in each time interval individually don't define window and keep working with lines. Then you can run for example lines.foreachRDD(...) and it will run on each RDD individually

Upvotes: 2

Related Questions