Reputation: 1341
What I am trying to achieve is basically to print "hello world"
each time I receive a stream of data.
I know that on each stream I can call the function foreachRDD
but that does not help me because:
Basically, each time the program tries to fetch data (say every 30 seconds, as set by the Spark streaming context's batch interval), I would like to print hello.
Is there a way of doing this? Is there something like an onListen event for Spark Streaming?
Upvotes: 2
Views: 3998
Reputation: 149518
Each batch interval (in your case, 30 seconds) the DStream will contain one and only one RDD, which internally is divided into several partitions. You can check that it's not empty and only then print hello world:
// Create DStream from source
dstream.foreachRDD { rdd => if (!rdd.isEmpty) println("hello world") }
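For context, here is a minimal sketch of how that line might sit in a complete program. The socket source on `localhost:9999`, the `local[2]` master, and the object name `HelloPerBatch` are assumptions made for illustration; substitute your own source. The function passed to `foreachRDD` runs on the driver once per batch interval, so removing the `isEmpty` check would print on every batch, including empty ones.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object HelloPerBatch {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode with two threads (one for the receiver, one for processing).
    val conf = new SparkConf().setAppName("HelloPerBatch").setMaster("local[2]")

    // 30-second batch interval, as in the question.
    val ssc = new StreamingContext(conf, Seconds(30))

    // Assumption: a text socket source; any input DStream works the same way.
    val dstream = ssc.socketTextStream("localhost", 9999)

    dstream.foreachRDD { rdd =>
      // This block executes on the driver once per batch interval.
      // Drop the isEmpty check to print on every batch, even empty ones.
      if (!rdd.isEmpty()) println("hello world")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

To try it locally under these assumptions, start a text server with `nc -lk 9999` before launching the job and type a few lines into it.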
Upvotes: 4