Reputation: 385
Is there a way to vary the batch interval duration in Spark Streaming (i.e., depending on some tests in the code) so that it does not stay the same for the whole computation?
When coding in Python, for instance, the batch interval duration is the second argument in `StreamingContext(sparkContext: SparkContext, batchDuration: Duration)`; e.g., `ssc = StreamingContext(sc, 1)`, and as far as I know it is not going to change during the execution.
Is it possible in Spark to make it mutable during the computation, i.e., according to the output of some tests?
A dumb example of a possible use: in the example script `network_wordcount.py`, increase the batch interval duration if a particular string (or line) appeared in the previous batch.
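For what it's worth, since the interval is fixed for the lifetime of a `StreamingContext`, the only workaround I can think of is stopping the current context and creating a new one with a different interval. A minimal sketch of the decision logic (the function name `next_batch_interval` and the trigger string are made up for illustration; the Spark restart loop is only outlined in comments, since it needs a running cluster):

```python
def next_batch_interval(batch_lines, current_interval, trigger="SLOW DOWN", factor=2):
    """Return a longer interval if the trigger string appeared in the last batch,
    otherwise keep the current interval unchanged."""
    if any(trigger in line for line in batch_lines):
        return current_interval * factor
    return current_interval

# With PySpark (not run here), the stop-and-restart loop would look roughly like:
#
#   from pyspark.streaming import StreamingContext
#
#   interval = 1
#   while True:
#       ssc = StreamingContext(sc, interval)
#       # ...build the DStream graph, record the lines seen in each batch...
#       ssc.start()
#       ssc.awaitTerminationOrTimeout(some_timeout)
#       ssc.stop(stopSparkContext=False, stopGraceFully=True)
#       interval = next_batch_interval(seen_lines, interval)
```

Note this is not a true dynamic batch interval: every restart tears down the DStream graph, so any in-memory state is lost unless you checkpoint it.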
I hope I have been clear enough!
Thanks to anybody who will try to help! Have a nice day! :-)
Upvotes: 3
Views: 1336
Reputation: 71
Actually, this paper by TD (Tathagata Das) may answer your question. He tried using a dynamic batch interval and got better results.
Upvotes: 0
Reputation: 1659
I don't think you can change the batch interval in Spark Streaming; at least, that is what Tathagata Das said in one of his talks.
Upvotes: 0