Reputation: 956
Does the DStream returned by the updateStateByKey function contain only one RDD? If not, under what circumstances will the DStream contain more than one RDD?
Upvotes: 2
Views: 844
Reputation: 956
That doesn't seem to match what you said. The code below, which is part of my application, prints only once per batch, so I think every stateful DStream has just one RDD:
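@transient val statefulDStream = lines.transform(...).map(x => (x, 1)).updateStateByKey(updateFuncs)
// foreachRDD runs its body once per batch, on that batch's single state RDD
statefulDStream.foreachRDD { rdd =>
  // first() throws on an empty RDD, so check before printing
  if (!rdd.isEmpty()) println(rdd.first())
}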
Upvotes: 0
Reputation: 861
It contains one RDD per batch. The DStream returned by updateStateByKey is a "state" DStream, but you can still treat it as a normal DStream. For every batch, the RDD represents the latest state (key-value pairs) according to the update function that you pass to updateStateByKey. So foreachRDD sees exactly one RDD each batch, while over the lifetime of the stream the DStream is a sequence of such RDDs.
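To make the "one state RDD per batch" idea concrete, here is a rough plain-Scala model of it, with no Spark dependency. It is only a sketch of the semantics, not how Spark implements it: `StateModel`, `run`, and the running-count `updateFunc` are names made up for illustration, and each `Map` stands in for the state RDD of one batch.

```scala
// Plain-Scala model of updateStateByKey semantics (hypothetical names, no Spark).
object StateModel {
  // Same shape as an update function passed to updateStateByKey:
  // new values for a key in this batch, plus the previous state for that key.
  def updateFunc(newValues: Seq[Int], state: Option[Int]): Option[Int] =
    Some(state.getOrElse(0) + newValues.sum)

  // Returns one state snapshot per input batch (the stand-in for "one RDD per batch").
  def run(batches: Seq[Seq[(String, Int)]]): Seq[Map[String, Int]] =
    batches.scanLeft(Map.empty[String, Int]) { (prev, batch) =>
      // Group this batch's values by key.
      val grouped = batch.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2) }
      // Every key already in the state is updated too, even with no new values.
      val keys = prev.keySet ++ grouped.keySet
      keys.flatMap { k =>
        updateFunc(grouped.getOrElse(k, Seq.empty), prev.get(k)).map(k -> _)
      }.toMap
    }.tail // drop the empty initial state so there is exactly one snapshot per batch
}
```

Calling `StateModel.run` with three batches gives back three snapshots, one per batch, each holding the full latest state rather than just the keys seen in that batch. That matches the code above printing once per batch.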
Upvotes: 2