zwb

Reputation: 956

Does the DStream returned by updateStateByKey contain only one RDD?

Does the DStream returned by the updateStateByKey function contain only one RDD? If not, under what circumstances will the DStream contain more than one RDD?

Upvotes: 2

Views: 844

Answers (3)

zwb

Reputation: 956

Yes, the DStream returned by updateStateByKey has only one RDD.

Upvotes: 0

zwb

Reputation: 956

That does not seem to match what you said. The code below, which is part of my application, prints only once per batch, so I think every stateful DStream has only one RDD.

@transient val statefulDStream = lines.transform(...).map(x => (x, 1)).updateStateByKey(updateFuncs)

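statefulDStream.foreachRDD { rdd =>
  // Called once per batch with that batch's RDD; first() throws on an empty RDD, so guard it.
  if (!rdd.isEmpty()) println(rdd.first())
}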

Upvotes: 0

Wesley Miao

Reputation: 861

It contains one RDD per batch. The DStream returned by updateStateByKey is a "state" DStream, but you can still treat it as a normal DStream. For every batch, its RDD represents the latest state (key-value pairs) produced by the update function that you pass to updateStateByKey.
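To make the "one RDD per batch" point concrete, here is a minimal sketch of a stateful word count. It is not the asker's application: the socket source on localhost:9999, the 5-second batch interval, the checkpoint path, and the names StatefulWordCount and updateCount are all assumptions chosen for illustration, and it assumes Spark 1.3 or later on the classpath.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical example object; the names here are not from the question.
object StatefulWordCount {
  // Update function: add this batch's counts for a key to its running total.
  def updateCount(newValues: Seq[Int], running: Option[Int]): Option[Int] =
    Some(running.getOrElse(0) + newValues.sum)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("StatefulWordCount")
    val ssc = new StreamingContext(conf, Seconds(5)) // assumed batch interval
    // updateStateByKey requires a checkpoint directory (assumed path).
    ssc.checkpoint("/tmp/stateful-wordcount-checkpoint")

    // Assumed input source: text lines from a local socket.
    val lines = ssc.socketTextStream("localhost", 9999)
    val state = lines
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .updateStateByKey[Int](updateCount _)

    // This closure runs once per batch interval, with the single RDD
    // that holds the full key-value state as of that batch.
    state.foreachRDD { (rdd, time) =>
      println(s"batch $time: ${rdd.count()} keys in state")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

If you feed it text with something like `nc -lk 9999`, you should see one line printed per batch interval, which matches the behaviour observed in the question: the DStream is a sequence of RDDs over time, but each batch contributes exactly one.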

Upvotes: 2
