Reputation: 2204
I have written a program where I persist RDDs inside a Spark stream, so that once a new RDD arrives from the stream I can join the previously cached RDDs with the new one. Is there a way to set a time-to-live for these persisted RDDs, so that I can make sure I am not joining RDDs that I already received in the last stream cycle?
It would also be great if someone could explain, or point to an explanation of, how persistence of RDDs works: when I get the persisted RDDs from the Spark context, how can I join them with my current RDDs?
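For concreteness, here is a minimal sketch of the pattern being described, assuming keyed (String, Int) RDDs and a hypothetical socket source; all names, types, and the batch interval are illustrative, not part of the original question:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext._
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CachedJoinExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("CachedJoinExample")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Holds the RDD persisted from the previous batch (None on the first one).
    var previousBatch: Option[RDD[(String, Int)]] = None

    // Hypothetical socket source emitting "key,value" lines.
    val stream = ssc.socketTextStream("localhost", 9999)
      .map { line =>
        val Array(k, v) = line.split(",")
        (k, v.toInt)
      }

    stream.foreachRDD { currentRdd =>
      previousBatch.foreach { prevRdd =>
        // Join the new batch against the RDD cached in the last cycle.
        val joined = currentRdd.join(prevRdd)
        println(s"joined records: ${joined.count()}")
        // Explicitly release the old RDD so it is not joined again later.
        prevRdd.unpersist()
      }
      // Cache the current batch for use in the next cycle.
      previousBatch = Some(currentRdd.cache())
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```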
Upvotes: 1
Views: 1878
Reputation: 37435
In Spark Streaming, the time-to-live of the RDDs generated by the streaming process is controlled by the spark.cleaner.ttl configuration. It defaults to infinite, but for it to have any effect you also need to set spark.streaming.unpersist to false, so that Spark Streaming 'lets live' the RDDs it generates instead of unpersisting them after each batch.
Note that there is no per-RDD TTL.
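A minimal sketch of how these two settings might be applied when building the streaming context; the application name, batch interval, and TTL value are illustrative assumptions:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Illustrative values only: app name, batch interval, and TTL are assumptions.
val conf = new SparkConf()
  .setAppName("StreamingTtlExample")
  // Stop Spark Streaming from unpersisting generated RDDs after each batch.
  .set("spark.streaming.unpersist", "false")
  // Let the cleaner remove persisted RDDs (and other metadata) after 3600 s.
  .set("spark.cleaner.ttl", "3600")

val ssc = new StreamingContext(conf, Seconds(10))
```

With this setup, RDDs from earlier batches stay cached until the cleaner's TTL expires; since the TTL applies globally rather than per RDD, any batch-level bookkeeping (such as unpersisting a specific old RDD) still has to be done in your own code.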
Upvotes: 1