Surender Raja

Reputation: 3599

What happens to a persisted RDD once the Spark job finishes?

What happens to a persisted Spark RDD once the Spark job finishes successfully?

Do we need to explicitly write some code to unpersist it as well?

or

Does unpersisting happen automatically for each persisted RDD?

Upvotes: 1

Views: 972

Answers (2)

Sanket_patil

Reputation: 300

Do we need to explicitly write some code to unpersist it as well?

Yes.

Does unpersisting happen automatically for each persisted RDD?

No, you need to do it explicitly by calling
RDD.unpersist()
or
df1.unpersist()
Always unpersist the DataFrame at the end of its lineage, i.e. after the last action that involves the persisted/cached DataFrame.
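For example, here is a minimal Scala sketch of that pattern (the session setup, the data, and names such as df1 are illustrative, not taken from the question):

    import org.apache.spark.sql.SparkSession

    object UnpersistExample {
      def main(args: Array[String]): Unit = {
        // A local Spark session, purely for illustration.
        val spark = SparkSession.builder()
          .appName("UnpersistExample")
          .master("local[*]")
          .getOrCreate()

        val df1 = spark.range(0, 1000000).toDF("id")

        df1.persist()              // mark df1 for caching

        val evens = df1.filter("id % 2 = 0")
        println(evens.count())     // last action that uses the cached df1

        df1.unpersist()            // explicitly release the cached blocks

        spark.stop()
      }
    }

Calling unpersist() any earlier would force Spark to recompute df1 for subsequent actions, which is why it belongs after the last action in the lineage.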

Upvotes: 2

berrytchaks

Reputation: 849

The official Spark documentation says:

Spark automatically monitors cache usage on each node and drops out old data partitions in a least-recently-used (LRU) fashion. If you would like to manually remove an RDD instead of waiting for it to fall out of the cache, use the RDD.unpersist() method.

Please take a look at http://spark.apache.org/docs/latest/programming-guide.html#removing-data
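To illustrate the manual route the docs describe, here is a small RDD-level sketch (assuming a local SparkContext; getPersistentRDDs is used only to show which RDDs are currently registered as persistent):

    import org.apache.spark.sql.SparkSession

    object ManualUnpersist {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("ManualUnpersist")
          .master("local[*]")
          .getOrCreate()
        val sc = spark.sparkContext

        val rdd = sc.parallelize(1 to 1000).persist()  // mark for caching
        rdd.count()                                    // action materializes the cache

        println(sc.getPersistentRDDs.size)             // 1: our RDD is registered

        rdd.unpersist(blocking = true)                 // remove it now, instead of
                                                       // waiting for LRU eviction
        println(sc.getPersistentRDDs.size)             // 0: the cache entry is gone

        spark.stop()
      }
    }

Note also that when the application ends, the executors are torn down and their cached blocks disappear with them, so an explicit unpersist() only matters while the application is still running.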

Upvotes: 1
