dmdevito

Reputation: 51

Are memory-persisted RDDs unpersisted at the end of a Spark Streaming micro-batch?

I use Spark 2.0.2 (in DSE / DataStax Enterprise 5.1) to run a streaming app.

For each micro-batch, my Spark Streaming app makes several calls to RDD.persist(), and RDD.unpersist() is NEVER called (so far, we rely on the LRU behavior of the cache to evict persisted RDDs).

I expected to see the list of persisted RDDs grow quite a bit in the "Storage" tab of the Spark UI.

However, I see only a VERY limited list of persisted RDDs in the "Storage" tab. Let's say 10 persisted RDDs at most, at 1.5 MB each => 15 MB occupied by persisted RDDs, which is very little given that each executor has 1.5 GB of heap.

So I wonder: are memory-persisted RDDs unpersisted at the end of a Spark Streaming micro-batch?

Thanks.

Upvotes: 0

Views: 388

Answers (1)

Sandhya

Reputation: 108

Spark won't unpersist RDDs at the end of the batch. GC will clean up the RAM on an LRU basis.
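
To make "LRU basis" concrete, here is a minimal sketch of least-recently-used eviction in plain Python. It is only a toy model, not Spark's actual memory store: the class LruBlockCache, the block names, and the sizes are all made up for illustration.

```python
from collections import OrderedDict


class LruBlockCache:
    """Toy size-bounded cache that evicts least-recently-used entries.

    Hypothetical illustration only; this is not Spark code.
    """

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.blocks = OrderedDict()  # block id -> size, least recently used first

    def put(self, block_id, size_bytes):
        """Store a block and return the ids evicted to make room for it."""
        if block_id in self.blocks:
            # Replacing an existing block: release its old size first.
            self.used_bytes -= self.blocks.pop(block_id)
        evicted = []
        while self.blocks and self.used_bytes + size_bytes > self.capacity_bytes:
            old_id, old_size = self.blocks.popitem(last=False)  # oldest entry
            self.used_bytes -= old_size
            evicted.append(old_id)
        if size_bytes <= self.capacity_bytes:
            self.blocks[block_id] = size_bytes
            self.used_bytes += size_bytes
        return evicted

    def get(self, block_id):
        """Mark a block as recently used; return True if it is still cached."""
        if block_id not in self.blocks:
            return False
        self.blocks.move_to_end(block_id)
        return True


cache = LruBlockCache(capacity_bytes=3)
cache.put("rdd_1", 1)
cache.put("rdd_2", 1)
cache.put("rdd_3", 1)
cache.get("rdd_1")               # rdd_1 becomes the most recently used
evicted = cache.put("rdd_4", 1)  # cache is full, so the oldest block is dropped
print(evicted)
```

In this model nothing is evicted until the cache is full, and the block that was touched least recently is the first to go.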

Upvotes: 0
