Reputation: 1547
I have an instance of org.apache.spark.rdd.RDD[MyClass]. How can I programmatically check whether the instance is persisted in memory?
Upvotes: 9
Views: 3387
Reputation: 13835
You can call rdd.getStorageLevel.useMemory to check whether the RDD is marked for in-memory storage, as follows:
scala> myrdd.getStorageLevel.useMemory
res3: Boolean = false
scala> myrdd.cache()
res4: myrdd.type = MapPartitionsRDD[2] at filter at <console>:29
scala> myrdd.getStorageLevel.useMemory
res5: Boolean = true
Upvotes: 2
Reputation: 67065
You want RDD.getStorageLevel. It returns StorageLevel.NONE if the RDD has not been marked for persistence. However, that only tells you whether the RDD is marked for caching, not whether anything has actually been cached yet. If you want the actual status, you can use the developer API sc.getRDDStorageInfo, or sc.getPersistentRDDs.
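To make the difference between "marked for caching" and "actually cached" concrete, here is a minimal sketch. The object name PersistenceCheck, the helper names isMarked and cachedPartitions, and the local[*] master are assumptions made for illustration only; the Spark calls used are getStorageLevel, getPersistentRDDs and getRDDStorageInfo.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

// Hypothetical helper object; the names are illustrative, not part of Spark.
object PersistenceCheck {

  // True if the RDD has been marked for persistence (cache()/persist() was called).
  def isMarked(rdd: RDD[_]): Boolean =
    rdd.getStorageLevel != StorageLevel.NONE

  // Number of partitions Spark currently reports as cached for this RDD.
  // getRDDStorageInfo only lists RDDs that have cached data, hence the fallback to 0.
  def cachedPartitions(sc: SparkContext, rdd: RDD[_]): Int =
    sc.getRDDStorageInfo
      .find(_.id == rdd.id)
      .map(_.numCachedPartitions)
      .getOrElse(0)

  def main(args: Array[String]): Unit = {
    // Assumption: a local master, just so the sketch runs standalone.
    val conf = new SparkConf().setAppName("persistence-check").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(1 to 100).filter(_ % 2 == 0)

      println(s"marked before cache(): ${isMarked(rdd)}")

      rdd.cache() // only marks the RDD; nothing is computed or stored yet
      println(s"marked after cache(): ${isMarked(rdd)}")
      println(s"useMemory: ${rdd.getStorageLevel.useMemory}")
      println(s"registered as persistent: ${sc.getPersistentRDDs.contains(rdd.id)}")
      println(s"cached partitions before an action: ${cachedPartitions(sc, rdd)}")

      rdd.count() // an action materializes the partitions and stores them
      println(s"cached partitions after an action: ${cachedPartitions(sc, rdd)}")
    } finally {
      sc.stop()
    }
  }
}
```

The point of the sketch is that getStorageLevel changes as soon as cache() is called, while the storage info should only show cached partitions after an action has run and the blocks were actually stored.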
Upvotes: 12