Rand Abu Salim


Spark cache manager behavior

I'm trying to understand how the Spark cache manager behaves. I deployed my test code to Spark Job Server so that it runs in a long-running context, and I execute the same job several times in a row to see how caching behaves across runs.

val manager = spark.sharedState.cacheManager
val DF = collectData.retrieveDataFromCass(spark) // loaded from Cassandra successfully, 2k rows
// 1 if the cache manager already holds data for this plan, 0 otherwise
val testCachedData = if (manager.lookupCachedData(DF.queryExecution.logical).isEmpty) 0 else 1
DF.createOrReplaceTempView(tempName1)
spark.sqlContext.cacheTable(tempName1)
DF.count() // action
testCachedData

The job then returns testCachedData.

I expected testCachedData to be 0 on the first job execution and 1 on every later one. Instead, every execution returns 0, as if the lookup came back empty each time. Yet when I check the Storage tab in the Spark UI, I can see cached data there.
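For comparison, here is a minimal sketch of the same check done by view name through the public `spark.catalog` API instead of a lookup on the logical plan. It does not claim to explain why the lookup above comes back empty. The helper name `cacheAndReport` is hypothetical, and the sketch assumes that every job run reuses the same SparkSession, since temp views are scoped to a session.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper, not part of the original job.
// Returns 1 if the named temp view was already cached before this run, else 0.
// Assumption: the same SparkSession is shared across job executions.
def cacheAndReport(spark: SparkSession, df: DataFrame, viewName: String): Int = {
  // isCached fails if the view does not exist, so guard with tableExists first.
  val wasCached =
    spark.catalog.tableExists(viewName) && spark.catalog.isCached(viewName)

  if (!wasCached) {
    df.createOrReplaceTempView(viewName)
    spark.catalog.cacheTable(viewName)   // lazy: only marks the view for caching
    spark.table(viewName).count()        // action that materializes the cache
  }

  if (wasCached) 1 else 0
}
```

If this also returns 0 on every run, it would suggest that each job sees a different session or catalog state, rather than a plan-matching problem in the lookup.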

Why can't the cache manager see my cached data within the same Spark application?

This test uses Spark 3.2 and spark-cassandra-connector 3.0.1.
