I'm trying to understand how Spark's cache manager behaves. I deployed my test code to Spark Job Server to get a long-running context, and I run the same job several times in a row to see how caching works across executions.
val manager = spark.sharedState.cacheManager
val DF = collectData.retrieveDataFromCass(spark) // loaded from Cassandra successfully with 2k rows
// checked before the view is cached in this run: 0 = no cache entry found, 1 = found
val testCachedData = if (manager.lookupCachedData(DF.queryExecution.logical).isEmpty) 0 else 1
DF.createOrReplaceTempView(tempName1)
spark.sqlContext.cacheTable(tempName1)
DF.count() // action
testCachedData
The job then returns testCachedData.
I expected testCachedData to be 0 on the first job execution and 1 on every later one. Instead, every execution returns 0, as if the lookup comes back empty each time. Yet when I check the Storage tab of the Spark UI, I can see cached data there.
Why can't the cache manager see my cached data within the same Spark application?
This test uses Spark 3.2 and spark-cassandra-connector 3.0.1.
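For comparison, the same check can be written against the public Catalog API, looking the cache up by view name instead of by the logical plan of a freshly loaded DataFrame. This is only a minimal sketch under assumptions: the view name and the `runJob` helper are hypothetical, `spark.range` stands in for the Cassandra load, and a local session stands in for the shared Job Server context.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object CacheCheckSketch {

  // Hypothetical helper: returns 1 if a view called `viewName` was already
  // cached when the job started, 0 otherwise.
  def runJob(spark: SparkSession, df: DataFrame, viewName: String): Int = {
    // isCached fails on an unknown name, so check that the view exists first.
    val alreadyCached =
      spark.catalog.tableExists(viewName) && spark.catalog.isCached(viewName)

    if (!alreadyCached) {
      df.createOrReplaceTempView(viewName)
      spark.catalog.cacheTable(viewName) // lazy: only marks the view for caching
      spark.table(viewName).count()      // action that materializes the cache
    }

    if (alreadyCached) 1 else 0
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session replaces the long-running Job Server context.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("cache-check-sketch")
      .getOrCreate()

    // Assumption: stand-in for collectData.retrieveDataFromCass(spark).
    val df = spark.range(2000).toDF("id")

    // Two consecutive "job executions" against the same session.
    println(runJob(spark, df, "cache_check_view"))
    println(runJob(spark, df, "cache_check_view"))

    spark.stop()
  }
}
```

The lookup by name avoids comparing the plan of a newly built DataFrame with the plan cached by an earlier job, which may not be treated as the same plan. Whether that is what happens with the Cassandra source here is not confirmed.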