Clay

Reputation: 2736

clearCache in pyspark without SQLContext

Considering the pySpark documentation for SQLContext says "As of Spark 2.0, this is replaced by SparkSession."

How can I remove all cached tables from the in-memory cache without using SQLContext?

For example, where spark is a SparkSession and sc is a SparkContext:

from pyspark.sql import SQLContext
SQLContext(sc, spark).clearCache()

Upvotes: 3

Views: 5877

Answers (1)

abiratsis

Reputation: 7336

I don't think clearCache is available anywhere other than SQLContext in PySpark. The example below creates an instance via SQLContext.getOrCreate, using an existing SparkContext instance:

SQLContext.getOrCreate(sc).clearCache()

In Scala, though, there is an easier way to achieve the same thing directly via SparkSession:

spark.sharedState.cacheManager.clearCache()

One more option is through the catalog, as Clay mentioned:

spark.catalog.clearCache

And the last one, from Jacek Laskowski's gitbook:

spark.sql("CLEAR CACHE").collect

Reference: https://jaceklaskowski.gitbooks.io/mastering-spark-sql/spark-sql-caching-and-persistence.html

Upvotes: 1
