Reputation: 51
For Spark's RDD objects this is quite trivial, as an RDD exposes a getStorageLevel method, but a DataFrame does not seem to expose anything similar. Anyone?
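For reference, this is the RDD-side check being referred to; a minimal sketch, assuming a throwaway RDD and MEMORY_ONLY as a placeholder storage level:

import org.apache.spark.SparkContext
import org.apache.spark.storage.StorageLevel

val sc = SparkContext.getOrCreate()
val rdd = sc.parallelize(Seq(1, 2, 3))
println(rdd.getStorageLevel)           // StorageLevel.NONE before persisting
rdd.persist(StorageLevel.MEMORY_ONLY)
println(rdd.getStorageLevel)           // memory, deserialized, 1 replica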
Upvotes: 3
Views: 5495
Reputation: 4623
You can check whether a DataFrame is cached or not using the Catalog (org.apache.spark.sql.catalog.Catalog), which was introduced in Spark 2.0.
Code example:
import org.apache.spark.sql.SparkSession

val sparkSession = SparkSession.builder
  .master("local")
  .appName("example")
  .getOrCreate()

val df = sparkSession.read.csv("src/main/resources/sales.csv")
df.createTempView("sales")

// interact with the catalog
val catalog = sparkSession.catalog

// print the databases
catalog.listDatabases().select("name").show()

// print all the tables
catalog.listTables().select("name").show()

// check whether the "sales" view is cached
println(catalog.isCached("sales")) // false

df.cache()
println(catalog.isCached("sales")) // true
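If everything runs as expected, the first isCached call prints false and the second prints true: the temp view sales is backed by the same logical plan as df, so caching the DataFrame also marks the view as cached.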
Using the above code you can list all the tables and check whether a table is cached or not.
You can find the working code example here
Upvotes: 5