Spark SQL unable to find the database and table it wrote to earlier

Question

I have a Spark component that creates a SQL table from transformed data. It successfully saves the data into spark-warehouse under the .db folder. The component also tries to read from the existing table so that it does not blindly overwrite it. On that read, Spark cannot find any database other than default.

sparkVersion: 2.4

val spark: SparkSession = SparkSession.builder()
  .master("local[*]")
  .config("spark.debug.maxToStringFields", 100)
  .config("spark.sql.warehouse.dir", "D:/Demo/spark-warehouse/")
  .getOrCreate()

def saveInitialTable(df: DataFrame): Unit = {
  df.createOrReplaceTempView(Constants.tempTable)
  spark.sql("create database " + databaseName)
  spark.sql(
    s"""create table if not exists $databaseName.$tableName
      |using parquet partitioned by (${Constants.partitions.mkString(",")})
      |as select * from ${Constants.tempTable}""".stripMargin)
}

def deduplication(dataFrame: DataFrame): DataFrame = {
  if (Try(spark.sql("show tables from " + databaseName)).isFailure) {
    // something
  }
}

The saveInitialTable function completes successfully. On the second run, however, the deduplication function is still not able to pick up the database and table created by the first run.

I am not using Hive explicitly anywhere, just the Spark DataFrame and SQL APIs.

When I run the REPL in the same directory as spark-warehouse, it also shows only the default database.

scala> spark.sql("show databases").show()
2021-10-07 18:45:57 WARN  ObjectStore:6666 - Version information not found in metastore. 
hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
2021-10-07 18:45:57 WARN  ObjectStore:568 - Failed to get database default, returning 
NoSuchObjectException
+------------+
|databaseName|
+------------+
|     default|
+------------+
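
One thing that may be worth checking is which catalog each session uses. A SparkSession built without enableHiveSupport() uses an in-memory catalog, which is not persisted between runs, so files under spark-warehouse can outlive the catalog entries that pointed to them. The following is only a minimal sketch of such a check, not a confirmed fix: the CatalogCheck object name is hypothetical, and it assumes the spark-hive module is on the classpath. With the default embedded Derby metastore, the metastore_db folder is created in the working directory, so both runs would need to start from the same directory.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical diagnostic object, not part of the original component
object CatalogCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .config("spark.sql.warehouse.dir", "D:/Demo/spark-warehouse/")
      // Assumption: spark-hive is available. Without this call the session
      // uses an in-memory catalog that is discarded when the JVM exits.
      .enableHiveSupport()
      .getOrCreate()

    // "hive" when Hive support is active, "in-memory" otherwise
    println(spark.conf.get("spark.sql.catalogImplementation"))

    // Databases known to the catalog of this session
    spark.catalog.listDatabases().show(false)

    spark.stop()
  }
}
```

Running the same two checks in the application and in the REPL would show whether they share a catalog at all, or only the warehouse directory.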
