Reputation: 1
I read table statistics from one metastore by setting hive.metastore.uris when starting the Spark application. However, I need to write the data to a table in a different Hive.
I've tried clearing the active and default sessions and building another session with the new metastore URI, but Spark keeps trying to write to the table in the first Hive.
import org.apache.spark.sql.{SaveMode, SparkSession}

// First session, pointed at the metastore the statistics are read from
val spark = SparkSession.builder()
  .appName(appName)
  .enableHiveSupport()
  .config("hive.metastore.uris", FIRST_METASTORE)
  .config("spark.sql.hive.convertMetastoreOrc", "false")
  .config("spark.sql.caseSensitive", "false")
  .config("hive.exec.dynamic.partition", "true")
  .config("hive.exec.dynamic.partition.mode", "nonstrict")
  .getOrCreate()

val df = spark.sql("DESCRIBE FORMATTED source_table")

// Second session, pointed at the metastore the data should be written to
SparkSession.clearActiveSession()
SparkSession.clearDefaultSession()

val spark2 = SparkSession.builder()
  .appName(appName)
  .enableHiveSupport()
  .config("hive.metastore.uris", NEW_MESTASTORE)
  .config("spark.sql.hive.convertMetastoreOrc", "false")
  .config("spark.sql.caseSensitive", "false")
  .config("hive.exec.dynamic.partition", "true")
  .config("hive.exec.dynamic.partition.mode", "nonstrict")
  .getOrCreate()

SparkSession.setDefaultSession(spark2)
SparkSession.setActiveSession(spark2)

df.write
  .format("parquet")
  .mode(SaveMode.Overwrite)
  .insertInto("other_cluster_table")
As I said, I would expect the dataframe to be written to the table location of the new metastore and catalog, but it isn't. This happens because DataFrameWriter resolves the target table through df.sparkSession.sessionState.sqlParser.parseTableIdentifier(tableName), i.e. through the session that created the DataFrame. How can I deal with this?
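For illustration (my understanding of the behaviour, not verified in this exact setup): the DataFrame keeps a reference to the session that created it, so even after swapping the active and default sessions, the write is still resolved against the first metastore:

// df is still bound to the first session, so insertInto("other_cluster_table")
// is resolved against FIRST_METASTORE's catalog, not the new one
println(df.sparkSession eq spark)   // true
println(df.sparkSession eq spark2)  // false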
Upvotes: 0
Views: 719
Reputation: 1
After reading about multiple SparkContexts, I solved this by writing the parquet files directly to namenode/directory/to/partition/ and then adding the partition to the table using beeline.
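A minimal sketch of that workaround (the partition column, value and path below are placeholders, not from my actual job):

// Write the parquet files straight to the target partition directory
df.write
  .mode(SaveMode.Overwrite)
  .parquet("hdfs://namenode/directory/to/partition/dt=2019-01-01")

// Then register the partition in the other cluster's Hive, e.g. from beeline:
//   ALTER TABLE other_cluster_table ADD IF NOT EXISTS PARTITION (dt='2019-01-01')
//   LOCATION 'hdfs://namenode/directory/to/partition/dt=2019-01-01';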
Upvotes: 0