Reputation: 181
I have a table in hive whose DDL looks like this.
CREATE TABLE ABC(
  name string)
PARTITIONED BY (
  col1 string,
  col2 bigint,
  col3 string,
  col4 string)
I have a requirement in which I have to store the non-partition column names of the Hive table in variable1 and the partition column names in variable2 using Spark Scala.
Desired output would be :
variable1='name'
variable2='col1,col2,col3,col4'
I am following the approach below but I am not able to get the desired result.
val df=sql("desc default.ABC")
val df2=df.map(r => r.getString(0)).collect.toList
List[String] = List(name, col1, col2, col3, col4, # Partition Information, # col_name, col1, col2, col3, col4)
Can you please help me with the approach?
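For reference, the list already collected above can be split at the "# Partition Information" marker; this is a sketch assuming the desc output has exactly the shape shown (column and table names are from the question):

```scala
// Sketch: split the collected `desc` output at the "# Partition Information"
// marker (assumes the list shape shown above).
val descOutput = List("name", "col1", "col2", "col3", "col4",
  "# Partition Information", "# col_name", "col1", "col2", "col3", "col4")

val (allColumns, partitionSection) = descOutput.span(_ != "# Partition Information")
val partitionColumns = partitionSection.drop(2)             // drop the two header rows
val nonPartitionColumns = allColumns.diff(partitionColumns) // desc lists every column first

val variable1 = nonPartitionColumns.mkString(",")
val variable2 = partitionColumns.mkString(",")
// variable1 == "name", variable2 == "col1,col2,col3,col4"
```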
Upvotes: 3
Views: 3273
Reputation: 175
Try this:
import org.apache.spark.sql.functions._

val partitionColumns = spark.catalog.listColumns("default.ABC")
  .where(col("ispartition") === true)
  .select("name")
  .collect()
  .map(_.getAs[String]("name"))

val nonPartitionColumns = spark.catalog.listColumns("default.ABC")
  .where(col("ispartition") === false)
  .select("name")
  .collect()
  .map(_.getAs[String]("name"))

println(partitionColumns.mkString(","))
println(nonPartitionColumns.mkString(","))
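From there, the two variables asked for in the question are just the joined arrays. A minimal sketch, where the stand-in arrays mimic what collect() returns for the question's table:

```scala
// Stand-in arrays mimicking the collected results for table default.ABC
val partitionColumns = Array("col1", "col2", "col3", "col4")
val nonPartitionColumns = Array("name")

val variable1 = nonPartitionColumns.mkString(",")
val variable2 = partitionColumns.mkString(",")
// variable1 == "name", variable2 == "col1,col2,col3,col4"
```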
Upvotes: 3
Reputation: 143
Try this:
val df=sql("show partitions default.ABC")
This gives you a 'partition' column; collect it and parse out the names you need.
If you want the full CREATE TABLE statement (including the partition definition), you can execute the following through spark-sql:
show create table tableName
and read the 'createtab_stmt' column of the resulting Dataset.
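Each row of show partitions comes back as a path-style spec, so the partition column names can be parsed from a single row. A sketch, where the sample row is hypothetical and matches the table in the question:

```scala
// Hypothetical row from `show partitions default.ABC`, e.g. "col1=a/col2=1/col3=x/col4=y"
val partitionRow = "col1=a/col2=1/col3=x/col4=y"

// Keep only the column name before each '=' sign
val variable2 = partitionRow.split("/").map(_.split("=")(0)).mkString(",")
// variable2 == "col1,col2,col3,col4"
```

Note this only works when the table has at least one partition to list.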
Upvotes: 0