Sumit D
Sumit D

Reputation: 181

How to get the hive partition column name using spark

I have a table in hive whose DDL looks like this.

CREATE TABLE ABC( name string) PARTITIONED BY ( col1 string, col2 bigint, col3 string, col4 string)

I have a requirement in which I have to store the non partition column name of hive table into variable1 and partition column name into variable2 using spark scala.

Desired output would be :

 variable1='name'    

 variable2='col1,col2,col3,col4'

I am following the below approach but not able to get the same.

val df=sql("desc default.ABC")

val df2=df.map(r => r.getString(0)).collect.toList

List[String] = List(name, col1, col2, col3, col4, # Partition Information, # col_name, col1, col2, col3, col4)

Can you please help me with the approach?

Upvotes: 3

Views: 3273

Answers (2)

Francoceing C
Francoceing C

Reputation: 175

try this :

import org.apache.spark.sql.functions._
val partitionsColumns = spark.catalog.listColumns("wikicc").where(col("ispartition") === true).select("name").collect().map(_.getAs[String]("name"))
val noParitionsColumns = spark.catalog.listColumns("wikicc").where(col("ispartition") === false).select("name").collect().map(_.getAs[String]("name"))
println(partitionsColumns.mkString(","))
println(noParitionsColumns.mkString(","))

Upvotes: 3

Pookly
Pookly

Reputation: 143

Try it:

val df=sql("show partitions default.ABC")

you can get 'partition' column. then collect get you want.

If you want create table partition,you can use spark-sql execute

show create table tableName

you can get 'createtab_stmt' column of Dataset.

Upvotes: 0

Related Questions