Reputation: 49
I want to covert row into column using spark dataframe.
My table is like this
Eno,Name
1,A
1,B
1,C
2,D
2,E
I want to convert it into
Eno,n1,n2,n3
1,A,B,C
2,D,E,Null
I used this below code :-
val r = spark.sqlContext.read.format("csv").option("header","true").option("inferschema","true").load("C:\\Users\\axy\\Desktop\\abc2.csv")
val n =Seq("n1","n2","n3"
r
.groupBy("Eno")
.pivot("Name",n).agg(expr("coalesce(first(Name),3)").cast("double")).show()
But I am getting result as-->
+---+----+----+----+
|Eno| n1| n2| n3|
+---+----+----+----+
| 1|null|null|null|
| 2|null|null|null|
+---+----+----+----+
Can anyone help to get the desire result.
Upvotes: 0
Views: 112
Reputation: 25909
import org.apache.spark.sql.functions._
import spark.implicits._
val df= Seq((1,"A"),(1,"B"),(1,"C"),(2,"D"),(2,"E")).toDF("Eno","Name")
val getName=udf {(names: Seq[String],i : Int) => if (names.size>i) names(i) else null}
val tdf=df.groupBy($"Eno").agg(collect_list($"name").as("names"))
val ndf=(0 to 2).foldLeft(tdf){(ndf,i) => ndf.withColumn(s"n${i}",getName($"names",lit(i))) }.
drop("names")
ndf.show()
+---+---+---+----+
|Eno| n0| n1| n2|
+---+---+---+----+
| 1| A| B| C|
| 2| D| E|null|
+---+---+---+----+
Upvotes: 0
Reputation: 1114
val m= map(lit("A"), lit("n1"), lit("B"),lit("n2"), lit("C"), lit("n3"), lit("D"), lit("n1"), lit("E"), lit("n2"))
val df= Seq((1,"A"),(1,"B"),(1,"C"),(2,"D"),(2,"E")).toDF("Eno","Name")
df.withColumn("new", m($"Name")).groupBy("Eno").pivot("new").agg(first("Name"))
+---+---+---+----+
|Eno| n1| n2| n3|
+---+---+---+----+
| 1| A| B| C|
| 2| D| E|null|
+---+---+---+----+
Upvotes: 2