Arush70

Reputation: 49

Converting row into column using spark scala

I want to convert rows into columns using a Spark DataFrame.

My table looks like this:

Eno,Name
1,A
1,B
1,C
2,D
2,E

I want to convert it into:

Eno,n1,n2,n3
1,A,B,C
2,D,E,Null

I used the code below:

val r = spark.sqlContext.read
  .format("csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .load("C:\\Users\\axy\\Desktop\\abc2.csv")

val n = Seq("n1", "n2", "n3")

r.groupBy("Eno")
  .pivot("Name", n)
  .agg(expr("coalesce(first(Name),3)").cast("double"))
  .show()

But I am getting this result:

+---+----+----+----+
|Eno|  n1|  n2|  n3|
+---+----+----+----+
|  1|null|null|null|
|  2|null|null|null|
+---+----+----+----+

Can anyone help me get the desired result?

Upvotes: 0

Views: 112

Answers (2)

Arnon Rotem-Gal-Oz

Reputation: 25909

import org.apache.spark.sql.functions._
import spark.implicits._

val df = Seq((1,"A"),(1,"B"),(1,"C"),(2,"D"),(2,"E")).toDF("Eno","Name")
// return the i-th name, or null when the group has fewer than i+1 names
val getName = udf { (names: Seq[String], i: Int) => if (names.size > i) names(i) else null }

val tdf = df.groupBy($"Eno").agg(collect_list($"Name").as("names"))
val ndf = (0 to 2).foldLeft(tdf) { (ndf, i) =>
  ndf.withColumn(s"n${i + 1}", getName($"names", lit(i)))
}.drop("names")
ndf.show()

+---+---+---+----+
|Eno| n1| n2|  n3|
+---+---+---+----+
|  1|  A|  B|   C|
|  2|  D|  E|null|
+---+---+---+----+
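One caveat worth noting: collect_list does not guarantee element order after a shuffle, so the values may land in the wrong columns on a real cluster. A sketch (my addition, assuming names should be assigned in alphabetical order) that sorts the collected array and uses getItem, which returns null for out-of-range indices, so no UDF is needed:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder.master("local[*]").getOrCreate()
import spark.implicits._

val df = Seq((1, "A"), (1, "B"), (1, "C"), (2, "D"), (2, "E")).toDF("Eno", "Name")

// sort_array makes the column assignment deterministic
val tdf = df.groupBy($"Eno").agg(sort_array(collect_list($"Name")).as("names"))

// getItem on an array column yields null when the index is out of bounds
val ndf = (0 until 3).foldLeft(tdf) { (acc, i) =>
  acc.withColumn(s"n${i + 1}", $"names".getItem(i))
}.drop("names")
ndf.show()
```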

Upvotes: 0

mikeL

Reputation: 1114

val m = map(
  lit("A"), lit("n1"), lit("B"), lit("n2"), lit("C"), lit("n3"),
  lit("D"), lit("n1"), lit("E"), lit("n2"))
val df = Seq((1,"A"),(1,"B"),(1,"C"),(2,"D"),(2,"E")).toDF("Eno","Name")
df.withColumn("new", m($"Name")).groupBy("Eno").pivot("new").agg(first("Name")).show()


+---+---+---+----+
|Eno| n1| n2|  n3|
+---+---+---+----+
|  1|  A|  B|   C|
|  2|  D|  E|null|
+---+---+---+----+
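The literal map above hard-codes every Name value, so it only works for this exact data. A more general sketch (my addition, assuming names are ranked alphabetically within each Eno group) numbers the rows with a window function and pivots on that rank instead:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

val spark = SparkSession.builder.master("local[*]").getOrCreate()
import spark.implicits._

val df = Seq((1, "A"), (1, "B"), (1, "C"), (2, "D"), (2, "E")).toDF("Eno", "Name")

// rank each Name within its Eno group, then pivot on "n1"/"n2"/"n3"
val w = Window.partitionBy("Eno").orderBy("Name")
val result = df
  .withColumn("rank", concat(lit("n"), row_number().over(w).cast("string")))
  .groupBy("Eno")
  .pivot("rank", Seq("n1", "n2", "n3"))
  .agg(first("Name"))
result.show()
```

Passing the explicit value list Seq("n1", "n2", "n3") to pivot avoids an extra pass over the data and guarantees the column order.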

Upvotes: 2
