Jonreyan

Reputation: 131

reshape dataframe from column to rows in scala

I want to reshape a DataFrame in Spark using Scala. Most of the examples I found use groupBy and pivot. In my case I don't want to use groupBy. This is what my DataFrame looks like:

  tagid           timestamp value
1     1 2016-12-01 05:30:00     5
2     1 2017-12-01 05:31:00     6
3     1 2017-11-01 05:32:00     4
4     1 2017-11-01 05:33:00     5
5     2 2016-12-01 05:30:00   100
6     2 2017-12-01 05:31:00   111
7     2 2017-11-01 05:32:00   109
8     2 2016-12-01 05:34:00    95

And I want my DataFrame to look like this:

            timestamp  1  2 
1 2016-12-01 05:30:00  5 100
2 2017-12-01 05:31:00  6 111
3 2017-11-01 05:32:00  4 109
4 2017-11-01 05:33:00  5  NA
5 2016-12-01 05:34:00 NA  95

I called pivot without groupBy and it fails with an error:

df.pivot("tagid")

error: value pivot is not a member of org.apache.spark.sql.DataFrame.

How do I convert this? Thank you.

Upvotes: 0

Views: 1288

Answers (1)

Ramesh Maharjan

Reputation: 41957

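pivot is not defined on DataFrame itself. It is a method of the grouped object that groupBy returns (RelationalGroupedDataset in Spark 2.x), which is why df.pivot does not compile. Grouping by timestamp first and then pivoting should solve your issue (this assumes org.apache.spark.sql.functions.first and spark.implicits._ are imported):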

df.groupBy("timestamp").pivot("tagid").agg(first($"value"))

You should get the following DataFrame (row order is not guaranteed):

+-------------------+----+----+
|timestamp          |1   |2   |
+-------------------+----+----+
|2017-11-01 05:33:00|5   |null|
|2017-11-01 05:32:00|4   |109 |
|2017-12-01 05:31:00|6   |111 |
|2016-12-01 05:30:00|5   |100 |
|2016-12-01 05:34:00|null|95  |
+-------------------+----+----+
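For reference, here is a minimal self-contained sketch of the same approach, built from the sample rows in the question. The names PivotExample and pivotByTag are hypothetical, the local[*] master is only there so the example runs standalone, and the explicit pivot values Seq(1, 2) are an assumption based on the tag ids shown in the sample data.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, first}

// Hypothetical wrapper object, for illustration only.
object PivotExample {

  // Turn one row per (tagid, timestamp) into one row per timestamp,
  // with one column per tag id.
  def pivotByTag(df: DataFrame): DataFrame =
    df.groupBy("timestamp")          // pivot is only available after groupBy
      .pivot("tagid", Seq(1, 2))     // assumed tag ids; listing them skips a distinct-values pass
      .agg(first(col("value")))      // one value expected per (timestamp, tagid) pair
      .orderBy("timestamp")          // make the row order deterministic

  def main(args: Array[String]): Unit = {
    // Local session, assumed here so the sketch can run on its own.
    val spark = SparkSession.builder()
      .appName("PivotExample")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample rows copied from the question (timestamps kept as strings).
    val df = Seq(
      (1, "2016-12-01 05:30:00", 5),
      (1, "2017-12-01 05:31:00", 6),
      (1, "2017-11-01 05:32:00", 4),
      (1, "2017-11-01 05:33:00", 5),
      (2, "2016-12-01 05:30:00", 100),
      (2, "2017-12-01 05:31:00", 111),
      (2, "2017-11-01 05:32:00", 109),
      (2, "2016-12-01 05:34:00", 95)
    ).toDF("tagid", "timestamp", "value")

    pivotByTag(df).show(false)
    spark.stop()
  }
}
```

Since each (timestamp, tagid) pair appears only once in the sample, first simply passes the single value through. If a pair can occur more than once, first picks one of the values without a guaranteed order, so an aggregate such as max or avg may be a better fit.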

For more information, you can check out the Databricks blog.

Upvotes: 2
