Reputation: 7275
Here's my input:
+----+-----+---+------+----+------+-------+--------+
|year|month|day|new_ts|hour|minute|ts_rank|   label|
+----+-----+---+------+----+------+-------+--------+
|2022|    1|  1|    13|  13|    24|      1|       7|
|2022|    1|  1|    14|  13|    24|      1|       8|
|2022|    1|  2|    15|  13|    24|      1|       7|
|2022|    1|  2|    16|  13|    44|      7|       8|
+----+-----+---+------+----+------+-------+--------+
Here's my expected output:
+----+-----+---+-------+--------+
|year|month|day|      7|       8|
+----+-----+---+-------+--------+
|2022|    1|  1|     13|      14|
|2022|    1|  2|     15|      16|
+----+-----+---+-------+--------+
Here's the pandas code that produces it:
df_pivot = df.pivot(index=["year","month","day"], columns="label", values="new_ts").reset_index()
What I tried in PySpark:
df_pivot = df.groupBy(["year","month","day"]).pivot("label").value("new_ts")
Note: sorry, I can't show the error message here. I'm using a cloud solution that only shows the line where the error occurred, not the message itself.
Upvotes: 1
Views: 32
Reputation: 26676
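In PySpark, pivot() on grouped data has to be followed by an aggregation; there is no value() method. Since each (year, month, day, label) combination has a single new_ts here, first() is enough:
from pyspark.sql.functions import first

df.groupBy("year","month","day").pivot('label').agg(first('new_ts')).show()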
+----+-----+---+---+---+
|year|month|day|  7|  8|
+----+-----+---+---+---+
|2022|    1|  1| 13| 14|
|2022|    1|  2| 15| 16|
+----+-----+---+---+---+
Upvotes: 1