Reputation: 383
I have a dataframe with a map column:
sdf = spark.createDataFrame(
    [
        (1, {'Kira': 25, 'Lilly': 15}),
        (2, {'Tom': 14}),
    ],
    ["id", "label"]
)
+---+-------------------------+
|id |label                    |
+---+-------------------------+
|1  |{Lilly -> 15, Kira -> 25}|
|2  |{Tom -> 14}              |
+---+-------------------------+
And I want to put the keys in one column and the values in another, like this:
+---+-----+---+
|id |name |age|
+---+-----+---+
|1  |Kira |25 |
|1  |Lilly|15 |
|2  |Tom  |14 |
+---+-----+---+
Upvotes: 0
Views: 343
Reputation: 26676
The long-hand way: use the map collection functions map_keys and map_values to create the name and age columns, then zip the two arrays and explode them with the inline function.
from pyspark.sql.functions import map_keys, map_values
sdf.withColumn('name', map_keys('label')).withColumn('age', map_values('label')).selectExpr('id', 'inline(arrays_zip(name, age))').show()
+---+-----+---+
| id| name|age|
+---+-----+---+
|  1|Lilly| 15|
|  1| Kira| 25|
|  2|  Tom| 14|
+---+-----+---+
Upvotes: 1