Reputation: 1
I have a Spark DataFrame (in Scala) like this:
-----------------------------------
| Code | Name | Age | ID |
-----------------------------------
| ABC | Alan | 22 | 111111 |
| ABC | Bob | 25 | 222222 |
| DEF | Charlie | 29 | 333333 |
| GHI | David | 11 | 555555 |
-----------------------------------
I want to have an output HashMap like this:
{
'ABC': [{'Name': 'Alan', 'Age': 22, 'ID': 111111}, {'Name': 'Bob', 'Age': 25, 'ID': 222222}],
'DEF': [{'Name': 'Charlie', 'Age': 29, 'ID': 333333}],
'GHI': [{'Name': 'David', 'Age': 11, 'ID': 555555}]
}
How can I efficiently do this?
Upvotes: 0
Views: 189
Reputation: 1322
Assuming your DataFrame is named ds, this should work:
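// Assumes a SparkSession named spark is in scope (as in spark-shell).
import org.apache.spark.sql.functions.{collect_list, struct, to_json}
import spark.implicits._
ds.select('code, to_json(struct('name, 'age, 'id)) as "json")
.groupBy('code).agg(collect_list('json))
.as[(String, Array[String])]
.collect.toMap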
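This will give you a Map[String, Array[String]], keyed by code, where each string in the array is the JSON encoding of one row's name, age and id. Note that collect pulls the whole result onto the driver, so this only works if the grouped data fits in driver memory. If what you wanted was to turn the whole DataFrame into a single JSON, I wouldn't recommend that, but it would be doable as well.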
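If you would rather have structured values than JSON strings, a variant of the same groupBy / collect_list idea can collect structs into a case class. This is only a sketch: the Person case class, the GroupToMap object and the toGroupedMap helper are names made up for illustration, and it assumes the columns are called Code, Name, Age and ID, with Age stored as an integer and ID as a long.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, collect_list, struct}

// Hypothetical record type for the non-key columns (not from the question).
case class Person(Name: String, Age: Int, ID: Long)

object GroupToMap {
  // Groups rows by Code and collects the remaining columns as Person values.
  // The result is collected to the driver, so it must fit in driver memory.
  def toGroupedMap(df: DataFrame): Map[String, Seq[Person]] = {
    import df.sparkSession.implicits._ // encoders for tuples and case classes

    df.groupBy(col("Code"))
      .agg(collect_list(struct(col("Name"), col("Age"), col("ID"))).as("people"))
      .as[(String, Seq[Person])]
      .collect()
      .toMap
  }
}
```

Calling GroupToMap.toGroupedMap(ds) on the example data should give a map with the keys ABC, DEF and GHI. The order of the Person values within each key is not guaranteed, since collect_list does not preserve row order across partitions.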
Upvotes: 1