Reputation: 45
Here is my df:
text | date | channel | sentiment | product | segment | |
---|---|---|---|---|---|---|
0 | I like the new layout | 2021-08-30T18:15:22Z | Snowflake | predict | Skills | EMEA |
I need to convert this to JSON output that matches the following:
[
{
"text": "I like the new layout",
"date": "2021-08-30T18:15:22Z",
"channel": "Snowflake",
"sentiment": "predict",
"fields": [
{
"field": "product",
"value": "Skills"
},
{
"field": "segment",
"value": "EMEA"
}
]
}
]
I'm getting stuck with mapping the keys of the columns to the values in the first dict and mapping the column and row to new keys in the final dict. I've tried various options using df.groupby
with .apply()
but am coming up short.
Samples of what I've tried:
df.groupby(['text', 'date','channel','sentiment','product','segment']).apply(
lambda r: r[['27cf2f]].to_dict(orient='records')).unstack('text').apply(lambda s: [
{s.index.name: idx, 'fields': value}
for idx, value in s.items()]
).to_json(orient='records')
Any and all help is appreciated!
Upvotes: 0
Views: 480
Reputation: 45
Solved with this:
# Specify field column names
fieldcols = ['product','segment']
# Build a dict for each group as a Series named `fields`
res = (df.groupby(['text', 'date','channel','sentiment'])
.apply(lambda s: [{'field': field,
'value': value}
for field in fieldcols
for value in s[field].values])
).rename('fields')
# Convert Series to DataFrame and then to_json
res = res.reset_index().to_json(orient='records', date_format='iso')
Output:
[
{
"text": "I like the new layout",
"date": "2021-08-30T18:15:22Z",
"channel": "Snowflake",
"sentiment": "predict",
"fields": [
{
"field": "product",
"value": "Skills"
},
{
"field": "segment",
"value": "EMEA"
}
]
}
]
Upvotes: 1