Reputation: 127
I have a DataFrame like:
Number | Metadata |
---|---|
12 | {key1: x1, key2: y1} |
12 | {key1: x2, key2: y2} |
12 | {key1: x3, key2: y3} |
13 | {key1: x4, key2: y4} |
13 | {key1: x5, key2: y5} |
How can I end up with the following?
Number | Metadata |
---|---|
12 | {key1: [x1, x2, x3], key2: [y1, y2, y3]} |
13 | {key1: [x4, x5], key2: [y4, y5]} |
I tried something like:
from collections import defaultdict
dd = defaultdict(list)
for d in (d1, d2): # you can list as many input dicts as you want here
for key, value in d.items():
But not sure how to group by the Number column in the process
Upvotes: 1
Views: 511
Reputation: 1305
try this :
>>>
def merge_dicts(g: pd.DataFrame):
dd = {}
for d in g.Metadata:
for k, v in d.items():
dd[k] = dd.get(k, []) + [v]
return dd
grouped = df.groupby('Number')
out = grouped.apply(merge_dicts).reset_index(name='Metadata')
print(out)
Number Metadata
0 12 {'key1': ['x1', 'x2', 'x3'], 'key2': ['y1', 'y2', 'y3']}
1 13 {'key1': ['x4', 'x5'], 'key2': ['y4', 'y5']}
Upvotes: 1