43zombiegit
43zombiegit

Reputation: 127

How to merge dictionaries as columns in a DataFrame, collecting values from matching keys grouped

I have a DataFrame like:

Number Metadata
12 {key1: x1, key2: y1}
12 {key1: x2, key2: y2}
12 {key1: x3, key2: y3}
13 {key1: x4, key2: y4}
13 {key1: x5, key2: y5}

How can I end up with the following?

Number Metadata
12 {key1: [x1, x2, x3], key2: [y1, y2, y3]}
13 {key1: [x4, x5], key2: [y4, y5]}

I tried something like:

from collections import defaultdict
    dd = defaultdict(list)
    for d in (d1, d2): # you can list as many input dicts as you want here
        for key, value in d.items():

But not sure how to group by the Number column in the process

Upvotes: 1

Views: 511

Answers (1)

ziying35
ziying35

Reputation: 1305

try this :

>>>
def merge_dicts(g: pd.DataFrame):
    dd = {}
    for d in g.Metadata:
        for k, v in d.items():
            dd[k] = dd.get(k, []) + [v]
    return dd


grouped = df.groupby('Number')
out = grouped.apply(merge_dicts).reset_index(name='Metadata')
print(out)

    Number  Metadata
0   12      {'key1': ['x1', 'x2', 'x3'], 'key2': ['y1', 'y2', 'y3']}
1   13      {'key1': ['x4', 'x5'], 'key2': ['y4', 'y5']}

Upvotes: 1

Related Questions