user15923317

Merge multiple dictionaries in array and sum values

I have a list of dictionaries that looks something like this:

d = [
    {
        "name": "John",
        "surname": "Budd",
        "netWorth": "100000",
        "salary": "4700",
        "comment": "Cool"
    },
    {
        "name": "Tedd",
        "surname": "Walker",
        "netWorth": "400000",
        "salary": "8000",
        "comment": "Nice"
    },
    {
        "name": "John",
        "surname": "Budd",
        "netWorth": "300000",
        "salary": "5000",
        "comment": "Pretty"
    }
]

I would like to sum the netWorth and salary values whenever the name and surname of two dictionaries match, and do this across all items in the list.

The catch is that the comment field differs between entries and needs to be dropped. Is there a library that simplifies this task?

Expected result after data manipulation:

d = [
    {
        "name": "John",
        "surname": "Budd",
        "netWorth": "400000",
        "salary": "9700"
    },
    {
        "name": "Tedd",
        "surname": "Walker",
        "netWorth": "400000",
        "salary": "8000"
    }
]

Upvotes: 0

Views: 95

Answers (2)

Alain T.

Reputation: 42143

You can use map with a dictionary's .get() method to fetch several keys at once. This can be leveraged to build a dictionary of totals for each name/surname pair, which is then converted back into a list of dictionaries:

names  = ('name','surname')       # keys to map for name/surname pairs
values = ('netWorth','salary')    # keys to map for totals

totals = dict()                   # totals per name/surname pairs
for p in d:                       # go through dictionaries
    key = tuple(map(p.get,names)) # (name, surname) pair used as the grouping key
    for i,v in enumerate(values): # add each value to its running total
        totals.setdefault(key,[0]*len(values))[i] += int(p[v])

# convert back to a list of dictionaries
totals = [{k:w for k,w in zip(names+values,(*n,*v))} 
          for n,v in totals.items()]

print(totals)
[{'name': 'John', 'surname': 'Budd',   'netWorth': 400000, 'salary': 9700}, 
 {'name': 'Tedd', 'surname': 'Walker', 'netWorth': 400000, 'salary': 8000}]
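
Note that the totals above are integers, while the expected result in the question keeps them as strings. If the string form matters, a variant along these lines might work. It is only a sketch: the function name merge_people and its keys/sums parameters are made up for illustration, and it assumes every summed field holds an integer written as a string.

```python
from collections import defaultdict

def merge_people(records, keys=("name", "surname"), sums=("netWorth", "salary")):
    """Group records by `keys` and add up the integer strings found in `sums`."""
    # One list of running totals per group, one slot per summed field
    totals = defaultdict(lambda: [0] * len(sums))
    for rec in records:
        group = tuple(rec[k] for k in keys)
        for i, field in enumerate(sums):
            totals[group][i] += int(rec[field])
    # Rebuild one dictionary per group; totals are converted back to strings.
    # Fields not listed in keys or sums (such as "comment") are dropped.
    return [
        {**dict(zip(keys, group)), **{f: str(t) for f, t in zip(sums, row)}}
        for group, row in totals.items()
    ]

d = [
    {"name": "John", "surname": "Budd", "netWorth": "100000", "salary": "4700", "comment": "Cool"},
    {"name": "Tedd", "surname": "Walker", "netWorth": "400000", "salary": "8000", "comment": "Nice"},
    {"name": "John", "surname": "Budd", "netWorth": "300000", "salary": "5000", "comment": "Pretty"},
]

merged = merge_people(d)
print(merged)
```

Groups come out in the order in which each name/surname pair first appears, since dictionaries preserve insertion order in Python 3.7+.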

Upvotes: 0

gboffi

Reputation: 25023

dict is a builtin; it's bad taste to use its name as an identifier.

In [8]: d = [
   ...:     {
   ...:         "name": "John",
   ...:         "surname": "Budd",
   ...:         "netWorth": "100000",
   ...:         "salary": "4700",
   ...:         "comment": "Cool"
   ...:     },
   ...:     {
   ...:         "name": "Tedd",
   ...:         "surname": "Walker",
   ...:         "netWorth": "400000",
   ...:         "salary": "8000",
   ...:         "comment": "Nice"
   ...:     },
   ...:     {
   ...:         "name": "John",
   ...:         "surname": "Budd",
   ...:         "netWorth": "300000",
   ...:         "salary": "5000",
   ...:         "comment": "Pretty"
   ...:     }
   ...: ]

We can use the setdefault method of regular dictionaries to sum the salary and the net worth for each (name, surname) pair:

In [9]: w = {}
   ...: s = {}
   ...: for person in d:
   ...:     p = person['name'], person['surname']
   ...:     w[p] = w.setdefault(p, 0) + int(person['netWorth'])
   ...:     s[p] = s.setdefault(p, 0) + int(person['salary'])
   ...: print(w)
   ...: print(s)
{('John', 'Budd'): 400000, ('Tedd', 'Walker'): 400000}
{('John', 'Budd'): 9700, ('Tedd', 'Walker'): 8000}
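
To get from the two dictionaries w and s back to the list-of-dictionaries shape asked for in the question, something like the following could be used. This is a sketch rather than part of the session above: w and s are copied from the printed output, the name result is arbitrary, and the totals are converted to strings only because the expected result in the question shows them that way.

```python
# Totals copied from the output above
w = {('John', 'Budd'): 400000, ('Tedd', 'Walker'): 400000}
s = {('John', 'Budd'): 9700, ('Tedd', 'Walker'): 8000}

# w and s share the same (name, surname) keys, so iterating over w is enough
result = [
    {
        "name": name,
        "surname": surname,
        "netWorth": str(w[name, surname]),
        "salary": str(s[name, surname]),
    }
    for name, surname in w
]
print(result)
```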

Upvotes: 2
