fatoon

Reputation: 23

Python list group by key

I have a list:

x = [[17, 2], [18, 4], [17, 2], [18, 0], [19, 4],
     [17, 4], [19, 4], [17, 4], [20, 4], [17, 4],
     [20, 4], [17, 4], [17, 4], [18, 4], [17, 4]]

I'd like to sum the second values of all pairs that share the same first value.

For example: 17 = 28, and so on.

I tried building a dict with:

d = {}
for row in x:
    if row[0] not in d:
        d[row[0]] = []
    d[row[0]].append(row[1])

The result is

{17: [2, 2, 4, 4, 4, 4, 4, 4],
 18: [4, 0, 4], 19: [4, 4],
 20: [4, 4]}

I couldn't find a way to sum the values.

Upvotes: 1

Views: 979

Answers (3)

Marcel Preda

Reputation: 1205

Keep a running total per key instead of a list:

x = [[17, 2], [18, 4], [17, 2], [18, 0], [19, 4],
     [17, 4], [19, 4], [17, 4], [20, 4], [17, 4],
     [20, 4], [17, 4], [17, 4], [18, 4], [17, 4]]
d = {}
for row in x:
    if row[0] not in d:
        d[row[0]] = 0
    d[row[0]] += row[1]

print(d)

The output is:

{17: 28, 18: 8, 19: 8, 20: 8}

Upvotes: 0

Mad Physicist

Reputation: 114300

You can use itertools.groupby if the list is sorted (and you can use sorted to ensure that):

from itertools import groupby
from operator import itemgetter

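# groupby must use the same key as the sort; each group yields whole
# [key, value] pairs, so take the second element before summing.
d = {key: sum(v for _, v in grp)
     for key, grp in groupby(sorted(x, key=itemgetter(0)), key=itemgetter(0))}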

In this case itemgetter(0) is a more efficient shortcut for lambda x: x[0].
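
For reference, here is a self-contained sketch of the groupby approach on the list from the question. groupby only merges adjacent items, so the sort and the grouping should use the same key, and each group yields whole [key, value] pairs, so the second element is pulled out before summing. The name `first` is just a local alias introduced for this sketch.

```python
from itertools import groupby
from operator import itemgetter

x = [[17, 2], [18, 4], [17, 2], [18, 0], [19, 4],
     [17, 4], [19, 4], [17, 4], [20, 4], [17, 4],
     [20, 4], [17, 4], [17, 4], [18, 4], [17, 4]]

first = itemgetter(0)  # alias introduced here; same as lambda row: row[0]

# Sort by the first element so equal keys are adjacent, then group on it
# and sum only the second element of each pair in the group.
d = {key: sum(value for _, value in grp)
     for key, grp in groupby(sorted(x, key=first), key=first)}

print(d)  # {17: 28, 18: 8, 19: 8, 20: 8}
```

The sort makes this O(n log n), so the running-sum loops are likely the better fit unless the data is already sorted.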

In your original case, you could either maintain a running sum or sum afterwards. To sum the dictionary you already have:

d = {k: sum(v) for k, v in d.items()}
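
Put together with the grouping loop from the question, that might look like the sketch below. The name `totals` is introduced here (it is not in the question) so the grouped lists in `d` stay available.

```python
x = [[17, 2], [18, 4], [17, 2], [18, 0], [19, 4],
     [17, 4], [19, 4], [17, 4], [20, 4], [17, 4],
     [20, 4], [17, 4], [17, 4], [18, 4], [17, 4]]

# Step 1: group the second values by the first value (loop from the question).
d = {}
for row in x:
    if row[0] not in d:
        d[row[0]] = []
    d[row[0]].append(row[1])

# Step 2: collapse each list of values into its sum.
totals = {k: sum(v) for k, v in d.items()}

print(totals)  # {17: 28, 18: 8, 19: 8, 20: 8}
```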

To maintain a running sum:

d = {}
for k, v in x:
    if k in d:
        d[k] += v
    else:
        d[k] = v

A shorter way of doing the same thing would be to use dict.setdefault:

d = {}
for k, v in x:
    d[k] = d.setdefault(k, 0) + v

Upvotes: 2

BioGeek

Reputation: 22827

Using defaultdict:

>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> for lst in x:
...     a, b = lst
...     d[a] += b
...
>>> d
defaultdict(<class 'int'>, {17: 28, 18: 8, 19: 8, 20: 8})
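
If a plain dict is wanted afterwards, the same idea can be wrapped in a small helper. This is only a sketch, and the function name `sum_by_key` is made up for illustration.

```python
from collections import defaultdict


def sum_by_key(pairs):
    """Sum the second element of each pair, grouped by the first element."""
    totals = defaultdict(int)  # missing keys start at 0
    for key, value in pairs:
        totals[key] += value
    # Return a plain dict so that looking up a missing key later raises
    # KeyError instead of silently inserting a 0.
    return dict(totals)


x = [[17, 2], [18, 4], [17, 2], [18, 0], [19, 4],
     [17, 4], [19, 4], [17, 4], [20, 4], [17, 4],
     [20, 4], [17, 4], [17, 4], [18, 4], [17, 4]]

print(sum_by_key(x))  # {17: 28, 18: 8, 19: 8, 20: 8}
```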

Upvotes: 1
