Yafim Simanovsky

Reputation: 636

nested lists combine values according to first value

list = [ ['a',14,2], ['b',10,1], ['a',3,12], ['r',5,5], ['r',6,13] ]
result = data_sum(list)

def data_sum(list):
  for set in list:
    current_index = list.index(set)
    string  = set[0]
    for item in list:
      second_index = list.index(each)
      if string == item[0] and current_index != second_index:
        set[0] = item[0]
        set[1] += item[1]
        set[2] += item[2]
        del each

  return list

My result should be

[ ['a',17,14], ['b',10,1], ['r',11,18] ]

where nested lists are aggregated according to the first string if it's identical.

I don't think I can use set[] here because it will not sum up according to the elements inside the nested list itself.

  1. I'm not sure I'm using the list.index correctly
  2. so far the output is the exact same original list

Upvotes: 1

Views: 1079

Answers (1)

RoadRunner

Reputation: 26325

Firstly, list and set are built-in types and string is a standard library module, so using these names for your own variables is not recommended because it shadows them. I also think you're overcomplicating the problem slightly: all you need to do is group the lists by their first value and then sum the remaining values.
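To illustrate why reusing such names is risky, here is a minimal sketch (the function name shadowing_demo is made up for this example). Inside the function, a local variable called list hides the built-in type, so a later call to list(...) fails:

```python
def shadowing_demo():
    # Hypothetical helper: the local name `list` hides the built-in type here.
    list = [['a', 14, 2], ['b', 10, 1]]
    try:
        list('abc')  # tries to call the data, not the built-in type
    except TypeError as exc:
        return str(exc)
    return None

message = shadowing_demo()
print(message)      # the TypeError message raised inside the function
print(list('abc'))  # outside the function the built-in is untouched
```

Using a neutral name such as lsts or rows avoids the problem entirely.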

One possible way is to collect the numeric values of each list under its first value in a collections.defaultdict, then sum the collected values column by column:

from collections import defaultdict

lsts = [['a',14,2], ['b',10,1], ['a',3,12], ['r',5,5], ['r',6,13]]

groups = defaultdict(list)
for letter, first, second in lsts:
    groups[letter].append([first, second])
# defaultdict(<class 'list'>, {'a': [[14, 2], [3, 12]], 'b': [[10, 1]], 'r': [[5, 5], [6, 13]]})

result = []
for key, value in groups.items():
    sums = [sum(x) for x in zip(*value)]
    result.append([key] + sums)

print(result)

Which outputs:

[['a', 17, 14], ['b', 10, 1], ['r', 11, 18]]

The resultant list can also be written with this list comprehension:

result = [[key] + [sum(x) for x in zip(*value)] for key, value in groups.items()]

Another way is to use itertools.groupby:

from itertools import groupby
from operator import itemgetter

# groupby only merges adjacent items, so sort by the same key first
grouped = [list(g) for _, g in groupby(sorted(lsts, key=itemgetter(0)), key=itemgetter(0))]
# [[['a', 14, 2], ['a', 3, 12]], [['b', 10, 1]], [['r', 5, 5], ['r', 6, 13]]]

result = []
for group in grouped:
    numbers = [x[1:] for x in group]
    sums = [sum(x) for x in zip(*numbers)]
    result.append([group[0][0]] + sums)
print(result)

Which also outputs:

[['a', 17, 14], ['b', 10, 1], ['r', 11, 18]]

Note: The second approach could also be written as a big list comprehension:

result = [[group[0][0]] + [sum(x) for x in zip(*[x[1:] for x in group])] for group in [list(g) for _, g in groupby(sorted(lsts), key=itemgetter(0))]]

But this is ugly and unreadable, and shouldn't be used.
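As a more readable alternative to a one-liner, the grouping and summing can be wrapped in a small function. The following is only a sketch: it reuses the name data_sum from the question, assumes every inner list has exactly one key followed by two numbers, and keeps running totals instead of collecting the values first.

```python
from collections import defaultdict

def data_sum(rows):
    """Sum the two numeric columns of rows that share the same first element."""
    # key -> running [first_sum, second_sum]; assumes exactly two numbers per row
    totals = defaultdict(lambda: [0, 0])
    for key, first, second in rows:
        totals[key][0] += first
        totals[key][1] += second
    # dicts keep insertion order (Python 3.7+), so keys come out in first-seen order
    return [[key, *sums] for key, sums in totals.items()]

lsts = [['a', 14, 2], ['b', 10, 1], ['a', 3, 12], ['r', 5, 5], ['r', 6, 13]]
print(data_sum(lsts))
# [['a', 17, 14], ['b', 10, 1], ['r', 11, 18]]
```

This makes a single pass over the input and does not modify the original lists.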

Upvotes: 3
